MRF optimization via dual decomposition

In dual decomposition, a complex problem is broken into smaller solvable subproblems, and a solution to the relaxed problem is recovered by combining the subproblem solutions. This method can be employed for MRF optimization.[1] Dual decomposition has also been applied to Markov logic programs as an inference technique.[2]

Background

Discrete MRF optimization (inference) is an important problem in machine learning and computer vision, and has been implemented on CUDA graphics processing units.[3] Consider a graph G = (V, E) with nodes V and edges E. The goal is to assign a label l_p to each node p ∈ V so that the MRF energy is minimized:

(1) <math>\min \sum_{p \in V} \theta_p(l_p) + \sum_{pq \in E} \theta_{pq}(l_p, l_q)</math>

Major MRF optimization methods are based on graph cuts or message passing. They rely on the following integer linear programming formulation:

(2) <math>\min_x E(\theta, x) = \theta \cdot x = \sum_{p \in V} \theta_p \cdot x_p + \sum_{pq \in E} \theta_{pq} \cdot x_{pq}</math>

In many applications, the MRF variables are {0,1} indicator variables satisfying x_p(l) = 1 ⇔ label l is assigned to p, and x_pq(l, l′) = 1 ⇔ labels l and l′ are assigned to p and q respectively.
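To make energy (1) concrete, here is a minimal Python sketch that evaluates the MRF energy of a labeling and minimizes it by exhaustive search on a tiny chain graph. All names and numbers (the three-node chain, the Potts-style pairwise potentials) are illustrative assumptions, not taken from any reference.

```python
from itertools import product

# Toy MRF on a 3-node chain p - q - r with 2 labels (illustrative values only).
V = ["p", "q", "r"]
E = [("p", "q"), ("q", "r")]
L = [0, 1]

# Unary potentials theta_p(l) and Potts-style pairwise potentials theta_pq(l, l').
unary = {"p": [0.0, 1.0], "q": [0.5, 0.2], "r": [1.0, 0.0]}
pairwise = {e: {(a, b): (0.0 if a == b else 0.6) for a in L for b in L} for e in E}

def energy(labels):
    """Energy (1): sum of unary terms plus sum of pairwise terms."""
    u = sum(unary[p][labels[p]] for p in V)
    w = sum(pairwise[(p, q)][(labels[p], labels[q])] for (p, q) in E)
    return u + w

# Exhaustive search over all 2^3 labelings (feasible only for tiny graphs;
# the methods in this article exist precisely because this does not scale).
best = min((dict(zip(V, assign)) for assign in product(L, repeat=len(V))),
           key=energy)
print(best, energy(best))
```

For this toy instance the minimizer assigns label 0 to p and label 1 to q and r, with energy 0.8.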

Dual Decomposition

The main idea behind decomposition is surprisingly simple:

  1. decompose your original complex problem into smaller solvable subproblems,
  2. extract a solution by cleverly combining the solutions from these subproblems.

A sample problem to decompose:

<math>\min_x \sum_i f_i(x) \quad \text{where}\ x \in C</math>

In this problem, separately minimizing each f_i(x) over x is easy; but minimizing their sum is a complex problem. So we decompose the problem using auxiliary variables {x^i}, and the problem becomes:

<math>\min_{\{x^i\},\, x} \sum_i f_i(x^i) \quad \text{where}\ x^i \in C,\ x^i = x</math>

Now we can relax the coupling constraints x^i = x using multipliers {λ^i}, which gives us the following Lagrangian dual function:

<math>g(\{\lambda^i\}) = \min_{\{x^i \in C\},\, x} \sum_i f_i(x^i) + \sum_i \lambda^i \cdot (x^i - x) = \min_{\{x^i \in C\},\, x} \sum_i \left[ f_i(x^i) + \lambda^i \cdot x^i \right] - \Big( \sum_i \lambda^i \Big) \cdot x</math>

Now we eliminate x from the dual function by minimizing over it: unless Σ_i λ^i = 0, the last term makes the minimum −∞, so we restrict the multipliers to the set Λ = { {λ^i} : Σ_i λ^i = 0 }, and the dual function becomes:

<math>g(\{\lambda^i\}) = \min_{\{x^i \in C\}} \sum_i \left[ f_i(x^i) + \lambda^i \cdot x^i \right]</math>

We can set up a Lagrangian dual problem:

(3) <math>\max_{\{\lambda^i\} \in \Lambda} g(\{\lambda^i\}) = \sum_i g_i(\lambda^i)</math>, The Master problem

(4) <math>g_i(\lambda^i) = \min_{x^i} f_i(x^i) + \lambda^i \cdot x^i \quad \text{where}\ x^i \in C</math>, The Slave problems
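The master/slave scheme above can be sketched on a toy continuous problem. Everything in this snippet is an illustrative assumption (the quadratic f_i(x) = (x − a_i)², the interval C = [0, 4], the step sizes): two slaves solve problem (4) in closed form, and the master performs a subgradient step on (3) whose updates keep Σ_i λ^i = 0.

```python
def slave(a, lam, lo=0.0, hi=4.0):
    """Slave problem (4) for f(x) = (x - a)^2 over C = [lo, hi]:
    argmin_x (x - a)^2 + lam * x is a - lam/2, clipped to C."""
    return min(max(a - lam / 2.0, lo), hi)

a = [1.0, 3.0]      # f_i(x) = (x - a_i)^2; the minimizer of f_1 + f_2 is x = 2
lam = [0.0, 0.0]    # multipliers; the updates below preserve lam[0] + lam[1] = 0

for t in range(1, 2001):
    xs = [slave(a[i], lam[i]) for i in range(2)]   # solve the slaves
    xbar = sum(xs) / len(xs)                       # consensus estimate of x
    alpha = 1.0 / t                                # diminishing step sizes
    for i in range(2):
        lam[i] += alpha * (xs[i] - xbar)           # subgradient ascent on (3)

print(xbar, xs)   # consensus near 2, slave solutions agreeing near 2
```

The update λ^i ← λ^i + α_t (x^i − x̄) is the projection of the plain subgradient step onto Λ; as the multipliers converge, the slave solutions are pulled into agreement with each other.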

MRF optimization via Dual Decomposition

The original MRF optimization problem is NP-hard, so we need to transform it into something easier.

Let τ be a set of sub-trees of the graph G whose trees together cover all nodes and edges of the main graph. The MRF defined on each tree T ∈ τ is smaller: its parameter vector θ^T and variable vector x^T are simply smaller than the original MRF vectors θ and x. The vectors θ^T must satisfy:

(5) <math>\sum_{T \in \tau(p)} \theta_p^T = \theta_p, \qquad \sum_{T \in \tau(pq)} \theta_{pq}^T = \theta_{pq}</math>

where τ(p) and τ(pq) denote the trees of τ that contain node p and edge pq, respectively. We can then simply write:

(6) <math>E(\theta, x) = \sum_{T \in \tau} E(\theta^T, x^T)</math>

And our constraints will be:

(7) <math>x^T \in \chi^T, \quad x^T = x|_T, \quad \forall T \in \tau</math>

The original MRF problem then becomes:

(8) <math>\min_{\{x^T\},\, x} \sum_{T \in \tau} E(\theta^T, x^T) \quad \text{where}\ x^T \in \chi^T \ \text{and}\ x^T = x|_T,\ \forall T \in \tau</math>

Relaxing the coupling constraints as before, we obtain the dual problem we were seeking:

(9) <math>\max_{\{\lambda^T\} \in \Lambda} g(\{\lambda^T\}) = \sum_{T \in \tau} g^T(\lambda^T)</math>, The Master problem

where each function g^T(·) is defined as:

(10) <math>g^T(\lambda^T) = \min_{x^T} E(\theta^T + \lambda^T, x^T) \quad \text{where}\ x^T \in \chi^T</math>, The Slave problems
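Each slave in (10) is defined on a tree, so it can be minimized exactly by min-sum dynamic programming (belief propagation on a tree). Here is a minimal sketch for a chain, the simplest tree; the potentials are illustrative toy values, and the general tree case only differs in bookkeeping.

```python
def chain_min_sum(unary, pairwise):
    """Exactly minimize sum_p unary[p][l_p] + sum_p pairwise[p][(l_p, l_{p+1})]
    over labelings of a chain, by a forward pass plus backtracking."""
    n = len(unary)
    L = len(unary[0])
    cost = [list(unary[0])]   # cost[p][l]: best energy of the prefix ending in label l
    back = []                 # back[p-1][l]: argmin predecessor label
    for p in range(1, n):
        row, bp = [], []
        for l in range(L):
            cands = [cost[p - 1][k] + pairwise[p - 1][(k, l)] for k in range(L)]
            k = min(range(L), key=cands.__getitem__)
            row.append(cands[k] + unary[p][l])
            bp.append(k)
        cost.append(row)
        back.append(bp)
    # Backtrack from the best final label to recover the optimal labeling.
    l = min(range(L), key=cost[-1].__getitem__)
    labels = [l]
    for p in range(n - 1, 0, -1):
        l = back[p - 1][l]
        labels.append(l)
    labels.reverse()
    return labels, min(cost[-1])

# Toy 3-node chain with 2 labels (illustrative values, not from any reference).
unary = [[0.0, 1.0], [0.5, 0.2], [1.0, 0.0]]
pairwise = [{(a, b): (0.0 if a == b else 0.6) for a in (0, 1) for b in (0, 1)}
            for _ in range(2)]
labels, value = chain_min_sum(unary, pairwise)
print(labels, value)
```

Because each slave is solved to global optimality, the subgradient of g^T at λ^T is available exactly, which is what the master's multiplier updates require.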

Theoretical Properties

Theorem 1. The Lagrangian relaxation (9) is equivalent to the LP relaxation of (2):

<math>\min_{\{x^T\},\, x} \left\{ E(\theta, x) \;\Big|\; x_p^T = x_p,\ x^T \in \operatorname{CONVEXHULL}(\chi^T) \right\}</math>

Theorem 2. If the sequence of step sizes <math>\{\alpha_t\}</math> satisfies <math>\alpha_t \ge 0,\ \lim_{t \to \infty} \alpha_t = 0,\ \sum_{t=0}^{\infty} \alpha_t = \infty</math>, then the algorithm converges to the optimal solution of (9).

Theorem 3. The distance of the current solution {θ^T} to the optimal solution {θ̄^T} decreases at every iteration.

Theorem 4. Any solution obtained by the method satisfies the weak tree agreement (WTA) condition.

Theorem 5. For binary MRFs with sub-modular energies, the method computes a globally optimal solution.

References

Template:Reflist