Dual Linear Programs


This is just a quick summary of some proofs and derivations that I needed to do for some lab work review, really really simple stuff. This is more of a place for me to make sure that I have absolutely no holes in my understanding and less of a resource for others.

Canonical Form Linear Program:

We can describe a general form of a linear optimization problem as follows, in what is called the “Canonical Form”. The decision variable $x\in\mathbb{R}^n$ forms an objective with a cost vector $c\in\mathbb{R}^n$ and is constrained by a matrix $A\in\mathbb{R}^{m\times n}$ and a vector $b\in\mathbb{R}^m$, where $\text{s.t.}$ stands for “subject to”. The constraint $x\geq 0$ is standard for ruling out physically impossible solutions and is also needed for the derivations later.

$$\min_x\ \ c^Tx\quad\text{s.t.}\quad Ax\geq b,\quad x\geq 0$$
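As a concrete instance of the canonical form, here is a minimal sketch in Python with made-up data ($A$, $b$, $c$ are chosen purely for illustration): it checks feasibility of candidate points and evaluates the objective.

```python
# A tiny made-up instance of the canonical form:
#   min x1 + 2*x2   s.t.   x1 + x2 >= 2,   x1, x2 >= 0
A = [[1.0, 1.0]]
b = [2.0]
c = [1.0, 2.0]

def is_feasible(x):
    """Check Ax >= b and x >= 0 for this instance."""
    ax = sum(a * xi for a, xi in zip(A[0], x))
    return ax >= b[0] and all(xi >= 0 for xi in x)

def objective(x):
    """Evaluate the cost c^T x."""
    return sum(ci * xi for ci, xi in zip(c, x))

print(is_feasible([2.0, 0.0]), objective([2.0, 0.0]))  # True 2.0
print(is_feasible([0.5, 0.5]))  # False: violates x1 + x2 >= 2
```

Any point satisfying both the functional constraint and the sign constraint is called feasible; the LP asks for the feasible point of least cost.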

Lagrangian Form:

Let’s say we wanted to convey this optimization without explicit constraints and instead handle them within the function itself. We can do this with the Lagrangian $L(x,y)$ as follows, where $y\in\mathbb{R}^m$. In general terms, $f(x)$ is the objective of the problem and $h(x)$ collects the functional constraints (in this case ignoring $x\geq 0$) such that $h_i(x)\leq 0$ means the $i$-th constraint is satisfied; for our problem $h(x)=b-Ax$.

$$
\begin{gather*}
L(x,y)=f(x)+y^Th(x)=c^Tx+y^T(b-Ax)\\
\min_{x\geq 0}\max_{y\geq 0}\ c^Tx+y^T(b-Ax)
\end{gather*}
$$

Since we have $y\geq 0$, for the maximum over $y$ not to collapse to infinity, we need $b-Ax\leq 0$, which naturally gives rise to the constraint $Ax\geq b$. Whenever the constraint is satisfied, $b-Ax\leq 0$, so the maximizing choice is $y=0$ and we are left with exactly the original objective $c^Tx$. This idea also naturally gives rise to the necessity of having $y\geq 0$: without it, the maximum over $y$ would blow up to $+\infty$ whenever $b-Ax\neq 0$, regardless of whether the constraint is satisfied.
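The collapse argument above can be spot-checked numerically. This is a small sketch with made-up data ($c=[1,2]$, one constraint $x_1+x_2\geq 2$): for a feasible $x$ the inner maximum over $y\geq 0$ is attained at $y=0$, while for an infeasible $x$ the Lagrangian grows without bound as $y$ increases.

```python
# Illustrative data: min x1 + 2*x2  s.t.  x1 + x2 >= 2,  x >= 0.
c = [1.0, 2.0]
A = [[1.0, 1.0]]
b = [2.0]

def lagrangian(x, y):
    """L(x, y) = c^T x + y * (b - Ax) for this single-constraint LP (y scalar)."""
    slack = b[0] - sum(a * xi for a, xi in zip(A[0], x))
    return c[0] * x[0] + c[1] * x[1] + y * slack

x_feas = [2.0, 1.0]    # Ax = 3 >= 2: feasible, so b - Ax = -1 <= 0
x_infeas = [0.5, 0.5]  # Ax = 1 <  2: infeasible, so b - Ax = 1 > 0

# Feasible x: larger y only lowers L, so the max over y >= 0 is at y = 0.
vals_feas = [lagrangian(x_feas, y) for y in (0.0, 1.0, 10.0, 100.0)]
assert max(vals_feas) == lagrangian(x_feas, 0.0)  # equals c^T x = 4

# Infeasible x: L increases without bound as y grows.
vals_infeas = [lagrangian(x_infeas, y) for y in (0.0, 1.0, 10.0, 100.0)]
assert vals_infeas == sorted(vals_infeas) and vals_infeas[-1] > 100
```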

Lagrangian Dual Form:

If we flip the $\min$ and $\max$ in the problem, we can derive the dual form of the problem. Duality is a malleable concept in mathematics, used to describe ways in which two structures can be paired with some level of intrinsic symmetry. In our case, this is done by turning our minimization problem, our original problem known as the primal, into a maximization problem, known as the dual.

The function $g(y)$ below is often known as the Lagrange Dual Function and will have a couple of notable properties proven later.

$$
\begin{gather*}
g(y)=\min_{x\geq 0}L(x,y)=\min_{x\geq0}\left[c^Tx+y^T(b-Ax)\right]\\
g(y)=y^Tb+\min_{x\geq0}\left[c^Tx-y^TAx\right]=y^Tb+\min_{x\geq0}\left[(c-A^Ty)^Tx\right]
\end{gather*}
$$

We can use the same reasoning about avoiding degenerate minimums here: since $x\geq 0$, if any component of $c-A^Ty$ were negative the minimum would collapse to $-\infty$, so we need $c-A^Ty\geq 0$. And when $c-A^Ty\geq 0$ and $x\geq 0$, the minimum is attained at $x=0$, leaving $g(y)=y^Tb=b^Ty$ (using the fact that the transpose of a scalar equals the scalar). As before, this turns the dual function into a constraint-based optimization problem.

$$
\begin{gather*}
\max_{y\geq 0}\ \ b^Ty\quad\text{s.t.}\quad A^Ty\leq c
\end{gather*}
$$
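To make the recipe concrete, here is a minimal sketch on a made-up primal whose dual is one-dimensional and solvable by hand; the data are purely illustrative.

```python
# Made-up primal: min x1 + 2*x2  s.t.  x1 + x2 >= 2,  x >= 0.
# Its dual by the recipe above: max 2*y  s.t.  y <= 1, y <= 2,  y >= 0.
A = [[1.0, 1.0]]
b = [2.0]
c = [1.0, 2.0]

# With a single constraint, y is a scalar and A^T y <= c reads
# y * A[0][j] <= c[j] for each column j (A's entries are positive here).
y_star = min(c[j] / A[0][j] for j in range(2))  # tightest bound: y <= 1
dual_opt = b[0] * y_star                        # b^T y* = 2

# The primal-feasible point x = (2, 0) has objective 2, equal to dual_opt,
# which (by the weak duality proven below) certifies both points as optimal.
x = [2.0, 0.0]
primal_val = c[0] * x[0] + c[1] * x[1]
assert primal_val == dual_opt == 2.0
```

Note how the dual has one variable per primal constraint and one constraint per primal variable.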

This problem may not seem significant at all, but we can prove that the values of the primal and dual are linked.

Weak Duality:

Weak Duality guarantees that the value of any dual solution will always act as a lower bound on the value of the primal solution. In the case of the simple linear programs here, the benefit of optimizing over the dual form instead of the primal may seem restricted to efficiency, but in harder optimization schemes the dual often has a much better optimization landscape. For our case of linear programs, it is often used when a problem has many more decision variables than it does constraints, since the dual has one decision variable per constraint instead.

For our basic linear program, we can prove weak duality directly from the constraints of each problem. Since we know that $Ax\geq b$ and $y\geq 0$, we also know that $b^Ty\leq (Ax)^Ty$, which we can combine with the fact that $A^Ty\leq c$ and $x\geq 0$ to prove the following.

$$b^Ty\leq (Ax)^Ty=x^TA^Ty\leq x^Tc=c^Tx$$

Since $b^Ty\leq c^Tx$ holds for every feasible pair, we know that $\max b^Ty\leq \min c^Tx$, proving weak duality for this problem.
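A randomized spot-check of this inequality, sketched on a made-up single-constraint LP (a sanity check, not a proof): every sampled feasible pair satisfies $b^Ty\leq c^Tx$.

```python
import random

# Illustrative data: min x1 + 2*x2  s.t.  x1 + x2 >= 2,  x >= 0.
# Its dual feasible set is 0 <= y <= 1 (A^T y <= c with A = [[1, 1]]).
A = [[1.0, 1.0]]
b = [2.0]
c = [1.0, 2.0]

def duality_gap(x, y):
    """c^T x - b^T y; weak duality says this is nonnegative for feasible x, y."""
    return (c[0] * x[0] + c[1] * x[1]) - b[0] * y

random.seed(0)
gaps = []
for _ in range(1000):
    x = [2.0 + 5 * random.random(), 5 * random.random()]  # x >= 0 and Ax >= b
    y = random.random()                                   # 0 <= y <= 1
    gaps.append(duality_gap(x, y))

assert min(gaps) >= 0.0  # every sampled pair respects b^T y <= c^T x
```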

We can also prove this for the general Lagrangian case by showing that $g(y)\leq f(x^*)$ for every $y\geq 0$, where $x^*$ is the solution of the primal, which we know satisfies $h(x^*)\leq 0$ since it has to satisfy the constraints.

$$
\begin{gather*}
L(x^*,y)=f(x^*)+y^Th(x^*)
\end{gather*}
$$

Since $x^*$ has to satisfy all of the constraints, we know that $h(x^*)\leq 0$, which combined with $y\geq 0$ gives $y^Th(x^*)\leq 0$ and therefore $L(x^*,y)\leq f(x^*)$, by the same reasoning as before. By definition of the minimum, the following chain is satisfied, thus proving weak duality for all dual problems.

$$
\begin{gather*}
g(y)=\min_{x\geq 0}L(x,y)\leq L(x^*,y)\leq f(x^*)
\end{gather*}
$$

Strong Duality:

Strong Duality is a condition of an optimization problem that guarantees that the value of the solution to the dual problem will give exactly the value of the primal’s solution. This does not hold for all primal-dual pairs, but we can prove that it holds for our specific case of linear programs, which means that there is a common optimal value $z=c^Tx^*=b^Ty^*$.

This can be proven by setting up a linear system that would find such a $y^*$, assuming that no such $y^*$ exists, and then finding a contradiction using Farkas’ Lemma, specifically the Inequality Form, which states that exactly one of the following is true for a matrix $A$ and a vector $b$.

  1. There exists $x\geq 0$ such that $Ax\leq b$
  2. There exists $y\geq 0$ such that $A^Ty\geq 0$ and $b^Ty<0$
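To see what the two alternatives look like concretely, here is a minimal sketch verifying certificates for two made-up one-dimensional systems (the numbers are purely illustrative); the lemma says exactly one alternative can hold per system.

```python
# System 1: A = [1], b = [1]. Alternative 1 holds:
# x = 0 satisfies x >= 0 and Ax = 0 <= 1.
farkas_x = 0.0
assert farkas_x >= 0 and 1.0 * farkas_x <= 1.0

# System 2: A = [1], b = [-1]. Alternative 1 fails (no x >= 0 has x <= -1),
# and alternative 2 holds: y = 1 satisfies y >= 0, A^T y = 1 >= 0, b^T y = -1 < 0.
farkas_y = 1.0
assert farkas_y >= 0 and 1.0 * farkas_y >= 0 and -1.0 * farkas_y < 0
```

The certificate $y$ in alternative 2 is exactly what the proof below extracts when the system for $y^*$ is assumed infeasible.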

Since we want a $y^*$ that satisfies the constraints, we know that $y^*$ needs to satisfy both $A^Ty^*\leq c$ to be feasible and $b^Ty^*\geq z$ (the requirement $y^*\geq 0$ comes for free, since the variable in the lemma’s first alternative is nonnegative). This lets us set up a system of inequalities as follows.

$$\begin{bmatrix}A^T\\-b^T\end{bmatrix}y\leq\begin{bmatrix}c\\-z\end{bmatrix}$$

If we assume that this system has no solution $y\geq 0$, then by the lemma there has to be some vector $[x;\lambda]\geq 0$ such that $Ax-\lambda b\geq 0$ and $c^Tx-\lambda z<0$, which can be rewritten as $Ax\geq\lambda b$ and $c^Tx<\lambda z$. We can break this up into cases based on the value of $\lambda$ to show that this is not possible.

In the case where $\lambda>0$, we can show a contradiction by dividing by $\lambda$, giving $A(x/\lambda)\geq b$ and $c^T(x/\lambda)<z$. The first inequality shows that $x/\lambda$ is feasible for the primal, and the second shows this feasible point attains a value strictly less than $z$, which is a contradiction because we defined $z$ as the minimum.

In the case where $\lambda=0$, the alternative implies $Ax\geq 0$ and $c^Tx<0$. We can add this point to the optimal $x^*$ to get $A(x+x^*)=Ax+Ax^*\geq 0+b=b$ and $c^T(x+x^*)=c^Tx+c^Tx^*<c^Tx^*=z$, which means that $x+x^*$ would be a feasible solution leading to a value lower than $z$, which again leads to a contradiction.

Since both cases lead to an impossible situation, we know that there must exist some $y^*\geq 0$ that satisfies both inequalities. Since this $y^*$ satisfies $b^Ty^*\geq z$, and we have already proven $b^Ty^*\leq z$ from weak duality, we know that $b^Ty^*=z$, thus proving strong duality for this linear program pair.
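As an end-to-end numeric check of strong duality, the sketch below solves a tiny made-up LP on both sides: the primal by brute-force vertex enumeration (the optimum of a bounded LP sits at a vertex of the feasible region), the dual by hand, and asserts the two optimal values agree.

```python
from itertools import combinations

# Made-up primal: min x1 + 2*x2  s.t.  x1 + x2 >= 2,  x >= 0.
c = [1.0, 2.0]
# All constraints as rows of "row . x >= rhs", sign constraints included.
rows = [([1.0, 1.0], 2.0), ([1.0, 0.0], 0.0), ([0.0, 1.0], 0.0)]

def intersect(r1, r2):
    """Intersect two constraint boundaries via Cramer's rule (None if parallel)."""
    (a1, b1), (a2, b2) = r1, r2
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < 1e-12:
        return None
    return [(b1 * a2[1] - a1[1] * b2) / det, (a1[0] * b2 - b1 * a2[0]) / det]

def feasible(x):
    return all(r[0] * x[0] + r[1] * x[1] >= rhs - 1e-9 for r, rhs in rows)

# Enumerate all pairwise boundary intersections, keep the cheapest feasible one.
primal_opt = min(c[0] * v[0] + c[1] * v[1]
                 for p, q in combinations(rows, 2)
                 if (v := intersect(p, q)) and feasible(v))

# Dual: max 2*y  s.t.  y <= 1, y <= 2, y >= 0, solved by inspection at y = 1.
dual_opt = 2.0 * min(1.0, 2.0)

assert abs(primal_opt - dual_opt) < 1e-9  # strong duality: both equal 2
```

The zero duality gap here is exactly what the Farkas argument above guarantees for feasible, bounded linear programs.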