Nov. 13, 2018 $\newcommand{\bs}{\boldsymbol}$ $\newcommand{\argmin}[1]{\underset{\bs{#1}}{\text{arg min}}\,}$ $\newcommand{\argmax}[1]{\underset{\bs{#1}}{\text{arg max}}\,}$ $\newcommand{\tr}{^{\top}}$ $\newcommand{\norm}[1]{\left|\left|\,#1\,\right|\right|}$ $\newcommand{\given}{\,|\,}$ $\DeclareMathOperator{\E}{\mathbb{E}}$ $\newcommand{\blue}[1]{{\color{blue} {#1}}}$ $\newcommand{\red}[1]{{\color{red} {#1}}}$ $\newcommand{\orange}[1]{{\color{orange} {#1}}}$

Let $P \subset \mathbb{R}^n$ be a `polyhedron`

in standard form, characterized by

$$ P = \{\bs{x} \in \mathbb{R}^n : \bs{A}\bs{x} =\bs{b}, \bs{x}\geq \bs{0}\}, \tag{1} $$

such that $\bs{A}\in \mathbb{R}^{m\times n}$ is a matrix with full row rank, and $\bs{b}\geq \bs{0}$.

We call $\overline{\bs{x}} \in \mathbb{R}^n$ a `basic solution`

(BS) if

- $\bs{A}\overline{\bs{x}} = \bs{b}$.
- At least $n$ active constraints are linearly indepedent. (A constraint $f(x) \geq 0$ is active at $\overline{x}$ if $f(\overline{x})=0$.)

A basic solution $\overline{\bs{x}}$ is a `basic feasible solution`

(BFS) if $\overline{\bs{x}}\in P$. This definition makes sense intuitively as it characterizes the "corners" in a polyhedron.

Any linear program (LP) can be reformulated as

$$ \min_{\bs{x}\in P} \quad \bs{c}\tr \bs{x} \tag{2} $$

where $\bs{c}$ is a fixed $n$-vector and $P$ is given by $\blue{(1)}$. We have the following important theorem.

Theorem 1Suppose we have a non-empty polyhedron in standard form (1), and a linear program given by (2). Then we have that $\bs{x}$ is a basic solution if and only if $\bs{x}$ solves the linear system $\bs{A}\bs{x}=\bs{b}$ and there exists $m$ column indices$$ B = \{B_1, B_2,..., B_m\} $$

such that the columns $\bs{A}_{B_1},..., \bs{A}_{B_m}$ of $\bs{A}$ are linearly independent and $x_j=0$ if $j \not\in B$.

*Proof*

$(\Rightarrow)$ Suppose that $\bs{x}$ is a basic solution. Then $\bs{A}\bs{x}=\bs{b}$. Let $B_1,..., B_k$ be the nonzero indices of $\bs{x}$. Then we have the linear system

$$ \left\{\begin{aligned} & \bs{b}=\bs{A}\bs{x} = \left(\bs{A}_{B_1},..., \bs{A}_{B_k}\right)\begin{pmatrix} x_{B_1} \\ \vdots \\ x_{B_k} \end{pmatrix} \\ & x_j=0, \quad \forall j\not\in \{B_1,..., B_k\}. \end{aligned}\right. $$

This is a $(n\times n)$ linear system with rank $n$ since by the definition of basic solution, there must be at least $n$ linearly independent active constraints. As a direct result, the system has a unique solution. Hence, the system

$$ \left(\bs{A}_{B_1},..., \bs{A}_{B_k}\right)\begin{pmatrix} x_{B_1} \\ \vdots \\ x_{B_k} \end{pmatrix} = \bs{b} $$

has a unique solution, showing that the columns $\bs{A}_{B_1},..., \bs{A}_{B_k}$ are linearly independent.

We also assumed that $\bs{A}$ has full row rank, i.e. $\text{rank}(\bs{A})=m$. Hence, $k\leq m$. Extending basis if necessary, we obtain $\{B_1,..., B_m\}$ that satisfies Thereom 1. This proves the forward direction.

$(\Leftarrow)$ Now suppose that $\bs{x} \in \mathbb{R}^n$ satisfies $\bs{A}\bs{x} = \bs{b}$ and there exists column indieces $B=\{B_1,..., B_m\}$ such that $\bs{A}_{B_1},..., \bs{A}_{B_m}$ are linearly independent and $x_j=0$ if $j\not\in B$. Similar as before, we have the following system of active constraints:

$$ \left\{\begin{aligned} & \bs{b}=\bs{A}\bs{x} = \left(\bs{A}_{B_1},..., \bs{A}_{B_m}\right)\begin{pmatrix} x_{B_1} \\ \vdots \\ x_{B_m} \end{pmatrix} \\ & x_j=0, \quad \forall j\not\in \{B_1,..., B_m\}. \end{aligned}\right. $$

After reordering the variables, the matrix of the system becomes

$$ \begin{pmatrix} (\bs{A}_B)_{m\times m} & \\ & \bs{I}_{n-m} \end{pmatrix}. $$

where $\bs{A}_B = \left(\bs{A}_{B_1},..., \bs{A}_{B_m}\right)$, is the $m\times m$ matrix consisting of the $m$ `basic columns`

of $\bs{A}$. This matrix has rank $n$; thererfore, we obtain $n$ active constraints at $\bs{x}$ that are linearly independent. Hence $\bs{x}$ is a basic solution.

$\blacksquare$ *QED* $\blacksquare$

Thus, for a basic solution $\bs{x}$, the variables $x_{B_1},..., x_{B_m}$ are called `basic variables`

. The remaining variables are called `non-basic variables`

. For notational convenience, we represent the set of non-basic indices by

$$ N=\{N_1,..., N_{n-m}\}. $$

Also we let $\bs{x}_B = (x_{B_1},..., x_{B_m})\tr$ denote the vector of basic variables and let $\bs{x}_N = (x_{N_1},..., x_{N_{n-m}})\tr$ denote the vector of non-basic variables.

Using this notation, we can rewrite $\bs{A}\bs{x} = \bs{b}$ as

$$ \begin{pmatrix} \bs{A}_B & \bs{A}_N \end{pmatrix}\begin{pmatrix} \bs{x}_B \\ \bs{x}_N \end{pmatrix} = \bs{b}, $$

where $\bs{A}_N = (\bs{A}_{N_1},..., \bs{A}_{N_{n-m}})$, and $\bs{x}_N = \bs{0}$ from Theorem 1. Since $\bs{A}_B$ is invertible, the sufficient condition for a basic solutions is

$$ \bs{x}_N=0 \quad \text{ and } \quad \bs{x}_B = \bs{A}_B^{-1}\bs{b} \geq 0. \tag{3} $$

Consider the linear program expressed in the form below:

$$ \begin{aligned} \min &\quad \bs{c}\tr\bs{x} \\ \text{s.t.} &\quad \bs{A}\bs{x}=\bs{b}, \,\, \bs{x}\geq 0, \,\, \bs{b} \geq 0 \\ & \quad \bs{A} \in \mathbb{R}^{m\times n}, \,\, \text{rank}(\bs{A}) =m. \end{aligned} \tag{4} $$

Since $\blue{(4)}$ is a convex optimization, such that linear constraint qualification (LCQ) holds, we have that the Karush-Kuhn-Tucker (KKT) conditions are both necessary and sufficient for optimality. The KKT conditions can be written in the vectorized form:

$$ \begin{aligned} (\text{PF}) & \quad \bs{A}\bs{x}=\bs{b}, \,\, \bs{x}\geq \bs{0} \\ (\text{DF}) & \quad \bs{c} - \bs{\lambda} + \bs{A}\tr \bs{\mu} = \bs{0}, \,\, \bs{\lambda}\geq \bs{0} \\ (\text{CS}) & \quad \bs{\lambda}\tr \bs{x} = \bs{0}. \end{aligned} \\ $$

It can be shown that if the above system is solvable, one optimal solution must be a basic feasible solution. Note that the primal feasibility (PF) condition can be replaced by $\blue{(3)}$. Let $B$ and $N$ denote the sets of basic and non-basic indices, we can rewrite the KKT conditions as

$$ \begin{aligned} (\text{PF}) & \quad \bs{x}_B = \bs{A}_B^{-1} \bs{b} \geq 0, \,\, \bs{x}_N = \bs{0} & \\ (\text{DF}) & \quad \bs{\lambda}_B, \bs{\lambda}_N \geq \bs{0} & (\text{DF}1) \\ & \quad \bs{c}_B - \bs{\lambda}_B + \bs{A}_B\tr \bs{\mu} = \bs{0} & (\text{DF}2) \\ & \quad \bs{c}_N - \bs{\lambda}_N + \bs{A}_N\tr \bs{\mu} = \bs{0} & (\text{DF}3) \\ (\text{CS}) & \quad \bs{\lambda}_B\tr \bs{x}_B = \bs{0}, \,\, \bs{\lambda}_N\tr \bs{x}_N = \bs{0} & \\ \end{aligned} \tag{5} $$

For this post, we assume that the LP is `non-degenerate`

, i.e. that $\bs{x}_B = \bs{A}_B^{-1}\bs{b}>0$. By (CS), we have that

$$ \bs{\lambda}_B=\bs{0}.\tag{6} $$

Hene, by (DF2), we have

$$ \bs{\mu} = -(\bs{A}_B\tr)^{-1}\bs{c}_B. \tag{7} $$

Substitute $\blue{(7)}$ into (DF3) we obtain

$$ \bs{c}_N - \bs{A}_N\tr(\bs{A}_B\tr)^{-1}\bs{c}_B = \bs{\lambda}_N \geq \bs{0}. \tag{8} $$

We can augment $\blue{(8)}$ by adding the basic variables:

$$ \begin{aligned} & \begin{pmatrix} \bs{c}_B\\ \bs{c}_N \end{pmatrix} - \begin{pmatrix} \bs{A}_B\tr \\ \bs{A}_N\tr \end{pmatrix}(\bs{A}_B\tr)^{-1} \bs{c}_B \geq \bs{0}. \\ & \implies \bs{c} - \bs{A}\tr (\bs{A}_B\tr)^{-1}\bs{c}_B \geq \bs{0}. \\ & \implies \bs{c}\tr - \bs{c}_B\tr \bs{A}_B^{-1}\bs{A} \geq \bs{0}\tr. \end{aligned} \tag{9} $$

From $\blue{(9)}$, we define the `reduced cost`

$\overline{\bs{c}}$ as a vector that satisfies

$$ \overline{\bs{c}}\tr = \bs{c}\tr - \bs{c}_B\tr \bs{A}_B^{-1}\bs{A} \geq \bs{0}\tr. \tag{10} $$

An immediate consequence of $\blue{(9)}$ and $\blue{(10)}$ is that $\overline{c}_j = 0$ for all $j\in B$.

**Question 1**: Why is $\overline{\bs{c}}$ called the reduced cost?

Everytime we change the basis set $B$, we are increasing the values of some $x_j$ where $j$ is a non-basic index initially. Suppose we want to increase $x_j$ for some $j \in N$ by assigning $x_j \leftarrow \theta$, for some $\theta>0$. (Initially we have $x_j=0$.) If that is all we do, then the end result will not satisfy $\bs{A}\bs{x} = \bs{b}$, so we do not have a basic solution anymore. Instead, we need to adjust $\bs{x}_B$ by $\bs{x}_B + \theta\,\bs{d}_B$ so that

$$ \begin{aligned} & \bs{A}_B(\bs{x}_B + \theta\, \bs{d}_B) + \theta\,\bs{A}_j = \bs{b} \\ \implies & \theta\, \bs{A}_B \bs{d}_B = \bs{b} - \bs{A}_B \bs{x}_B - \theta\, \bs{A}_j \\ \implies & \bs{d}_B = -\bs{A}_B^{-1} \bs{A}_j. \end{aligned} $$

Hence, under the adjustments $x_j \leftarrow \theta$ and $\bs{x}_B \leftarrow \bs{x}_B + \theta\, \bs{d}_B$, the value of the objective function is modified by

$$ \begin{aligned} \bs{c}\tr \bs{x} &\leftarrow \bs{c}\tr \bs{x} + \theta\,(c_j + \bs{c}_B\tr \bs{d}_B) \\ &= \bs{c}\tr \bs{x} + \theta\,(c_j - \bs{c}_B\tr \bs{A}_B^{-1}\bs{A}_j) \\ &= \bs{c}\tr \bs{x} + \theta\,\overline{c}_j \end{aligned} $$

In other words, adding $\theta$ to $x_j$ increases the objective function by $\theta\, \overline{c}_j$. So as long as $\overline{c}_j <0$ (this is a minimization problem), we are making progress! This agrees with (10): when all components of the cost is non-negative, we cannot make any more progress, implying that we are at an optimal solution.

We summarize the result into the following theorem.

Theorem 2Suppose that $\bs{x}$ is a BFS of the LP given by $\blue{(4)}$, and define the reduce cost $\overline{\bs{c}}$ by $\blue{(10)}$. Then we have

- If $\overline{\bs{c}} \geq \bs{0}$, then $\bs{x}$ is optimal.
- If $\bs{x}$ is optimal and $\bs{x}_B > \bs{0}$, then $\overline{\bs{c}} \geq \bs{0}$.

The reduced cost $\overline{\bs{c}}$ is essential for the development of the simplex method. Suppose that $\bs{x}$ is a BFS with $\bs{x}_B >0$ and $\overline{c}_j<0$ for some $j \in N$. By Theorem 2, $\bs{x}$ is not optimal. But we can try to find a better solution $\bs{x} + \theta\,\bs{d}$ where $\theta>0$ and $\bs{d} \in \mathbb{R}^n\setminus \{0\}$.

Inspired by $\blue{(10)}$, we construct the direction $\bs{d}$ in the following way:

$$ \begin{aligned} \bs{d}_B &= -\bs{A}_B^{-1} \bs{A}_j & \quad (11.1)\\ d_j &= 1 & \quad (11.2)\\ d_i &= 0 \text{ for all } i \in N\setminus\{j\} & \quad (11.3) \end{aligned}\tag{11} $$

Under this construction, we have that

$$ \begin{aligned} & \bs{c}\tr(\bs{x}+\theta\,\bs{d}) - \bs{c}\tr \bs{x} & \\ =\,\, & \theta\, \bs{c}\tr \bs{d} &\\ =\,\, & \theta\,(\bs{c}_B\tr \bs{d}_B + c_jd_j) &\quad \orange{(11.3)} \\ =\,\, & \theta\,(-\bs{c}_B\tr \bs{A}_B^{-1} \bs{A}_j + c_j) &\quad \orange{(11.1) + (11.2)} \\ =\,\, & \theta\, \overline{c}_j &\quad \orange{(9) + (10)} \\ <\,\, & 0 \\ \implies \,\, & \bs{c}\tr \bs{d} < 0. \end{aligned} \tag{12} $$

Moreover,

$$ \begin{aligned} & \bs{A}(\bs{x}+\theta\,\bs{d}) & \\ =\,\, & \bs{A}\bs{x} + \theta\,\bs{A}\bs{d} & \\ =\,\, & \bs{b} + \theta\left(\bs{A}_B\bs{d}_B + d_j \bs{A}_j\right) &\quad \orange{(11.2)+(11.3)} \\ =\,\, & \bs{b} + \theta\left(-\bs{A}_B \bs{A}_B^{-1}\bs{A}_j + \bs{A}_j\right) &\quad \orange{(11.1)} \\ =\,\, & \bs{b} \\ \implies \,\, & \bs{A}\bs{d}=\bs{0}. \end{aligned}\tag{13} $$

Finally, since $\bs{x} \geq 0$, we choose a sufficiently small $\theta$ to guarantee that

$$ \bs{x} + \theta\, \bs{d} \geq 0.\tag{14} $$

Combining $\blue{(12)}, \blue{(13)}$ and $\blue{(14})$, we conclude that $\bs{x} + \theta\, \bs{d}$ is a feasible solution that is "more optimal" than $\bs{x}$.

**Question 2**: How large should the stepsize $\theta$ be?

If $\bs{d}\geq \bs{0}$, we donnot have to worry about $\blue{(14)}$, so we can take arbitrarily large step size $\theta$. In this case, the LP is unbounded below. Otherwise, suppose that $d_{B_i} < 0$ for some $B_i \in B$. (Recall that $\bs{d}_{N} \geq 0$ by definition.) In this case, the stepsize can only be as big such that

$$ x_{B_i} + \theta\, d_{B_i} \geq 0. $$

So we take as big as step as we can, until the first $x_{B_l}$ hits zero, where

$$ l = \argmin{i\,:\,d_{B_i} <0} \left\{-\frac{x_{B_i}}{d_{B_i}} \right\}. $$

At this point $l$ leaves the basis set $B$ (and enters $N$), and $j$ enters the basis set $B$ (and leaves $N$). Beautiful! Let's summarize it in a Theorem below.

Theorem 3Suppose that $\bs{x}$ is BFS of $\blue{(4)}$ with reduce cost$$\overline{c}_j = c_j - \bs{c}_B\tr \bs{A}_B^{-1} \bs{A}_j <0.$$

Consider the updated solution $\bs{y} = \bs{x}+\theta^*\bs{d}$ where $\bs{d}$ is chosen according to $\blue{(11)}$, and

$$ \theta^* = -\frac{x_{B_l}}{d_{B_l}} = \min_{\{i\,:\,d_{B_i}<0\}}\left\{-\frac{x_{B_i}}{d_{B_i}}\right\}. \tag{15} $$

Then $\bs{y}$ is a BFS associated with the basis matrix

$$ \bs{A}_{B^*} = \left(\bs{A}_{B_1}, \cdots, \bs{A}_{B_{l-1}}, \bs{A}_j, \bs{A}_{B_{l+1}}, \cdots, \bs{A}_{B_m}\right), $$

and the basic indices $B^* = \{B^*_1,..., B^*_m\}$, where

$$ B^*_i = \begin{cases} B_i & i\neq l \\ j & i=j \end{cases} $$

After the update, variable $x_{B_l}$ leaves the basis and $x_j$ enters the basis.

So far we have assumed that $\bs{x}_B >0$, and by $\blue{(12)}$, we have that $\bs{c}\tr \bs{d} < 0$, and the objective function decreases. We will briefly discuss the case when $x_{B_i}=0$ for some $B_i \in B$. (As a consequence, we have $b_i=\bs{A}_{B_i} x_{B_i} = 0$.) This corresponds to a degenerate case when multiple (more than $n$) constraints intersect at a single point. In this scenario, the algorithm will sometimes choose $\theta=0$ in order to "make progress" (see $\blue{(15)}$), but special care must be taken in order to avoid cycling repeatedly over the same indices without ever reaching the optimal solution. One possible solution is to "perturb" $\bs{b}$ by adding a small noise and solve. Another way is to use the `Bland's rule`

, where at each step we choose the smallest $j$ such that $\overline{c}_j<0$ and if there are multiple variables $x_{B_l}$ that exists the basis set, we pick the one with the smallest index.

The simplex method has a handy `tableau implementation`

. First we construct the following tableau.

$-f_B$ | $\overline{c}_1$ | $\cdots$ | $\red{\overline{c}_j}$ | $\cdots$ | $\overline{c}_n$ |
---|---|---|---|---|---|

$b_1$ | $a_{11}$ | $\cdots$ | $a_{1j}$ | $\cdots$ | $a_{1n}$ |

$\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ | $\cdots$ | $\vdots$ |

$b_l$ | $a_{l1}$ | $\cdots$ | $\red{a_{lj}^*}$ | $\cdots$ | $a_{ln}$ |

$\vdots$ | $\vdots$ | $\ddots$ | $\vdots$ | $\cdots$ | $\vdots$ |

$b_m$ | $a_{m1}$ | $\cdots$ | $a_{mj}$ | $\cdots$ | $a_{mn}$ |

where $a_{ij}$ are the entries of $\bs{A}_B^{-1}\bs{A}$, and $\overline{c}_i$ are the reduced costs based on the current basis set $B$ (in the case where $\bs{B}=\bs{I}_m$, the terms $a_{ij}$ are simply the entries of the origial matrix $\bs{A}$), and $f_B$ is the value of the objective function at the current BFS. We will not talk about strategies for finding a initial feasible solution in this post.

To select the pivot row index $l$, we start with $\blue{(15)}$ and have

$$ \begin{aligned} l &= \argmin{\{i\,:\,d_{B_i}<0\}}\left\{-\frac{x_{B_i}}{d_{B_i}}\right\} \\ &= \argmin{\{i\,:\,d_{B_i}<0\}}\left\{-\frac{\left[\bs{A}_B^{-1}\bs{b}\right]_i}{-\left[\bs{A}_B^{-1}\bs{A}_j\right]_i}\right\} \\ &= \argmin{\{i\,:\,a_{ij}>0\}}\left\{\frac{b_i}{a_{ij}}\right\}. \end{aligned} $$

Note that the objective value under $B$ is

$$ \begin{aligned} f(\bs{x}) &= \bs{c}\tr \bs{x} \\ &= \overline{\bs{c}}\tr \bs{x} + \bs{c}_{B}\tr \bs{A}_B^{-1} \bs{A}\bs{x} \\ &= \overline{\bs{c}}\tr \bs{x} + \bs{c}_{B}\tr \bs{A}_B^{-1} \bs{b} \\ &= \overline{\bs{c}}_B\tr\bs{x}_B + \overline{\bs{c}}_N\tr\bs{x}_N + \bs{c}_{B}\tr \bs{A}_B^{-1} \bs{b} \\ &= \bs{c}_{B}\tr \bs{A}_B^{-1} \bs{b} \quad \blue{(\text{since } \overline{\bs{c}}_B =\bs{0},\,\, \bs{x}_N = \bs{0})} \end{aligned} $$

Having chosen pivot element $\red{a_{lj}^*}$, we find the new basis index $B^*$, and update the tableau by computing the following

$-f_{B^*} = -\bs{c}_{B^*} \tr \bs{A}_{B^*}^{-1} \bs{b}$ | $\overline{\bs{c}}\tr = \bs{c}\tr - \bs{c}_{B^*}\tr \bs{A}_{B^*}^{-1} \bs{A}$ |
---|---|

$\bs{x}_{B^*} = \bs{A}_{B^*}^{-1}\bs{b}$ | $\bs{A}_{B^*}^{-1} \bs{A}$ |

Hence, we adopt the following procedure:

- Select the pivot column as any column $j$ with $\overline{c}_j <0$.
- Given $j$, we select the
`pivot row`

whose index $l$ such that

$$ \frac{b_l}{a_{lj}} = \min_{1\leq i \leq m} \left\{\frac{b_i}{a_{ij}} : a_{ij} > 0 \right\}. \tag{16} $$

- If there exists any $\overline{c}_k <0$ such that $a_{ik} \leq 0$ for all $i$, then the LP is unbounded below.

Inverting a matrix at each step of the iteration is costly (the complexity is $O(m^3)$). Suppose that the columns of $\bs{A}$ are arranged so that $\bs{A}=\left(\bs{A}_{B}, \,\, \bs{A}_{N}\right)$. Then,

$$ \bs{A}_{B}^{-1} \bs{A} = \bs{A}_{B}^{-1} \left(\bs{A}_{B},\,\, \bs{A}_{N}\right) = \left(\bs{I}_m,\,\, \bs{A}_{B}^{-1} \bs{A}_{N}\right). $$

Likewise, assuming that $\bs{c}\tr$ is arranged so that $\bs{c}\tr = \left(\bs{c}_B\tr,\,\, {\bs{c}}_N\tr\right)$. The updated cost is given by

$$ \begin{aligned} \overline{\bs{c}}\tr &= \bs{c}\tr - \bs{c}_{B}\tr \bs{A}_{B}^{-1} \bs{A} \\ &= \left(\bs{c}_B\tr,\,\, {\bs{c}}_N\tr\right) - \bs{c}_{B}\tr \left(\bs{I}_m,\,\, \bs{A}_{B}^{-1} \bs{A}_{N}\right) \\ &= \left(\bs{c}_B\tr,\,\, {\bs{c}}_N\tr\right) - \left(\bs{c}_{B}\tr,\,\, \bs{c}_{B}\tr\bs{A}_{B}^{-1} \bs{A}_{N}\right) \\ &= \left(\bs{0}_{m}\tr,\,\, {\bs{c}}_N\tr-\bs{c}_{B}\tr\bs{A}_{B}^{-1} \bs{A}_{N}\right) \end{aligned} $$

This is reminiscent of elementary row operations. Indeed, at each step of the iteration, there will be one column $B_l$ leaving the basis and column $j$ entering the basis. This is equivalent of pivoting at the $j$th column of the array with entry $\red{a_{ij}^*}$ as the pivot.

Alternatively, each basis update from $\bs{A}_B$ to $\bs{A}_{B^*}$ can be expressed as

$$ \bs{A}_{B^*} = \bs{A}_B + (\bs{A}_j - \bs{A}_{B_l})\,\bs{e}_l\tr, $$

where $\bs{e}_l \in \mathbb{R}^m$ is the $k$th standard basis vector (the $k$th element is 1 and all other elements are 0).

We can apply the `Sherman-Morrison`

inversion formula and express the inverse of the updated basis matrix as

$$ \bs{A}_{B^*}^{-1} = \bs{A}_B^{-1} - \frac{\bs{A}_B^{-1}(\bs{A}_j-\bs{A}_{B_l})\, \bs{e}_l\tr \bs{A}_B^{-1}}{1+\bs{e}_l\tr \bs{A}_B^{-1} (\bs{A}_j - \bs{A}_{B_l})}. \tag{17} $$

Suppose we have the following linear program:

$$ \begin{aligned} \min \,\, f(x_1, x_2) &= -2x_1-x_2 \\ \text{s.t.} \quad x_1+x_2 &\leq 10 \\ \quad 2x_1-x_2 &\leq 5 \\ x_1, x_2 & \geq 0 \end{aligned} $$

By introducing slack variables $x_3$ and $x_4$, the standard form of the LP can be expressed as

$$ \begin{aligned} \min \quad\quad\quad\quad f(x_1, x_2) &= -2x_1-x_2 \\ \text{s.t.} \quad x_1+x_2+x_3 + \phantom{x_4} &= 10 \\ \quad 2x_1-x_2 + \phantom{x_3} + x_4 &= 5 \\ x_1, x_2, x_3, x_4 & \geq 0 \end{aligned} $$

0 | -2 | -1 | 0 | 0 |
---|---|---|---|---|

10 | 1 | 1 | 1 | 0 |

5 | $\red{2}$ | -1 | 0 | 1 |

- Choose $j=1$ as $\overline{c}_1 =-2<0$.
- Choose $l=2$ as $5/2 < 10/1$.
- $x_1$ enters the basis and $x_4$ leaves the basis
- The BFS is given by $(0,0,10,5)\tr$.
- The basis set $B=\{3, 4\}$

5 | 0 | -2 | 0 | 1 |
---|---|---|---|---|

15/2 | 0 | $\red{3/2}$ | 1 | -1/2 |

5/2 | 1 | -1/2 | 0 | 1/2 |

- Choose $j=2$ as $\overline{c}_2 =-2<0$.
- Choose $l=1$ as $3/2>0$.
- $x_2$ enters the basis and $x_3$ leaves the basis
- The BFS is given by $(5/2, 0, 15/2, 0)\tr$.
- The basis set $B=\{3, 1\}$

15 | 0 | 0 | 4/3 | 1/3 |
---|---|---|---|---|

5 | 0 | 1 | 2/3 | -1/3 |

5 | 1 | 0 | 1/3 | 1/3 |

- The BFS is given by $(5,5,0,0)\tr$.
- The basis set $B=\{2,1\}$
- The current BFS is optimal since $\overline{\bs{c}}\geq \bs{0}$.
- The minimum objective value is thus equal to -15.