Four Subspaces

Feb. 20, 2019 $\newcommand{\bs}{\boldsymbol}$ $\newcommand{\argmin}[1]{\underset{\bs{#1}}{\text{arg min}}\,}$ $\newcommand{\argmax}[1]{\underset{\bs{#1}}{\text{arg max}}\,}$ $\newcommand{\tr}{^{\top}}$ $\newcommand{\norm}[1]{\left|\left|\,#1\,\right|\right|}$ $\newcommand{\given}{\,|\,}$ $\newcommand{\st}{\,\big|\,}$ $\newcommand{\E}[1]{\mathbb{E}\left[#1\right]}$ $\newcommand{\P}[1]{\mathbb{P}\left(#1\right)}$ $\newcommand{\abs}[1]{\left|#1\right|}$ $\newcommand{\blue}[1]{{\color{blue} {#1}}}$ $\newcommand{\red}[1]{{\color{red} {#1}}}$ $\newcommand{\orange}[1]{{\color{orange} {#1}}}$ $\newcommand{\pfrac}[2]{\frac{\partial #1}{\partial #2}}$

The concept of the four fundamental subspaces is the the highlight of linear algebra. It is what makes the subject so beautiful, and an absolute joy to study. In low levels, we are often presented with the following results.

Theorem 1   Let $A$ be an $m\times n$ matrix with rank $r$. And let $R = \text{rref}(A)$ be the reduced row echelon form of $A$. Let $C(A)$ and $C(A\tr)$ denote the column space and row space of $A$, respectively. Then we have that $C(A\tr) =C(R\tr)$ and $N(A)=N(R)$. And,

$$\begin{aligned} \dim C(A) = \dim C(A\tr) &= r\\ \dim C(A\tr) + \dim N(A) &= n \\ \dim C(A) + \dim N(A\tr) &= m \end{aligned} $$

For example, if a matrix $A$ is factorized into

$$ A = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 5 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 3 & 0 & 5 \\ 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 0 \end{bmatrix} = LU = E^{-1}R. $$

From this LU factorization (obtained by doing elementary row operations), we can immediately conclude the following.

  1. The row space $C(A\tr)$ has basis $(1, 3, 0, 5)'$ and $(0, 0, 1, 6)'$ just by reading the rows of the matrix $U$.
  2. The column space $C(A)$ has basis $(1,2,5)'$ and $(0, 1, 0)'$ obtained as the first two columns of $L$ since the first two rows of $U$ are the pivot rows.
  3. The null space $N(A)$ has basis $(-3,1,0,0)'$ and $(-5,0,-6,1)'$ with dimension 2. From $U$ we see that column 2 and column 4 are the free columns. We set the free variables to $(1, 0)$ and $(0, 1)$ to obtain a basis for the nullspace.
  4. The left null space $N(A\tr)$ has basis $(-5,0,1)$. This is found by noting that since the law row of $U$ consists of all zeros. Hence we take the las row of $E$ obtained by reversing the elementary operation of $E^{-1}$.

The beauty is further augmented by introducing the notion of orthogonality, which we will discuss in detail next time. We have the following theorem.

Theorem 2   The column space $C(A)$ is the orthogonal complement of the left null space $N(A\tr)$. THe row space $C(A\tr)$ is the orthogonal complement of the null space $N(A)$. Using the orthogonal decomposition every $x\in \mathbb{R}^n$ can be uniquely represented as $x=x_r+x_n$. Matrix multiplication by $Ax=b$ is equivalent to mapping $x_r$ to $b$ in the column space and $x_n$ to 0.

Strang's book has a famous picture that illustrates this theorem.

For example, suppose $A=\begin{bmatrix}1&-1\\0&0\\0&0\end{bmatrix}$ and $x=\begin{bmatrix}2\\0\end{bmatrix}$. Then we have

$$ \begin{aligned} C(A) &= \text{span}\{(1,0,0)'\} \\ N(A\tr) &= \text{span}\{(0,1,0)',(0,0,1)'\} \\ C(A\tr) &= \text{span}\{(1,-1)'\} \\ N(A) &= \text{span}\{(1,1)'\}. \end{aligned} $$

Multiplying $A$ times $x$ yields the vector $(2,0,0)$ which lives in the column space of $A$. This illustrated by the following figure.

The beauty doesn't stop here. The cool thing is that all the results above can be derived without matrices at all! This is the beauty of abstraction. If you can prove that a result holds for a vector space, or more generally a metric space, instead of just $\mathbb{R}$ for example, then you can naturally generalize it to $\mathbb{R}^n$. A good example is shown in the proof of the continuous mapping theorem from my previous post.

Rank Nullity

Recall that a vector space $V$ over a field $\mathbb{F}$, often referred to as a $\mathbb{F}$-vector space, is a nonempty set with 2 operations: $\bs{+}: V\times V \to V$ and $\bs{\cdot} : \mathbb{F}\times V \to V$, such that

  1. $(V, +)$ is an abelian group.
  2. For all $a, b\in \mathbb{F}$ and $v\in V$, we have that $(c+d)\cdot v = c\cdot v + d\cdot v$.
  3. For all $v, w\in V$ and $c\in \mathbb{F}$, we have that $c\cdot (v+w)=c\cdot v+c\cdot w$.

Let $U, V$ be $\mathbb{F}$-vector spaces. A map $T:U\to V$ is called a linear map if it preserves vector addition and scalar multiplication. That is, for any $c_1, c_2 \in \mathbb{F}$, and $u_1, u_2 \in U$, we have

$$ T(c_1u_1+c_2u_2) = c_1T(u_1) + c_2T(u_2). $$

The kernel, or the null space, of $T$ is a subspace of $U$ given by:

$$ \text{ker}(T)=N_T = \{u\in U\st T(u)=0_V\}. $$

The image, or the range, of $T$ is a subspace of $V$ given by:

$$ \text{im}(T) = R_T = \{T(u) \st u\in U\}. $$

Now we are ready to state a result analogous to a part of Theorem 1 below.

Theorem 3 (Rank-Nullity)   Let $T: U\to V$ be a linear map. Suppose $U$ is finite-dimensional. Then

$$\dim N_T + \dim R_T = \dim U.$$

Proof   Since $N_T$ is a subspace of $U$, we have that

$$ \dim N_T + \dim U\big/N_T = \dim U, $$

where $U\big/N_T$ is the quotient space of $U$ modulo $N_T$. It suffices to show that $U\big/ N_T \cong R_T$. To do so, we define a map:

$$ \begin{aligned} S : U\big/N_T &\to R_T \\ u+N_T &\mapsto T(u). \end{aligned} $$

Note that here $u+N_T\in U\big/N_T$ denotes an equivalent class on $U$:

$$ u+N_T = \{u'\in U \st u'-u \in N_T\}. $$

First we show the map $S$ is well-defined. If $u+N_T = u'+N_T$, then

$$ \begin{aligned} T(u+N_T) &= T(u) \\ &= T(u+u'-u') \\ &= T(u' + (u-u')) \\ &= T(u') + T(u-u') \\ &= T(u') & [\text{since } u-u'\in N_T]\\ &= T(u'+N_T). \end{aligned} $$

The map $S$ is linear since $T$ is linear. To show that $S$ is one-to-one, we will examine the kernel of $S$. If $S(u+N_T) = 0$, then $T(u)=0$, so $u\in N_T$. This means that $u+N_T = 0+N_T = 0$. This shows that the $S$ is one-to-one. The map $S$ is onto since for all $v\in R_T$, there exists a $u\in U$, such that $v=T(u)$. Hence, we have $S(u+N_T) = T(u) = v$ as desired. This shows that $S$ is an isomorphism, which concludes the proof. $\blacksquare$


To complete the proof of Theorem 1 in the general setting, we need to introduce the notion of duality. Let $\mathcal{L}(U, V)$ denote the linear space of all linear maps from $U$ and $V$, both are $\mathbb{F}$-vector spaces. For any vector space $V$, the space $\mathcal{L}(V, \mathbb{F})$ is called the dual space of $V$, denoted as $V'$.

Lemma 1   Fix a basis $v_1,...,v_n$ for $V$, the dual space $V'$ has a basis $l_1,..., l_n$ where

$$l_i(v_j) = \delta_{ij}.$$

where $\delta_{ij}=1$ if $i=j$ and 0 otherwise.

Proof   To show that $l_1,..., l_n$ is independent, take any constants $\alpha_1,..., \alpha_n$, such that

$$ \alpha_1 l_1 + \cdots + \alpha_n l_n = 0_{V'}. $$

Since both sides of the equation are functionals, we can plugin $v=v_i$ and yields $\alpha_i = 0$ for all $i=1,...,n$.

Now we show that the elements $l_1,..., l_n$ span $V'$. Take $l\in V'$. For any $v\in V$, we can express $v = \sum_{i=1}^n \alpha_i v_i$. Hence,

$$ l(v) = \sum_{i=1}^n \alpha_i l(v_i) = \sum_{i=1}^n l_i(v) l(v_i). $$

The last equality holds since

$$l_i(v) = l_i(\sum_{i=1}^n \alpha_i v_i) = \alpha_i l_i(v_i) =\alpha_i,$$

since if $i\neq j$, all the terms $l_i(v_j) =0$.

Setting $l(v_i)=\beta_i$, which does not depend on $v$, we can write

$$ l = \beta_1 l_1 + \cdots + \beta_nl_n. $$

This shows that $l_1,..., l_n$ spans $V'$. Hence $l_1,...,l_n$ is a basis for $V'$, called the dual basis. $\blacksquare$

An immediate consequence of Lemma 1 is that if $V$ is finite-dimensional, then $V' \cong V$ via the isomorphism $T(v_i) = l_i$. Note that this isomorphism depends on the choice of basis.

So why building on top of these abstraction yet another layer? It turns out that this makes expressing a linear functional $l(v)$ extremely easy! Let $v_1,..., v_n$ be a basis for $V$ and $l_1,..., l_n$ be a basis for $V'$, then we can naturally identify an element by its coefficients in the basis expansion. Suppose $v = \sum_{i=1}^n a_i v_i$ and $l = \sum_{j=1} b_j l_j$, then

$$ \begin{aligned} l(v) &= \sum_{i=1}^n b_il_i (\sum_{j=1}^n a_jv_j) \\ &= a_1b_1 + a_2b_2 + \cdots + a_n b_n. \end{aligned} $$

This follows since all the cross terms vanish as $l_i(v_j)=0$ for $i\neq j$. This is exactly the dot product we learned in elementary linear algebra. Because of this, we adopt the following notation:

$$ (l, v) := l(v). $$

This is a bilinear form in the sense that if one component is fixed, it is linear in the other component.

Let $U$ be a subspace of $V$. The annihilator of $U$ can be defined by

$$ U^\perp := \{l\in V' \st l(u)=0 \text{ for all } u\in U\}. $$

The annihilator is indeed the orthogonal complement of $U$ if $V$ is an inner product space. However, for our current treatment, we are only assuming a general vector space. From the definition , we see that $U^\perp$ is a subspace of $V'$. We can also show that $U^\perp \cong V\big/U$ by constructing an isomorphism:

$$ \begin{aligned} \phi : U^\perp &\to (V\big/U)' \\ l &\mapsto L, \end{aligned} $$

where $L(v+U) := l(v)$. From this, we can conclude that

$$ \dim U + \dim U^\perp = \dim V. \tag{1} $$

We can go a step further by studying the dual of the dual space, denoted by $V''$ via a canonical isomorphism:

$$ \begin{aligned} \phi : V &\to V'' \\ v &\mapsto l', \end{aligned} $$

where $l' : V'\to \mathbb{F}$ is a functional that maps each $l\in V'$ to $l(v)$. Using the bilinear form, this is

$$ (l', l) = (l, v). $$

Unlike the isomorphism between $V$ and $V'$, the canonical isomorphism is independent of the choice of basis.

Theorem 4   Under the canonical identification of $V$ and $V''$, we have $U=U^{\perp\perp}$.

Proof   Since $U^\perp$ is a subspace of $V'$, by definition, we have that

$$ U^{\perp\perp} = \{l' \in V'' \st (l',l)=0, \text{ for all } l \in U^\perp\}. $$

Under the canonical isomorphism, the element $l'\in V''$ is naturally identificed with an element $v\in V$, such that $(l', l) = (l, v)$. Hence, we have

$$ U^{\perp\perp} \overset{id}{=} \{v\in V \st l(v)=0, \text{ for all } l \in U^\perp \}. $$

This is precisely $U$. $\blacksquare$

Suppose $T : U\to V$ is a linear map. We define the adjoint map or transpose map of $T$ by

$$ \begin{aligned} T': V' \to U', \end{aligned} $$

where for all $l\in V'$, $T'l \in U'$ is defined by $(T'l, u) = (l, Tu)$ for all $u\in U$.

Theorem 5   $R_T^\perp = N_{T'}$

Proof   Using definition,

$$ \begin{aligned} R_T^\perp &= \{l\in V' \st (l, v)=0, \text{ for all } v\in R_T\} \\ &=\{l\in V' \st (l, Tu)=0, \text{ for all } u\in U \} \\ &=\{l\in V' \st (T'l, u)=0, \text{ for all } u\in U\} \\ &=\{l\in V' \st T'l=0\} \\ &=N_{T'} \end{aligned} $$

This proves the theorem. $\blacksquare$

Corollary 1   $R_T = N_{T'}^{\perp}$

Proof   Taking the $\perp$ operator on both sides of Theorem 5, and by Theorem 4, we have the result. $\blacksquare$

Theorem 6   $\dim R_T = \dim R_{T'}$

Proof   Since $T'$ is a linear map from $V'$ to $U'$, by the rank-nullity theorem (Theorem 3) we have

$$ \begin{aligned} \dim R_{T'} &= \dim V'-\dim N_{T'} \\ &= \dim V'- \dim R_T^\perp & [\text{Theorem } 5] \\ &= \dim V'-(\dim V-\dim R_T) & [\text{Equation } (1)] \\ &= \dim R_T. \end{aligned} $$

This proves the theorem. $\blacksquare$

Theorem 7   If $T:U\to V$ with $\dim U=\dim V$, then

$$\dim N_T = \dim N_{T'}.$$

Proof   By the rank-nullity theorem, we have

$$ \begin{aligned} \dim N_T &= \dim U - \dim R_T \\ \dim N_{T'} &= \dim V' - \dim R_{T'} \end{aligned} $$

Since $\dim R_T = \dim R_{T'}$ from Theorem 6, and $\dim U = \dim V = \dim V'$ by assumption. Hence it follows that $\dim N_T = \dim N_T'$. $\blacksquare$

If we identify a linear map $T$ by a matrix $A$, then it turns out that $T'$ corresponds to $A\tr$, $N_T$ corresponds to the null space $A$, $N_{T'}$ corresponds to the left null space of $A$, $R_T$ corresponds to the column space of $A$, $R_{T'}$ corresponds to the row space of $A$, and $R_T^\perp$ corresponds to the orthogonal complement of the column space, which is indeed the left null space from Theorem 5. It all unfolds beautifully!