Vector Spaces
1. Vector Spaces
A vector space is a set of vectors that satisfies a list of properties. The vector space \(\mathbb{R}^n\) consists of all column vectors \(\mathbf{v}\) of length \(n\). Two operations are defined on a vector space: addition and scalar multiplication. Using these operations, we can form linear combinations \(c\mathbf{u} + d\mathbf{w}\) that remain in the vector space \(\mathbb{R}^n\).
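As a minimal sketch of these two operations (the vectors and scalars are arbitrary illustrative choices), using numpy:

```python
import numpy as np

# Two vectors in R^3 and two scalars (arbitrary illustrative values).
u = np.array([1.0, 0.0, 2.0])
w = np.array([-1.0, 3.0, 1.0])
c, d = 2.0, -0.5

# Addition and scalar multiplication combine into the linear
# combination c*u + d*w, which is again a vector in R^3.
print(c * u + d * w)   # [ 2.5 -1.5  3.5]
```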
In the definition of a vector space, vector addition and scalar multiplication must obey the following eight rules:
- \(\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}\)
- \(\mathbf{x} + (\mathbf{y} + \mathbf{z}) = (\mathbf{x} + \mathbf{y}) + \mathbf{z}\)
- There is a "zero vector" such that \(\mathbf{x} + \mathbf{0} = \mathbf{x}\) for all \(\mathbf{x}\)
- For each \(\mathbf{x}\) there is a unique vector \(-\mathbf{x}\) such that \(\mathbf{x} + (-\mathbf{x}) = \mathbf{0}\)
- There exists a scalar "1" such that \(1 \cdot \mathbf{x} = \mathbf{x}\)
- \((c_1c_2) \mathbf{x} = c_1(c_2\mathbf{x})\)
- \(c(\mathbf{x} + \mathbf{y}) = c\mathbf{x} + c\mathbf{y}\)
- \((c_1 + c_2)\mathbf{x} = c_1\mathbf{x} + c_2\mathbf{x}\)
One important requirement is that every linear combination of vectors in the space stays in the space. This means that the set of all vectors with positive components is not a vector space (multiplying by a negative scalar produces a vector outside the set). Similarly, a line in \(\mathbb{R}^n\) is not a vector space unless it passes through the origin.
1.1. Subspaces
If that line does go through \(\mathbf{0}\), then we can multiply points on the line by any number \(c\) and still stay within the line. We can also add points on the line without ever leaving the line. This line through \(\mathbf{0}\) in \(\mathbb{R}^n\) is an example of a subspace: a vector space inside another vector space.
In a vector space \(V\), say we have a set of vectors \(\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_p\}\). Then the set of all linear combinations of those vectors (their span) is a subspace of \(V\).
1.2. Column Space
The column space of a matrix \(A\) is the subspace spanned by the columns of \(A\), denoted \(\text{Col }A\). In other words, it consists of all linear combinations of the columns. To solve the equation \(A\mathbf{x} = \mathbf{b}\) is to express \(\mathbf{b}\) as a combination of the columns — the system is consistent if and only if \(\mathbf{b}\) is in the column space of \(A\).
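A small numpy sketch of this criterion (the matrix and right-hand side are arbitrary illustrative values): \(\mathbf{b}\) lies in \(\text{Col }A\) exactly when appending \(\mathbf{b}\) as an extra column does not increase the rank.

```python
import numpy as np

# Arbitrary matrix and right-hand side.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

# b is in Col A iff rank([A | b]) == rank(A).
rank_A = np.linalg.matrix_rank(A)
rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
print("consistent" if rank_A == rank_Ab else "inconsistent")   # consistent
```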
1.3. Row Space
The row space is the subspace spanned by the rows of \(A\), denoted \(\text{Row }A\). Since we usually work with column vectors, we can also say that the row space of \(A\) is the column space of the transpose matrix \(A^T\).
1.4. Nullspace
The nullspace of \(A\), denoted \(\text{Nul }A\), is the space consisting of all solutions \(\mathbf{x}\) to the homogeneous system \(A\mathbf{x}=0\). To find the nullspace, we can solve \(A\mathbf{x}=0\) and write the solution in parametric vector form; the vectors in that form span \(\text{Nul }A\).
We can see that the nullspace is a subspace with the following demonstration. Say \(\mathbf{x}\) and \(\mathbf{y}\) are in the nullspace of \(A\). Then \(\mathbf{x} + \mathbf{y}\) is also in the nullspace, since:
\begin{align} A(\mathbf{x}+\mathbf{y}) &= A\mathbf{x}+A\mathbf{y} \notag \\ &= 0 + 0 \notag \\ &= 0 \end{align}
Likewise, \(A(c\mathbf{x}) = cA\mathbf{x} = 0\), so scalar multiples of \(\mathbf{x}\) stay in the nullspace as well.
Example: Nullspace as a subspace
Consider the set of all vectors \((a, b, c, d)\) that satisfy the following two equations:
\begin{cases} a - 2b = 4c \\ 2a = c + 3d \end{cases}
We want to know if this set is a subspace. To do so, notice that we can rewrite the equations as:
\begin{align} a - 2b - 4c &= 0 \notag \\ 2a - c - 3d &= 0 \notag \end{align}
This is simply a system of linear equations with \(a, b, c, d\) as the variables, which means we can write it as:
\begin{align} \begin{bmatrix} 1 & -2 & -4 & 0 \\ 2 & 0 & -1 & -3 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \notag \end{align}
Now it is clear that the set of all such vectors \((a, b, c, d)\) is the nullspace of this matrix, and therefore it is a subspace.
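As a numerical check, here is a minimal sketch assuming scipy is installed: scipy.linalg.null_space returns an orthonormal basis for \(\text{Nul }A\), confirming that the solution set above is a two-dimensional subspace of \(\mathbb{R}^4\).

```python
import numpy as np
from scipy.linalg import null_space

# The coefficient matrix from the example above.
A = np.array([[1.0, -2.0, -4.0,  0.0],
              [2.0,  0.0, -1.0, -3.0]])

# Orthonormal basis for Nul A: two columns, so the set of
# solutions (a, b, c, d) is a 2-dimensional subspace of R^4.
N = null_space(A)
print(N.shape)                 # (4, 2)
print(np.allclose(A @ N, 0))   # True: every basis vector solves Ax = 0
```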
2. Basis
For a vector space/subspace, a basis is a set of linearly independent vectors that span the space. Every vector in the space is a unique combination of the basis vectors.
To see why there is only one way to write a vector as a combination of basis vectors, suppose there were two different ways to form the linear combination: \(\mathbf{v} = a_1\mathbf{v}_1 + \cdots + a_n\mathbf{v}_n\) and also \(\mathbf{v} = b_1\mathbf{v}_1 + \cdots + b_n\mathbf{v}_n\). By subtraction, \(0 = (a_1-b_1)\mathbf{v}_1 + \cdots + (a_n-b_n)\mathbf{v}_n\). Since the basis vectors are linearly independent, the only combination of them equal to \(\mathbf{0}\) is the one with all coefficients zero, so \(a_i=b_i\) for every \(i\), and hence there is only one way to form \(\mathbf{v}\).
However, there are many different choices of basis (in fact, if \(n\) vectors form a basis, then any \(n\) linearly independent vectors in the vector space also form a basis). Critically, though, the number of basis vectors does not change.
2.1. Nullspace
To find a basis for \(\text{Nul }A\), row reduce \(A\) to REF/RREF to locate the free variables, solve \(A\mathbf{x} = 0\), and read the basis off the parametric vector form of the solution: one basis vector per free variable.
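As a sketch of this recipe (the matrix is an arbitrary example), sympy's Matrix.nullspace() performs exactly these steps, returning one basis vector per free variable:

```python
from sympy import Matrix

# Arbitrary example; pivots in columns 1 and 3, free variables x2 and x4.
A = Matrix([[1, 2, 0, 1],
            [0, 0, 1, 2]])

# nullspace() row reduces, solves Ax = 0, and returns the vectors
# of the parametric vector form (one per free variable).
for v in A.nullspace():
    print(v.T)   # Matrix([[-2, 1, 0, 0]]), Matrix([[-1, 0, -2, 1]])
```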
2.2. Column Space
A basis for \(\text{Col }A\) is given by the pivot columns of \(A\). To see why, consider the RREF of the matrix: there, each pivot column has a \(1\) in a row where every other pivot column has a \(0\), so no pivot column is a linear combination of the other pivot columns, and the pivot columns themselves are linearly independent.
Now, take a non-pivot column in RREF:
\begin{align} \begin{bmatrix} a_1 \\ \vdots \\ a_m \\ 0 \\ \vdots \\ 0 \end{bmatrix} \notag \end{align}
This non-pivot column can be written as a linear combination of the pivot columns. Since row operations preserve the linear dependence relations among columns, the same combinations hold among the columns of the original matrix, and so the pivot columns of \(A\) form a basis for the column space.
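A short sympy sketch of this fact (the matrix is an arbitrary example whose second column is twice the first): rref() reports the pivot-column indices, and those columns of the original \(A\) give a basis for \(\text{Col }A\).

```python
from sympy import Matrix

# Arbitrary example: column 2 is twice column 1, so it is not a pivot column.
A = Matrix([[1, 2, 0],
            [1, 2, 1],
            [2, 4, 1]])

# rref() returns the RREF along with the pivot column indices.
R, pivots = A.rref()
print(pivots)                       # (0, 2)

# The corresponding columns of the ORIGINAL matrix form a basis for Col A.
basis = [A[:, j] for j in pivots]
print([b.T for b in basis])
```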
2.3. Row Space
A basis for \(\text{Row }A\) is given by the nonzero (pivot) rows of a row echelon form of \(A\).
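Continuing the sketch above with the same arbitrary matrix, the nonzero rows of the RREF form a basis for \(\text{Row }A\):

```python
from sympy import Matrix

# Same arbitrary example as in the column space sketch.
A = Matrix([[1, 2, 0],
            [1, 2, 1],
            [2, 4, 1]])

# The nonzero rows of the RREF are the pivot rows; they span Row A
# and are linearly independent, so they form a basis.
R, _ = A.rref()
basis = [R[i, :] for i in range(R.rows) if any(R[i, :])]
print(basis)   # [Matrix([[1, 2, 0]]), Matrix([[0, 0, 1]])]
```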
2.4. Change of Basis
2.4.1. With Standard Basis
Consider a real vector space \(V\) with basis \(B = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}\). Any vector \(\mathbf{x}\) in \(V\) can be written as a linear combination of the basis vectors: \(\mathbf{x} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n\), where the \(c_i\) are real scalars. If we form a matrix \(P_B\) whose columns are the basis vectors, then we can rewrite this relationship as:
\begin{align} P_B [\mathbf{x}]_B = \mathbf{x} \end{align}
where \([\mathbf{x}]_B\) is the column vector \((c_1, c_2, \dots, c_n)\), which we call the coordinates of \(\mathbf{x}\) in the coordinate system of the basis \(B\).
When we write vectors, we do so in a particular coordinate system, which is defined by a basis. In other words, a vector is always written in the context of a basis, in which the vector is a particular linear combination of the vectors in that basis. Most of the time, we implicitly use the standard basis, made up of the standard vectors \(\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n\). With the standard basis, \(P_B\) is the identity matrix, so the relationship in (2) is simply the identity.
In (2), multiplying the basis matrix \(P_B\) by \([\mathbf{x}]_B\), the coordinates of the vector in that basis, yields the vector \(\mathbf{x}\) written in the standard basis.
We can also convert from the standard basis to the coordinate system of another basis by solving (2) for \([\mathbf{x}]_B\).
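A minimal numpy sketch of both directions, using an arbitrary basis of \(\mathbb{R}^2\):

```python
import numpy as np

# Arbitrary basis B for R^2, stored as the columns of P_B.
P_B = np.array([[1.0, 1.0],
                [0.0, 2.0]])
x_B = np.array([3.0, -1.0])      # coordinates of x relative to B

# Forward direction of (2): P_B [x]_B = x gives x in the standard basis.
x = P_B @ x_B
print(x)                          # [ 2. -2.]

# Backward direction: solving (2) recovers the B-coordinates.
print(np.linalg.solve(P_B, x))    # [ 3. -1.]
```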
2.4.2. With Arbitrary Basis
The key here is to recognize the assumption we are making in (2): it works because \(P_B\) is written in the standard basis, since we normally write basis vectors in standard coordinates. In order to change from a basis \(B\) to another basis \(C\), we instead have to write \(P_B\) in the basis \(C\):
\begin{align} P_{C \leftarrow B} [\mathbf{x}]_B = [\mathbf{x}]_C \notag \end{align}
where \(P_{C\leftarrow B}\) is our change-of-basis matrix, whose columns are the basis vectors of \(B\) written in the basis \(C\).
The issue now becomes how we rewrite the basis of \(B\) in the basis of \(C\). If \(P_C\) is the matrix formed by the vectors in the basis of \(C\) written in the standard basis, then by (2) we can solve for \([\mathbf{b}_i]_C\) like so:
\begin{align} P_C [\mathbf{b}_i]_C = \mathbf{b}_i \notag \end{align}
Collecting all of the \([\mathbf{b}_i]_C\) as the columns of one matrix gives precisely the basis vectors of \(B\) written in the basis \(C\), which is \(P_{C\leftarrow B}\). Therefore, to find this matrix, we can solve the equation above for every \(i\) at once by reducing the augmented matrix:
\begin{align} \boxed{[ P_C \; | \; P_B ] \sim [ I \; | \; P_{C\leftarrow B} ]} \end{align}
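Here is a numpy sketch with arbitrary bases \(B\) and \(C\) of \(\mathbb{R}^2\). Row reducing \([P_C \mid P_B]\) to \([I \mid P_{C\leftarrow B}]\) is the same as solving \(P_C X = P_B\), so a linear solver does the job directly:

```python
import numpy as np

# Arbitrary bases B and C for R^2, stored column-wise.
P_B = np.array([[1.0, 0.0],
                [1.0, 1.0]])
P_C = np.array([[2.0, 0.0],
                [0.0, 1.0]])

# Reducing [P_C | P_B] to [I | P_{C<-B}] solves P_C X = P_B.
P_CB = np.linalg.solve(P_C, P_B)
print(P_CB)                       # [[0.5 0. ] [1.  1. ]]

# Sanity check: going through C-coordinates agrees with going
# through B-coordinates for an arbitrary coordinate vector x_B.
x_B = np.array([1.0, 2.0])
print(np.allclose(P_C @ (P_CB @ x_B), P_B @ x_B))   # True
```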
2.5. Isomorphism
In general, a one-to-one linear transformation from a vector space \(V\) onto \(W\) is called an isomorphism from \(V\) onto \(W\). In particular, any real vector space with a basis of \(n\) vectors is isomorphic to \(\mathbb{R}^n\).
3. Dimension
Define the dimension, \(\text{dim }V\), of a vector space \(V\) to be the number of elements in a basis for \(V\). If the number of elements is finite, then \(V\) is finite dimensional; otherwise, it is infinite dimensional. If \(\text{dim }V = n\), then any set of \(n\) linearly independent vectors in \(V\) is a basis.
The dimensions of each of the important subspaces are as follows:
- \(\text{dim}(\text{Col }A)\): the number of pivots in \(A\)
- \(\text{dim}(\text{Nul }A)\): the number of free variables/non-pivots in \(A\)
- \(\text{dim}(\text{Row }A)\): the number of pivots in \(A\)
The rank of a matrix \(A\) is the dimension of its column space (equivalently, of its row space). The nullity of \(A\) is the dimension of its nullspace.
3.1. Rank-Nullity Theorem
If \(A\) is a real \(m\times n\) matrix, we have:
\begin{align} \boxed{\text{rank }A + \text{nullity }A = n} \end{align}
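A quick numerical check of the theorem, assuming numpy and scipy are available (the \(3 \times 4\) matrix is an arbitrary rank-deficient example):

```python
import numpy as np
from scipy.linalg import null_space

# Arbitrary 3x4 example (n = 4); the third row is the sum of the first two.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 2.0],
              [1.0, 2.0, 1.0, 3.0]])

rank = np.linalg.matrix_rank(A)        # dim(Col A) = 2
nullity = null_space(A).shape[1]       # dim(Nul A) = 2
print(rank + nullity)                  # 4 = n
```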
We can use this to derive the following important fact, for an \(m \times n\) matrix \(A\):
\begin{align} \text{rank} (A^TA) = \text{rank }A \end{align}
To prove this, we know that:
\begin{align} \text{rank }A &= n - \text{dim}(\text{Nul }A) \notag \\ \text{rank}(A^TA) &= n - \text{dim}(\text{Nul}(A^TA)) \notag \end{align}
Thus, it is sufficient to show that \(\text{Nul }A = \text{Nul}(A^TA)\). First, if \(A\mathbf{x} = 0\), then clearly \(A^TA\mathbf{x} = 0\). Next, we want to show that if \(A\mathbf{x} \neq 0\), then \(A^TA\mathbf{x} \neq 0\). To do this, set \(\mathbf{v} = A^TA\mathbf{x}\), and multiply both sides on the left by \(\mathbf{x}^T\):
\begin{align} \mathbf{x}^T\mathbf{v} &= \mathbf{x}^TA^TA\mathbf{x} \notag \\ &= (A\mathbf{x})^T(A\mathbf{x}) \notag \\ &= \|A\mathbf{x}\|^2 \notag \end{align}
Since the right-hand side cannot be zero if \(A\mathbf{x} \neq 0\), \(\mathbf{v}\) is nonzero, so \(A^TA\mathbf{x} \neq 0\). Thus, we have shown (5) to be true.
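Finally, a numerical spot-check of (5), assuming numpy (a random rank-deficient matrix built as a product of thin factors):

```python
import numpy as np

rng = np.random.default_rng(0)
# A 5x4 matrix of rank 2: the product of 5x2 and 2x4 random factors.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

print(np.linalg.matrix_rank(A))         # 2
print(np.linalg.matrix_rank(A.T @ A))   # 2, agreeing with (5)
```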