Matrix Similarity
1. Matrix Similarity
We call two \(n \times n\) matrices \(A\) and \(B\) similar if there exists an invertible matrix \(P\) such that:
\begin{align} A = PBP^{-1} \end{align}Two similar matrices always have the same characteristic polynomial, and therefore the same eigenvalues. This can be seen as follows:
\begin{align} A &= PBP^{-1} \notag \\ A - \lambda I &= PBP^{-1} - \lambda I \notag \\ A - \lambda I &= PBP^{-1} - \lambda P P^{-1} \notag \\ A - \lambda I &= P(B - \lambda I)P^{-1} \notag \end{align}Taking the determinant of both sides and simplifying, we get:
\begin{align} \det (A - \lambda I) &= \det (P(B-\lambda I)P^{-1}) \notag \\ \det (A - \lambda I) &= \det (P) \det (B - \lambda I) \det (P^{-1}) \notag \\ \det (A - \lambda I) &= \det (PP^{-1}) \det (B - \lambda I) \notag \\ \det (A - \lambda I) &= \det (B - \lambda I) \notag \end{align}Thus, \(A\) and \(B\) must have the same characteristic polynomial. Note that the converse does not hold: two matrices with the same characteristic polynomial are not guaranteed to be similar. For example, the two matrices \(\begin{bmatrix} 1 & 0 \\ 0 & 1\end{bmatrix}\) and \(\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\) have the same eigenvalues, but they are not similar: the identity matrix is similar only to itself, since \(PIP^{-1} = I\) for any invertible \(P\).
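As a quick numerical check, here is a minimal sketch using NumPy (the matrices \(B\) and \(P\) are arbitrary choices for illustration, not from the text): construct \(A = PBP^{-1}\) and compare eigenvalues.

```python
import numpy as np

# An arbitrary matrix B and an invertible change-of-basis matrix P.
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# A = P B P^{-1} is similar to B by construction.
A = P @ B @ np.linalg.inv(P)

# Similar matrices share eigenvalues (sorted here to compare as multisets).
print(np.sort(np.linalg.eigvals(A)))  # [2. 3.]
print(np.sort(np.linalg.eigvals(B)))  # [2. 3.]
```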
1.1. Matrix Powers
Similar matrices can be helpful in computing powers of matrices. Consider the \(k\)-th power of a matrix \(A = PBP^{-1}\):
\begin{align} A^k &= (PBP^{-1})^k \notag \\ &= (PBP^{-1})(PBP^{-1}) \cdots (PBP^{-1}) \notag \\ &= PB(P^{-1}P)B(P^{-1}P) \cdots BP^{-1} \notag \\ &= PB^kP^{-1} \notag \end{align}Each interior \(P^{-1}P\) cancels to the identity, leaving only the outermost \(P\) and \(P^{-1}\).
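As a sketch (again NumPy, with illustrative matrices), computing \(A^k\) through \(PB^kP^{-1}\) matches direct repeated multiplication, and is especially cheap when \(B\) is diagonal:

```python
import numpy as np

B = np.diag([2.0, 3.0])        # powers of a diagonal matrix are elementwise
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])
A = P @ B @ np.linalg.inv(P)

k = 10
# A^k via the similar matrix: only B is raised to the k-th power.
A_k_similar = P @ np.linalg.matrix_power(B, k) @ np.linalg.inv(P)
A_k_direct = np.linalg.matrix_power(A, k)

print(np.allclose(A_k_similar, A_k_direct))  # True
```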
2. Diagonalization
We call a matrix diagonalizable if it is similar to a diagonal matrix. To understand when a matrix \(A\) is diagonalizable, consider the matrix \(X\) whose columns are eigenvectors of \(A\), and the diagonal matrix \(\Lambda\) with the corresponding eigenvalues of \(A\) as its diagonal entries. Then:
\begin{align} AX = \begin{bmatrix} \\ A\mathbf{x}_1 & \cdots & A\mathbf{x}_n \\ \: \end{bmatrix} = \begin{bmatrix} \\ \lambda_1 \mathbf{x}_1 & \cdots & \lambda_n \mathbf{x}_n \\ \: \end{bmatrix} = \begin{bmatrix} \\ \mathbf{x}_1 & \cdots & \mathbf{x}_n \\ \: \end{bmatrix} \begin{bmatrix} \lambda_1 \\ & \ddots \\ & & \lambda_n \end{bmatrix} = X \Lambda \notag \end{align}Thus, if \(AX = X\Lambda\) and \(X\) is invertible, then a diagonalization of \(A\) is:
\begin{align} \boxed{A = X \Lambda X^{-1}} \end{align}Since \(X\) is formed from the eigenvectors of \(A\) and must be invertible, this diagonalization exists exactly when the \(n \times n\) matrix \(A\) has \(n\) linearly independent eigenvectors.
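As a minimal NumPy sketch (the example matrix is an arbitrary choice with distinct eigenvalues), np.linalg.eig returns the eigenvalues and a matrix whose columns are eigenvectors, from which the diagonalization can be reconstructed:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Columns of X are eigenvectors of A; lam holds the eigenvalues.
lam, X = np.linalg.eig(A)
Lambda = np.diag(lam)

# With n independent eigenvectors, X is invertible and A = X Lambda X^{-1}.
print(np.allclose(A, X @ Lambda @ np.linalg.inv(X)))  # True
```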
2.1. Multiplicity
For an \(n \times n\) real matrix \(A\), the eigenvalues of \(A\) are the roots of the characteristic polynomial \(\det (A-\lambda I)\). Some of these roots may be repeated. For a repeated root, we can define two different ways to count its multiplicity:
- Algebraic multiplicity (AM) is the number of times the eigenvalue is repeated as a root. This can be found from the power of the corresponding factor in the characteristic polynomial.
- Geometric multiplicity (GM) is the number of linearly independent eigenvectors for a particular eigenvalue. This can be found from the dimension of the nullspace of \(A - \lambda I\).
In general, we always have \(\text{AM} \geq \text{GM} \geq 1\). Thus, an \(n \times n\) matrix is diagonalizable if the algebraic multiplicity of each eigenvalue matches its geometric multiplicity: the algebraic multiplicities must total \(n\) (since \(\det (A - \lambda I)\) has degree \(n\)), so the geometric multiplicities also total \(n\), which satisfies the condition that there must be \(n\) linearly independent eigenvectors for a diagonalization.
So, when an \(n \times n\) real matrix \(A\) has \(n\) distinct real eigenvalues, it is always diagonalizable: each eigenvalue has algebraic, and hence geometric, multiplicity \(1\).
Example: Multiplicity
We want to know if the matrix \(A = \begin{bmatrix} 2 & 2 & -1 \\ 1 & 3 & -1 \\ -1 & -2 & 2 \end{bmatrix}\) is diagonalizable, given that its eigenvalues are \(\lambda = 1, 5\).
Firstly, we know that one of the eigenvalues must have algebraic multiplicity greater than one, since there are only two distinct eigenvalues but the characteristic polynomial has degree \(3\). For \(\lambda = 1\), reducing \(A - \lambda I\) gives:
\begin{align} A - 1 \cdot I = \begin{bmatrix} 1 & 2 & - 1 \\ 1 & 2 & -1 \\ -1 & -2 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \notag \end{align}Thus, for \(\lambda = 1\), the geometric multiplicity is \(2\), since the dimension of the nullspace is \(2\). This forces the geometric multiplicity of \(\lambda = 5\) to be \(1\), since the geometric multiplicity must be at least one and cannot exceed the algebraic multiplicity. Then, we see that the total geometric multiplicity is \(3\), which means that there are \(3\) linearly independent eigenvectors, so this matrix is diagonalizable.
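We can check this numerically; here is a minimal NumPy sketch using the matrix from the example, computing each geometric multiplicity as \(n\) minus the rank of \(A - \lambda I\):

```python
import numpy as np

A = np.array([[ 2.0,  2.0, -1.0],
              [ 1.0,  3.0, -1.0],
              [-1.0, -2.0,  2.0]])
n = A.shape[0]

# Geometric multiplicity = dim null(A - lambda*I) = n - rank(A - lambda*I).
for lam in (1.0, 5.0):
    gm = n - np.linalg.matrix_rank(A - lam * np.eye(n))
    print(lam, gm)  # 1.0 -> 2 and 5.0 -> 1: total 3, so A is diagonalizable
```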
2.2. Basis of a Linear Transformation
When a linear transformation is expressed in the standard basis, applying it to the standard basis vectors gives the columns of its matrix. Similarly, if we want to represent a linear transformation \(T\) in a different basis \(\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_n\}\), the matrix of this linear transformation in the basis \(\mathcal{B}\) (called the B-matrix) would be:
\begin{align} \begin{bmatrix} [T(\mathbf{b}_1)]_B & [T(\mathbf{b}_2)]_B & \cdots & [T(\mathbf{b}_n)]_B \end{bmatrix} \notag \end{align}Now suppose that an \(n \times n\) matrix \(A\) is diagonalizable over \(\mathbb{R}\), i.e. \(A = X\Lambda X^{-1}\). Take the columns of \(X\) as the basis \(\mathcal{B}\) for the linear transformation \(T(\mathbf{x}) = A\mathbf{x}\). Since \(X\) is the change-of-basis matrix for \(\mathcal{B}\), coordinates in this basis are given by \([\mathbf{v}]_B = X^{-1}\mathbf{v}\), so the B-matrix is:
\begin{align} \begin{bmatrix} [T(\mathbf{b}_1)]_B & [T(\mathbf{b}_2)]_B & \cdots & [T(\mathbf{b}_n)]_B \end{bmatrix} &= \begin{bmatrix} X^{-1}A\mathbf{b}_1 & X^{-1}A\mathbf{b}_2 & \cdots & X^{-1}A\mathbf{b}_n \end{bmatrix} \notag \\ &= X^{-1}A \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix} \notag \\ &= X^{-1} A X \notag \\ &= \Lambda \notag \end{align}Thus, for a diagonalizable matrix \(A = X\Lambda X^{-1}\), if \(\mathcal{B}\) is the basis formed from the columns of \(X\), then \(\Lambda\) is the B-matrix for the transformation \(T(\mathbf{x}) = A\mathbf{x}\).
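A short numerical check (a NumPy sketch, reusing the arbitrary diagonalizable matrix from the earlier example): with \(\mathcal{B}\) taken as the columns of \(X\), the B-matrix \(X^{-1}AX\) comes out diagonal, with the eigenvalues on the diagonal.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lam, X = np.linalg.eig(A)

# B-matrix of T(x) = A x in the eigenvector basis: X^{-1} A X.
B_matrix = np.linalg.inv(X) @ A @ X

# Diagonal up to floating-point error, eigenvalues on the diagonal.
print(np.round(B_matrix, 10))
print(lam)
```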