Eigendecomposition
Key Points
- Eigendecomposition expresses a matrix as a change of basis times a diagonal scaling, revealing its natural stretching directions.
- Eigenvalues are the stretch factors and eigenvectors are the directions that do not bend under the matrix transformation.
- A matrix is diagonalizable if it has enough linearly independent eigenvectors; real symmetric matrices are always diagonalizable with orthonormal eigenvectors.
- The characteristic polynomial det(A − λI) = 0 gives the eigenvalues, but numerically we compute them with iterative algorithms like power iteration and the QR algorithm.
- The Rayleigh quotient x^{T}Ax / x^{T}x estimates the eigenvalue associated with a nonzero vector x and is exact when x is an eigenvector.
- For large sparse problems, iterative methods (power/inverse iteration, Lanczos) are preferred over dense factorizations.
- The trace equals the sum of the eigenvalues and the determinant equals their product, linking spectral properties to basic matrix invariants.
- Numerical stability matters: use orthogonal transformations and normalization to control rounding errors and convergence.
Prerequisites
- Matrix and vector operations — You must understand multiplication, transposes, and basic indexing to follow eigenvalue algorithms.
- Determinants and trace — Eigenvalues are roots of det(A − λI), and trace/determinant relate to sums/products of eigenvalues.
- Vector norms and inner products — Normalization, Rayleigh quotient, and convergence checks rely on Euclidean norms and dot products.
- Similarity transforms — Diagonalization and QR iterations preserve eigenvalues via similarity.
- Numerical stability and conditioning — Practical eigensolvers require stable orthogonal transformations and careful normalization.
- Orthogonality and projections — QR factorization and spectral theorems depend on orthonormal bases and projections.
- Polynomial roots (quadratic formula) — Useful for closed-form eigenvalues of 2×2 matrices and theoretical background via characteristic polynomials.
- Asymptotic complexity — To choose the right algorithm, you need to compare O(n^2), O(n^3), and sparse costs.
Detailed Explanation
01 Overview
Eigendecomposition is a way to understand a square matrix by uncovering the special directions it scales without rotating them. If a matrix A acts on a vector v and the result is simply a scaled version of v, then v is an eigenvector and the scaling factor is its eigenvalue. When a matrix has enough independent eigenvectors, we can stack them as columns of a matrix P and write A = P D P^{-1}, where D is diagonal with the eigenvalues. This diagonal form is much easier to work with: powers of A become powers of D, differential equations decouple, and quadratic forms simplify. For real symmetric matrices (or complex Hermitian), eigendecomposition is guaranteed: A = Q Λ Q^{T} (or Q^{*} in complex), with orthonormal eigenvectors and real eigenvalues. In practice, we rarely compute eigenvalues by solving polynomials beyond tiny matrices because the characteristic equations become unstable numerically. Instead, we use iterative numerical linear algebra algorithms such as power iteration (to get the dominant eigenpair) and the QR algorithm (to get all eigenvalues), often preceded by reductions that preserve eigenvalues but simplify the structure (e.g., tridiagonalization for symmetric matrices). Understanding eigendecomposition opens doors to principal component analysis (PCA), graph spectral clustering, vibration modes, PageRank, and many other applications.
02 Intuition & Analogies
Imagine a stretchy fabric printed with a grid. When you pull it in a particular way (apply a matrix A), most grid lines bend and skew. However, there may be a few lucky directions where lines do not bend at all—they only stretch or shrink. Those special directions are eigenvectors, and the amount of stretch is the eigenvalue. Think of a rubber band looped around a peg board: if you shear the board, most pegs move in a complicated way, but along certain axes the change looks like a pure scaling. Another analogy: consider musical instruments. A guitar string vibrates in specific modes: fundamental and harmonics. Each mode has a shape (the eigenvector) and a pitch (the eigenvalue, related to frequency). When you pluck the string, you excite a combination of these modes, but the natural modes define how the system behaves. Similarly, a matrix acting on vectors can be decomposed into independent modes that evolve separately when you repeatedly apply the matrix or solve related dynamical systems. For web ranking like PageRank, you repeatedly multiply by a transition matrix. Eventually, the vector points in a steady direction that doesn’t change except for scale—the dominant eigenvector. In image compression (PCA), you look for directions of greatest variance; those turn out to be eigenvectors of the covariance matrix. In short, eigenvectors are the matrix’s “favorite” directions, and eigenvalues tell you how strongly the matrix favors them.
03 Formal Definition
Let A be an n×n real or complex matrix. A nonzero vector v is an eigenvector of A with eigenvalue λ if A v = λ v; the pair (λ, v) is an eigenpair, and the set of eigenvalues is the spectrum of A. The eigenvalues are the roots of the characteristic polynomial det(A − λI) = 0. If A has n linearly independent eigenvectors v_1, …, v_n with eigenvalues λ_1, …, λ_n, then A is diagonalizable: A = P D P^{-1}, where P has the eigenvectors as columns and D = diag(λ_1, …, λ_n). By the spectral theorem, every real symmetric matrix admits A = Q Λ Q^{T} with Q orthogonal and Λ real diagonal.
04 When to Use
Use eigendecomposition when you need to:
- Simplify repeated applications of a linear transformation, such as computing A^{k}x or e^{At}x in dynamical systems; diagonalization turns these into per-coordinate scalings.
- Analyze stability: the signs and magnitudes of eigenvalues determine whether trajectories grow, decay, or oscillate in linear systems and control theory.
- Perform dimensionality reduction and data analysis (PCA): eigenvectors of the covariance matrix give principal components and eigenvalues give explained variance.
- Understand graph structure: eigenvalues of Laplacian or adjacency matrices reveal connectivity, expansion, and community structure; the Fiedler vector is the second-smallest eigenvector of the Laplacian.
- Solve symmetric positive definite problems efficiently: eigendecomposition helps preconditioners and understanding of convergence in iterative solvers.
- Compute quadratic forms and optimize Rayleigh quotients: the largest eigenvalue maximizes x^{T}Ax subject to |x|=1 for symmetric A.
- Build modal analyses in physics/engineering: natural frequencies and modes of vibration are eigenpairs of stiffness/mass matrices.

If the matrix is large and sparse, prefer iterative methods (power iteration, Lanczos) that access A only via matrix–vector products. If the matrix is small to medium and dense, QR-based methods or library routines (e.g., LAPACK) are appropriate.
⚠️ Common Mistakes
- Expecting every matrix to be diagonalizable: defective matrices (e.g., a single Jordan block) do not have enough eigenvectors. Check conditions: distinct eigenvalues guarantee diagonalizability; symmetry/Hermitian structure guarantees orthogonal diagonalization.
- Confusing eigenvectors with arbitrary nonzero vectors: only vectors satisfying A v = λ v qualify. Always verify by substitution, not by inspection.
- Ignoring scaling and sign: eigenvectors are defined up to nonzero scaling; for symmetric problems, the sign can flip without changing correctness. Compare directions, not raw components.
- Using the characteristic polynomial numerically: solving high-degree polynomials is unstable. Prefer QR iterations or library eigensolvers; use the characteristic polynomial only for theoretical proofs or tiny matrices (2×2, 3×3 with care).
- Applying power iteration to get non-dominant eigenvalues: vanilla power iteration converges to the eigenvector of the largest-magnitude eigenvalue. To find others, use deflation, shifts, inverse iteration, or QR/Lanczos methods.
- Failing to normalize in iterative methods: without normalization, vectors may overflow/underflow and convergence tests become meaningless. Normalize each step and use Rayleigh quotient for eigenvalue estimates.
- Mishandling multiplicities: algebraic multiplicity does not equal geometric multiplicity in general. When eigenvalues repeat, ensure you construct a basis of independent eigenvectors or use Schur/Jordan forms if needed.
Key Formulas
Eigenvalue Equation
A v = λ v
Explanation: An eigenvector v is scaled by A without changing direction; the scale factor λ is the eigenvalue. This is the core definition of eigenpairs.
Characteristic Polynomial
det(A − λI) = 0
Explanation: Eigenvalues are precisely the roots of the characteristic polynomial. It is useful theoretically but avoided numerically for large n.
Diagonalization
A = P D P^{-1}
Explanation: If A has n independent eigenvectors, you can collect them in P so that A becomes diagonal in that basis. Computations like A^{k} simplify to P D^{k} P^{-1}.
Spectral Theorem (Real Symmetric)
A = Q Λ Q^{T}
Explanation: A real symmetric matrix can be written with orthonormal eigenvectors in Q and real eigenvalues in Λ. Orthogonality provides numerical stability and geometric clarity.
Trace and Determinant
tr(A) = λ_1 + ⋯ + λ_n, det(A) = λ_1 λ_2 ⋯ λ_n
Explanation: The trace equals the sum of eigenvalues and the determinant equals their product, both counting multiplicities. These identities help sanity-check computations.
Rayleigh Quotient
R(x) = x^{T} A x / x^{T} x
Explanation: For symmetric A, R(x) lies between the smallest and largest eigenvalues. It equals an eigenvalue when x is the corresponding eigenvector.
Power Iteration Convergence
error after k steps ∝ |λ_2 / λ_1|^{k}
Explanation: Repeatedly applying A and normalizing converges to the dominant eigenvector v_1. The error decays roughly as the ratio of the second to first eigenvalue magnitudes raised to the power k.
Unshifted QR Iteration
A_k = Q_k R_k,  A_{k+1} = R_k Q_k
Explanation: Each step factors A_k into Q_k and R_k, then multiplies the factors in reverse order. For many matrices (especially symmetric ones), the iterates converge to a diagonal matrix with the eigenvalues on the diagonal.
Gershgorin Circle Theorem
|λ − a_{ii}| ≤ Σ_{j≠i} |a_{ij}| for at least one row i
Explanation: Every eigenvalue of A lies in at least one Gershgorin disk. It provides fast bounds for locating eigenvalues.
Spectral Radius
ρ(A) = max_i |λ_i|
Explanation: The spectral radius is the largest magnitude of the eigenvalues. It governs long-term behavior of repeated multiplication by A.
Complexity Analysis
- Dense QR algorithm (all eigenvalues): O(n^3) overall, after an O(n^3) reduction to Hessenberg form (tridiagonal for symmetric matrices).
- Power iteration: O(n^2) per iteration for a dense matrix, or proportional to the number of nonzeros per iteration for a sparse one; the iteration count depends on the gap |λ_2/λ_1|.
- Inverse iteration with a fixed shift: one O(n^3) factorization of A − σI up front, then O(n^2) per iteration.
- Lanczos (symmetric sparse): one matrix–vector product per iteration, well suited to finding a few extreme eigenvalues of large sparse matrices.
Code Examples
#include <iostream>
#include <vector>
#include <cmath>
#include <iomanip>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Multiply matrix A (n x n) by vector x (n)
Vector matVecMul(const Matrix &A, const Vector &x) {
    int n = (int)A.size();
    Vector y(n, 0.0);
    for (int i = 0; i < n; ++i) {
        double sum = 0.0;
        for (int j = 0; j < n; ++j) sum += A[i][j] * x[j];
        y[i] = sum;
    }
    return y;
}

// Dot product
double dot(const Vector &a, const Vector &b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Euclidean norm
double norm2(const Vector &x) {
    return std::sqrt(dot(x, x));
}

// Normalize vector in place; returns its original norm
double normalize(Vector &x) {
    double n = norm2(x);
    if (n > 0) {
        for (double &xi : x) xi /= n;
    }
    return n;
}

// Power iteration to approximate the dominant eigenpair of A
// (assumes A has a unique eigenvalue of largest magnitude).
// Returns (lambda, v) with ||v|| = 1.
std::pair<double, Vector> powerIteration(const Matrix &A, int maxIters = 1000, double tol = 1e-9) {
    int n = (int)A.size();
    Vector v(n, 1.0); // initial guess (all-ones)
    normalize(v);

    double lambda_old = 0.0;
    for (int iter = 0; iter < maxIters; ++iter) {
        Vector w = matVecMul(A, v);    // one matrix-vector product per step
        double lambda = dot(v, w);     // Rayleigh quotient v^T A v (||v|| = 1);
                                       // a robust eigenvalue estimate for symmetric A

        // Residual ||Av - lambda v|| measures eigenpair quality
        Vector r = w;
        for (int i = 0; i < n; ++i) r[i] -= lambda * v[i];
        double res = norm2(r);

        normalize(w);                  // prevents overflow/underflow
        v = w;                         // next iterate direction

        // Convergence test: change in eigenvalue and residual size
        if (std::fabs(lambda - lambda_old) < tol && res < std::sqrt(tol)) {
            return {lambda, v};
        }
        lambda_old = lambda;
    }

    // Final estimate after maxIters
    double lambda = dot(v, matVecMul(A, v));
    return {lambda, v};
}

int main() {
    // Example: symmetric matrix with a well-separated dominant eigenvalue
    Matrix A = {
        {4.0, 1.0, 0.0},
        {1.0, 3.0, 1.0},
        {0.0, 1.0, 2.0}
    };

    auto [lambda, v] = powerIteration(A, 2000, 1e-10);

    std::cout << std::fixed << std::setprecision(8);
    std::cout << "Approx dominant eigenvalue: " << lambda << "\n";
    std::cout << "Approx eigenvector (unit): [";
    for (size_t i = 0; i < v.size(); ++i) {
        std::cout << v[i] << (i + 1 == v.size() ? "]\n" : ", ");
    }

    // Residual check ||Av - lambda v||
    Vector r = matVecMul(A, v);
    for (size_t i = 0; i < v.size(); ++i) r[i] -= lambda * v[i];
    std::cout << "Residual norm ||Av - lambda v||: " << norm2(r) << "\n";
    return 0;
}
This program implements power iteration to find the dominant (largest-magnitude) eigenvalue and its eigenvector. Each iteration multiplies by A, normalizes to control scaling, and uses the Rayleigh quotient to estimate the eigenvalue. It stops when both the eigenvalue change and the residual ||Av − λv|| are small.
#include <iostream>
#include <vector>
#include <cmath>
#include <iomanip>

using Matrix = std::vector<std::vector<double>>;

Matrix matMul(const Matrix &A, const Matrix &B) {
    int n = (int)A.size();
    Matrix C(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i)
        for (int k = 0; k < n; ++k) {
            double a = A[i][k];
            for (int j = 0; j < n; ++j)
                C[i][j] += a * B[k][j];
        }
    return C;
}

// Classical Gram-Schmidt QR (educational; Householder is more stable in practice)
void qrDecompose(const Matrix &A, Matrix &Q, Matrix &R) {
    int n = (int)A.size();
    Q.assign(n, std::vector<double>(n, 0.0));
    R.assign(n, std::vector<double>(n, 0.0));

    for (int j = 0; j < n; ++j) {
        // v starts as A's j-th column
        std::vector<double> v(n);
        for (int i = 0; i < n; ++i) v[i] = A[i][j];

        // Subtract projections onto the previous columns q_i
        for (int i = 0; i < j; ++i) {
            double rij = 0.0;
            for (int r = 0; r < n; ++r) rij += Q[r][i] * A[r][j];
            R[i][j] = rij;
            for (int r = 0; r < n; ++r) v[r] -= rij * Q[r][i];
        }

        // r_jj = ||v||
        double rjj = 0.0;
        for (int r = 0; r < n; ++r) rjj += v[r] * v[r];
        rjj = std::sqrt(rjj);
        R[j][j] = rjj;

        if (rjj == 0.0) {
            // Dependent column; insert a placeholder unit vector
            // (rare for symmetric nonsingular matrices)
            for (int r = 0; r < n; ++r) Q[r][j] = 0.0;
            Q[0][j] = 1.0;
        } else {
            for (int r = 0; r < n; ++r) Q[r][j] = v[r] / rjj;
        }
    }
}

// Unshifted QR iteration for symmetric A.
// Returns diagonal entries as eigenvalue estimates.
std::vector<double> qrEigenvaluesSymmetric(Matrix A, int maxIters = 200, double tol = 1e-10) {
    int n = (int)A.size();
    for (int it = 0; it < maxIters; ++it) {
        // Force symmetry to limit drift due to roundoff (educational trick)
        for (int i = 0; i < n; ++i)
            for (int j = i + 1; j < n; ++j) {
                double s = 0.5 * (A[i][j] + A[j][i]);
                A[i][j] = A[j][i] = s;
            }

        Matrix Q, R;
        qrDecompose(A, Q, R);
        A = matMul(R, Q);

        // Check off-diagonal norm for convergence
        double off = 0.0;
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (i != j) off += A[i][j] * A[i][j];
        if (std::sqrt(off) < tol) break;
    }
    std::vector<double> evals(n);
    for (int i = 0; i < n; ++i) evals[i] = A[i][i];
    return evals;
}

int main() {
    Matrix A = {
        {6.0, 2.0, 1.0},
        {2.0, 3.0, 1.0},
        {1.0, 1.0, 1.0}
    }; // symmetric

    std::vector<double> evals = qrEigenvaluesSymmetric(A, 300, 1e-9);

    std::cout << std::fixed << std::setprecision(10);
    std::cout << "Estimated eigenvalues (QR):\n";
    for (double l : evals) std::cout << l << "\n";
    return 0;
}
This program computes all eigenvalues of a real symmetric matrix using unshifted QR iterations. Each step factors A = Q R (via classical Gram–Schmidt) and then sets A ← R Q. For symmetric matrices, the iterates tend toward a diagonal matrix whose diagonal entries are the eigenvalues. For production use, prefer Householder QR with shifts and tridiagonal reduction for stability and speed.
#include <iostream>
#include <vector>
#include <cmath>
#include <iomanip>

struct Eig2x2 {
    double lambda1, lambda2;    // eigenvalues
    std::vector<double> v1, v2; // corresponding eigenvectors (unit length)
};

// Compute eigenvalues and eigenvectors of a 2x2 real matrix [[a,b],[c,d]]
// (assumes real eigenvalues, i.e., tr^2 - 4 det >= 0 up to roundoff)
Eig2x2 eig2x2(double a, double b, double c, double d) {
    double tr = a + d;
    double det = a * d - b * c;
    double disc = tr * tr - 4.0 * det;
    if (disc < 0) disc = 0; // clamp tiny negatives from roundoff
    double s = std::sqrt(disc);

    // Numerically stable quadratic formula: compute the larger-magnitude root
    // first, then recover the other from lambda1 * lambda2 = det,
    // avoiding cancellation in tr - s.
    double lambda1 = (tr >= 0.0) ? 0.5 * (tr + s) : 0.5 * (tr - s);
    double lambda2 = (lambda1 != 0.0) ? det / lambda1 : tr - lambda1;

    auto norm = [](const std::vector<double> &v) {
        double t = 0; for (double x : v) t += x * x; return std::sqrt(t);
    };
    auto normalize = [&](std::vector<double> &v) {
        double n = norm(v); if (n > 0) for (double &x : v) x /= n;
    };

    // Eigenvector for lambda: both candidate forms below are null vectors of
    // (A - lambda I); pick the numerically larger one.
    auto eigvec = [&](double la, std::vector<double> fallback) {
        std::vector<double> u = {b, la - a};   // derived from row 1
        std::vector<double> w = {la - d, c};   // derived from row 2
        std::vector<double> v = (norm(u) >= norm(w)) ? u : w;
        if (norm(v) < 1e-14) v = fallback;     // A = la * I: any direction works
        normalize(v);
        if (v[0] < 0) { v[0] = -v[0]; v[1] = -v[1]; } // deterministic sign
        return v;
    };

    return {lambda1, lambda2, eigvec(lambda1, {1.0, 0.0}), eigvec(lambda2, {0.0, 1.0})};
}

int main() {
    double a = 2, b = 1, c = 3, d = 4; // Example matrix [[2,1],[3,4]]
    Eig2x2 E = eig2x2(a, b, c, d);

    std::cout << std::fixed << std::setprecision(8);
    std::cout << "lambda1 = " << E.lambda1 << ", v1 = [" << E.v1[0] << ", " << E.v1[1] << "]\n";
    std::cout << "lambda2 = " << E.lambda2 << ", v2 = [" << E.v2[0] << ", " << E.v2[1] << "]\n";

    // Verify Av ≈ lambda v
    auto check = [&](double la, const std::vector<double> &v) {
        std::vector<double> Av = {a * v[0] + b * v[1], c * v[0] + d * v[1]};
        std::cout << "Residual ||Av - lambda v|| = "
                  << std::sqrt((Av[0] - la * v[0]) * (Av[0] - la * v[0])
                             + (Av[1] - la * v[1]) * (Av[1] - la * v[1])) << "\n";
    };
    check(E.lambda1, E.v1);
    check(E.lambda2, E.v2);
    return 0;
}
This program computes the eigendecomposition of a 2×2 real matrix in closed form. It uses the quadratic formula for eigenvalues and constructs eigenvectors by solving (A − λI)v = 0 directly, then normalizes them. This is exact up to floating-point rounding and illustrates diagonalization when eigenvalues are distinct.