Gaussian Elimination
Key Points
- Gaussian elimination is a systematic way to solve linear systems by reducing a matrix to upper-triangular form using row swaps, scaling, and adding multiples of rows.
- After reaching upper-triangular form, back substitution quickly retrieves the solution from bottom to top.
- Partial pivoting (swapping to choose the largest available pivot) greatly improves numerical stability without changing the asymptotic cost.
- The method simultaneously powers solving Ax = b, computing determinants, finding matrix rank, and inverting matrices (via Gauss–Jordan).
- The arithmetic cost grows roughly as (2/3)n^3 operations for an n×n system, which is practical for moderate n.
- Determinants are the signed product of the pivots, and rank equals the number of nonzero pivots in row-echelon form.
- If a pivot is zero (or tiny), the system may be singular or ill-conditioned; pivoting helps, but sometimes no unique solution exists.
- Gaussian elimination is equivalent to LU factorization with row permutations: PA = LU.
Prerequisites
- Systems of linear equations — Understanding what Ax = b represents and how equations relate to matrix rows is essential background for elimination.
- Basic matrix operations — Knowledge of row/column indexing, matrix–vector multiplication, and triangular matrices helps follow the algorithm steps.
- Floating-point arithmetic — Round-off, underflow/overflow, and tolerance checks matter for numerical stability in elimination.
- Summation and asymptotic notation — Operation counts and complexity results use summations and big-O notation.
- Vector and matrix norms (informal) — Residuals and condition numbers rely on norms to discuss accuracy and sensitivity.
Detailed Explanation
01 Overview
Hook: Imagine tidying a cluttered desk: you group similar items, put the biggest things in order, and clear what you don’t need. Gaussian elimination is that tidy-up for linear equations.

Concept: We take a system Ax = b and apply three simple row operations—swap two rows, scale a row, and add a multiple of one row to another—to simplify A into an upper-triangular matrix U. In U, all entries below the main diagonal are zero, so each equation has fewer unknowns than the one above. That structure makes solving straightforward with back substitution: start from the last equation (one unknown), solve it, and substitute upward.

Example: For a 3×3 system, we pick a pivot in the first column and eliminate entries below it. We then move to column two (one row down, one column right) and repeat, and finally column three. If a pivot is zero or suspiciously small, we swap with a lower row that has a larger entry in that column (partial pivoting) to avoid dividing by tiny numbers. After these steps, A has been transformed into U, and we quickly solve for x3, x2, x1. This engine also lets us compute determinants (product of diagonal entries times the sign from swaps), matrix rank (the number of pivots), and inverses (by running elimination on [A | I]).
02 Intuition & Analogies
Hook: Think of solving equations like draining water from stacked containers. You open the lowest tap first (bottom equation with one unknown), then the level above becomes easier, and so on.

Concept: To reach that stack with a clean single-unknown bottom, we systematically knock out entries below a leading element (pivot) in each column. The allowed moves—swap, scale, and add multiples—are like reordering items, resizing, and combining groups on your desk: they don’t change the solution set, but they make the layout clearer. The goal is a staircase pattern (row-echelon form) where each lower row starts farther to the right.

Example: Suppose we have three equations with three unknowns. Pick a leading entry in the first column (the pivot). Use it to eliminate the first variable from all rows below by subtracting suitable multiples. Now the first column is clean under the pivot. Move to the next column and repeat, but only on the rows beneath the previous pivot—like tidying one shelf at a time. If the chosen pivot is tiny, the arithmetic becomes wobbly (dividing by a tiny number amplifies errors). So we swap in a stronger pivot (partial pivoting), just like moving a stable box to the bottom of a stack. By the end, the matrix is triangular, and each step in back substitution is like undoing one zipper tooth at a time—quick and controlled.
03 Formal Definition
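A compact formal statement, consistent with the row operations and the PA = LU factorization used elsewhere in this article:

```latex
% Each elementary row operation is multiplication by an invertible matrix E_i:
\[
  E_k \cdots E_2 E_1 A = U \quad \text{(upper-triangular)}.
\]
% Because every E_i is invertible, the solution set of Ax = b is unchanged.
% With partial pivoting, the row swaps collect into a permutation matrix P
% and the multipliers into a unit lower-triangular matrix L:
\[
  PA = LU, \qquad Ly = Pb, \qquad Ux = y.
\]
```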
04 When to Use
Hook: Use the right tool for the mess you have.

Concept: Gaussian elimination is the default direct method for solving small-to-moderate linear systems and for extracting structural information from a matrix. Use it when you need a deterministic, finite-step answer with predictable cost.

Example use cases: (1) Solve Ax = b for dense, moderately sized n (e.g., n ≤ a few thousands) where O(n^3) time is acceptable. (2) Compute det(A) robustly using LU/GE with pivoting rather than cofactor expansion (which is exponential). (3) Determine rank(A) and diagnose system consistency by comparing rank(A) to rank([A | b]). (4) Invert matrices via Gauss–Jordan when you truly need A^{-1} (e.g., pre/post-processing), though in practice solving Ax = b is usually better than forming A^{-1}. (5) Handle multiple right-hand sides efficiently: factor once (PA = LU) and solve Ly = Pb, Ux = y for each b.

Avoid when: matrices are huge and sparse with special structure—then specialized sparse or iterative methods (e.g., Conjugate Gradient for SPD, GMRES) are typically superior. For nearly singular or ill-conditioned matrices, prefer pivoting and possibly iterative refinement to improve accuracy.
⚠️ Common Mistakes
Hook: Most errors come from mixing up safe moves with risky arithmetic.

Concept: Common pitfalls include dividing by tiny pivots (amplifies rounding errors), forgetting to swap the right-hand side when pivoting, assuming a unique solution without checking rank, and computing determinants by naive cofactor expansion.

Example fixes: (1) Use partial pivoting—choose the largest magnitude pivot in the current column; if |pivot| < ε, treat the system as singular/ill-conditioned. (2) Always apply the same row swaps and operations to b (or to the right block when inverting) as you do to A; otherwise, you change the problem. (3) Detect singularity and solution type: compute rank(A) and rank([A | b]); if ranks differ, no solution; if equal but < n, infinite solutions. (4) Do not explicitly compute A^{-1} just to solve Ax = b; factor once and back substitute—it’s faster and more stable. (5) Mind floating-point tolerances: comparisons to zero should use a small ε and consistent scaling. (6) Beware of overwriting needed values; structure your loops so updated entries aren’t reused incorrectly. (7) Keep track of row swap parity when computing det(A); forgetting the sign yields incorrect determinants.
Key Formulas
LU Factorization with Pivoting
PA = LU
Explanation: Any nonsingular matrix A can be row-permuted (by P) so that it factors into L (unit lower-triangular) and U (upper-triangular). This models Gaussian elimination with partial pivoting.
Determinant from U
det(A) = (−1)^s · u_11 · u_22 · ⋯ · u_nn
Explanation: After elimination to U with s row swaps, the determinant of A equals the product of U’s diagonal times the sign from swaps. A zero pivot makes the determinant zero.
Rank from Pivots
rank(A) = number of nonzero pivots in the row-echelon form of A
Explanation: The number of pivots encountered during elimination equals the matrix rank.
Forward Elimination Cost
Cost ≈ (2/3)n^3 = O(n^3)
Explanation: The dominant operation count for eliminating an n×n dense matrix is about (2/3)n^3 arithmetic operations; lower-order terms are usually ignored.
Back Substitution Cost
Cost ≈ n^2 = O(n^2)
Explanation: Solving the triangular system takes on the order of n^2 operations, much less than the elimination phase.
Matrix Inverse via Augmentation
[A | I] → [I | A^{-1}]
Explanation: Row-reducing the augmented matrix replaces A with I and reveals A^{-1} on the right, if and only if A is invertible.
Condition Number
κ(A) = ‖A‖ · ‖A^{-1}‖
Explanation: The condition number measures sensitivity of the solution to perturbations; large κ(A) implies potential amplification of numerical errors.
Residual
r = b − Ax
Explanation: The residual vector measures how well x satisfies Ax = b; a small residual is necessary but not sufficient for small solution error if κ(A) is large.
Consistency Check
Ax = b is consistent ⇔ rank(A) = rank([A | b])
Explanation: Comparing ranks of A and the augmented matrix detects inconsistency; unequal ranks mean the system cannot be satisfied.
Uniqueness Condition
A square system Ax = b has a unique solution ⇔ rank(A) = n
Explanation: A square system has a unique solution when A is full rank; otherwise, there are either no solutions or infinitely many.
Complexity Analysis
Forward elimination dominates the cost at about (2/3)n^3 arithmetic operations, so the overall running time is O(n^3). Back substitution adds only O(n^2), and partial pivoting contributes O(n^2) comparisons, neither of which changes the asymptotic cost. Memory use is O(n^2) for the dense matrix, and elimination can be performed in place.
Code Examples
```cpp
#include <bits/stdc++.h>
using namespace std;

// Solve Ax = b via Gaussian elimination with partial pivoting.
// Returns pair(success, solution). If success == false, solution is empty.
pair<bool, vector<double>> gaussianEliminationSolve(vector<vector<double>> A, vector<double> b) {
    const double EPS = 1e-12;
    int n = (int)A.size();
    for (int i = 0; i < n; ++i) {
        // 1) Pivot selection: choose the row with largest |A[row][i]| at/under i
        int pivot = i;
        for (int r = i + 1; r < n; ++r) {
            if (fabs(A[r][i]) > fabs(A[pivot][i])) pivot = r;
        }
        if (fabs(A[pivot][i]) < EPS) {
            // Matrix is singular or ill-conditioned for this RHS
            return {false, {}};
        }
        // 2) Swap rows in A and corresponding entry in b
        if (pivot != i) {
            swap(A[pivot], A[i]);
            swap(b[pivot], b[i]);
        }
        // 3) Eliminate entries below the pivot
        for (int r = i + 1; r < n; ++r) {
            double factor = A[r][i] / A[i][i];
            A[r][i] = 0.0; // explicitly set to zero for cleanliness
            for (int c = i + 1; c < n; ++c) A[r][c] -= factor * A[i][c];
            b[r] -= factor * b[i];
        }
    }
    // 4) Back substitution to get x
    vector<double> x(n, 0.0);
    for (int i = n - 1; i >= 0; --i) {
        double sum = b[i];
        for (int c = i + 1; c < n; ++c) sum -= A[i][c] * x[c];
        if (fabs(A[i][i]) < 1e-18) return {false, {}}; // safety
        x[i] = sum / A[i][i];
    }
    return {true, x};
}

int main() {
    // Example: Solve a 3x3 system Ax = b
    vector<vector<double>> A = {
        { 2, -1,  1},
        { 3,  3,  9},
        { 3,  3,  5}
    };
    vector<double> b = {2, -1, 4};

    auto [ok, x] = gaussianEliminationSolve(A, b);
    if (!ok) {
        cout << "No unique solution (singular/ill-conditioned matrix).\n";
        return 0;
    }
    cout.setf(ios::fixed); cout << setprecision(8);
    cout << "Solution x: ";
    for (double xi : x) cout << xi << ' ';
    cout << "\n";

    // Verify residual norm ||Ax - b||
    vector<double> r(b.size(), 0.0);
    for (int i = 0; i < (int)A.size(); ++i) {
        double s = 0.0;
        for (int j = 0; j < (int)A.size(); ++j) s += A[i][j] * x[j];
        r[i] = s - b[i];
    }
    double rn = 0.0;
    for (double ri : r) rn = max(rn, fabs(ri));
    cout << "Max residual |Ax - b|_inf = " << rn << "\n";
    return 0;
}
```
This program solves a dense linear system using Gaussian elimination with partial pivoting. Each column chooses the largest-magnitude pivot to stabilize division, then eliminates entries below the pivot to create an upper-triangular matrix. Back substitution recovers the solution. It reports failure if a pivot is near zero, signaling a singular or ill-conditioned system.
```cpp
#include <bits/stdc++.h>
using namespace std;

// Compute determinant using elimination with partial pivoting.
// Returns det(A). If matrix is (numerically) singular, returns 0.
double determinant(vector<vector<double>> A) {
    const double EPS = 1e-12;
    int n = (int)A.size();
    double det = 1.0;
    int swapCount = 0;

    for (int i = 0; i < n; ++i) {
        int pivot = i;
        for (int r = i + 1; r < n; ++r)
            if (fabs(A[r][i]) > fabs(A[pivot][i])) pivot = r;
        if (fabs(A[pivot][i]) < EPS) return 0.0; // singular
        if (pivot != i) {
            swap(A[pivot], A[i]);
            swapCount++;
        }
        // Multiply diagonal element into determinant
        det *= A[i][i];
        // Eliminate entries below pivot
        for (int r = i + 1; r < n; ++r) {
            double factor = A[r][i] / A[i][i];
            for (int c = i + 1; c < n; ++c) A[r][c] -= factor * A[i][c];
            // No need to store A[r][i] since not used later
        }
    }
    if (swapCount % 2) det = -det; // adjust sign for row swaps
    return det;
}

int main() {
    vector<vector<double>> A = {
        { 4,  2,  1},
        { 0,  3, -1},
        { 2, -2,  5}
    };
    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "det(A) = " << determinant(A) << "\n";
    return 0;
}
```
This program computes det(A) by transforming A to upper-triangular form with partial pivoting. The determinant equals the product of diagonal entries of U times (−1) raised to the number of row swaps. If a pivot is numerically zero, the determinant is zero.
```cpp
#include <bits/stdc++.h>
using namespace std;

// Compute inverse of A using Gauss–Jordan with partial pivoting.
// Returns pair(success, inverse). If not invertible, success=false.
pair<bool, vector<vector<double>>> invertMatrix(vector<vector<double>> A) {
    const double EPS = 1e-12;
    int n = (int)A.size();
    vector<vector<double>> I(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) I[i][i] = 1.0;

    for (int col = 0; col < n; ++col) {
        // 1) Find pivot row
        int pivot = col;
        for (int r = col + 1; r < n; ++r)
            if (fabs(A[r][col]) > fabs(A[pivot][col])) pivot = r;
        if (fabs(A[pivot][col]) < EPS) return {false, {}}; // singular
        // 2) Swap rows in both A and I
        if (pivot != col) {
            swap(A[pivot], A[col]);
            swap(I[pivot], I[col]);
        }
        // 3) Normalize pivot row to make pivot = 1
        double pivotVal = A[col][col];
        for (int c = 0; c < n; ++c) {
            A[col][c] /= pivotVal;
            I[col][c] /= pivotVal;
        }
        // 4) Eliminate other rows in this column
        for (int r = 0; r < n; ++r) if (r != col) {
            double factor = A[r][col];
            if (factor == 0.0) continue;
            for (int c = 0; c < n; ++c) {
                A[r][c] -= factor * A[col][c];
                I[r][c] -= factor * I[col][c];
            }
        }
    }
    return {true, I};
}

int main() {
    vector<vector<double>> A = {
        { 4, 7, 2},
        { 3, 6, 1},
        { 2, 5, 3}
    };
    auto [ok, invA] = invertMatrix(A);
    if (!ok) {
        cout << "Matrix is singular; no inverse.\n";
        return 0;
    }
    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "A^{-1}:\n";
    for (auto &row : invA) {
        for (double v : row) cout << setw(12) << v << ' ';
        cout << '\n';
    }
    return 0;
}
```
This program performs Gauss–Jordan elimination on the augmented matrix [A | I]. After pivoting, it scales the pivot row to make the pivot 1 and eliminates the pivot column from all other rows. When finished, the left block becomes I and the right block becomes A^{-1}. If a pivot is (near) zero, A is singular and not invertible.
```cpp
#include <bits/stdc++.h>
using namespace std;

struct RREFResult {
    vector<vector<double>> R; // Reduced matrix (augmented)
    int rankA;                // rank of coefficient matrix A
    int rankAug;              // rank of [A|b]
};

RREFResult rrefAugmented(vector<vector<double>> Aug, int nRows, int nColsA) {
    // Aug has size nRows x (nColsA + nRHS), where nRHS >= 1
    const double EPS = 1e-10;
    int nCols = (int)Aug[0].size();
    int row = 0;
    vector<int> pivotCol; pivotCol.reserve(min(nRows, nColsA));

    for (int col = 0; col < nColsA && row < nRows; ++col) {
        // Find pivot in this column
        int pivot = row;
        for (int r = row + 1; r < nRows; ++r)
            if (fabs(Aug[r][col]) > fabs(Aug[pivot][col])) pivot = r;
        if (fabs(Aug[pivot][col]) < EPS) continue; // no pivot in this column
        swap(Aug[pivot], Aug[row]);
        // Normalize pivot row
        double pv = Aug[row][col];
        for (int c = col; c < nCols; ++c) Aug[row][c] /= pv;
        // Eliminate in all other rows
        for (int r = 0; r < nRows; ++r) if (r != row) {
            double f = Aug[r][col];
            if (f == 0.0) continue;
            for (int c = col; c < nCols; ++c) Aug[r][c] -= f * Aug[row][c];
        }
        pivotCol.push_back(col);
        ++row;
    }

    // Compute ranks: count nonzero rows in A-part and in full augmented matrix
    int rankA = 0, rankAug = 0;
    for (int r = 0; r < nRows; ++r) {
        bool anyA = false, anyAug = false;
        for (int c = 0; c < nColsA; ++c) anyA = anyA || fabs(Aug[r][c]) > EPS;
        for (int c = 0; c < nCols; ++c) anyAug = anyAug || fabs(Aug[r][c]) > EPS;
        if (anyA) ++rankA;
        if (anyAug) ++rankAug;
    }
    return {Aug, rankA, rankAug};
}

int main() {
    // Example: 3 equations, 3 unknowns, 1 RHS
    vector<vector<double>> A = {
        { 1, 2, -1},
        { 2, 4, -2}, // dependent on row 1 (singular system)
        { 0, 1,  1}
    };
    vector<double> b = {1, 2, 3};

    int n = (int)A.size();
    int m = (int)A[0].size();

    // Build augmented matrix [A | b]
    vector<vector<double>> Aug(n, vector<double>(m + 1));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) Aug[i][j] = A[i][j];
        Aug[i][m] = b[i];
    }

    auto res = rrefAugmented(Aug, n, m);

    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "RREF([A|b]):\n";
    for (auto &row : res.R) {
        for (double v : row) cout << setw(10) << v << ' ';
        cout << '\n';
    }

    cout << "rank(A) = " << res.rankA << ", rank([A|b]) = " << res.rankAug << "\n";

    if (res.rankA < res.rankAug) {
        cout << "System is inconsistent: no solution.\n";
    } else if (res.rankA == m) {
        cout << "Unique solution. Read the last column from RREF.\n";
    } else {
        cout << "Infinitely many solutions: free variables exist.\n";
    }
    return 0;
}
```
This program reduces an augmented matrix [A | b] to reduced row-echelon form and computes rank(A) and rank([A | b]). Comparing these ranks diagnoses whether the system has no solution, a unique solution, or infinitely many. The RREF also directly encodes the solution (with free variables if rank < number of unknowns).