
Gaussian Elimination

Key Points

  • Gaussian elimination is a systematic way to solve linear equations by reducing a matrix to upper-triangular form using row swaps, scaling, and adding multiples of rows.
  • After reaching upper-triangular form, back substitution quickly retrieves the solution from bottom to top.
  • Partial pivoting (swapping to choose the largest available pivot) greatly improves numerical stability without changing the asymptotic cost.
  • The method simultaneously powers solving Ax=b, computing determinants, finding matrix rank, and inverting matrices (via Gauss–Jordan).
  • The arithmetic cost grows roughly as (2/3)n³ operations for an n×n system, which is practical for moderate n.
  • Determinants are the signed product of the pivots, and rank equals the number of nonzero pivots in row-echelon form.
  • If a pivot is zero (or tiny), the system may be singular or ill-conditioned; pivoting helps, but sometimes no unique solution exists.
  • Gaussian elimination is equivalent to LU factorization with row permutations: PA = LU.

Prerequisites

  • Systems of linear equations: Understanding what Ax = b represents and how equations relate to matrix rows is essential for understanding the purpose of elimination.
  • Basic matrix operations: Knowledge of row/column indexing, matrix–vector multiplication, and triangular matrices helps follow the algorithm steps.
  • Floating-point arithmetic: Round-off, underflow/overflow, and tolerance checks matter for numerical stability in elimination.
  • Summation and asymptotic notation: Operation counts and complexity results use summations and big-O notation.
  • Vector and matrix norms (informal): Residuals and condition numbers rely on norms to discuss accuracy and sensitivity.

Detailed Explanation


01 Overview

Hook: Imagine tidying a cluttered desk: you group similar items, put the biggest things in order, and clear what you don’t need. Gaussian elimination is that tidy-up for linear equations.

Concept: We take a system Ax = b and apply three simple row operations—swap two rows, scale a row, and add a multiple of one row to another—to simplify A into an upper-triangular matrix U. In U, all entries below the main diagonal are zero, so each equation has fewer unknowns than the one above. That structure makes solving straightforward with back substitution: start from the last equation (one unknown), solve it, and substitute upward.

Example: For a 3×3 system, we pick a pivot in the first column and eliminate entries below it. We then move to column two (one row down, one column right) and repeat, and finally column three. If a pivot is zero or suspiciously small, we swap with a lower row that has a larger entry in that column (partial pivoting) to avoid dividing by tiny numbers. After these steps, A has been transformed into U, and we quickly solve for x3, x2, x1. This engine also lets us compute determinants (product of diagonal entries times the sign from swaps), matrix rank (the number of pivots), and inverses (by running elimination on [A | I]).

02 Intuition & Analogies

Hook: Think of solving equations like draining water from stacked containers. You open the lowest tap first (bottom equation with one unknown), then the level above becomes easier, and so on.

Concept: To reach that stack with a clean single-unknown bottom, we systematically knock out entries below a leading element (pivot) in each column. The allowed moves—swap, scale, and add multiples—are like reordering items, resizing, and combining groups on your desk: they don’t change the solution set, but they make the layout clearer. The goal is a staircase pattern (row-echelon form) where each lower row starts farther to the right.

Example: Suppose we have three equations with three unknowns. Pick a leading entry in the first column (the pivot). Use it to eliminate the first variable from all rows below by subtracting suitable multiples. Now the first column is clean under the pivot. Move to the next column and repeat, but only on the rows beneath the previous pivot—like tidying one shelf at a time. If the chosen pivot is tiny, the arithmetic becomes wobbly (dividing by a tiny number amplifies errors). So we swap in a stronger pivot (partial pivoting), just like moving a stable box to the bottom of a stack. By the end, the matrix is triangular, and each step in back substitution is like undoing one zipper tooth at a time—quick and controlled.

03 Formal Definition

Hook: Behind the desk-cleaning analogy lies a precise algebraic machine.

Concept: Gaussian elimination applies elementary row operations to a matrix A ∈ ℝ^{m×n} to produce a row-echelon form (REF) U, optionally proceeding to reduced row-echelon form (RREF). The allowed operations are: (1) swap rows R_i ↔ R_j; (2) scale a row R_i ← αR_i with α ≠ 0; (3) add a multiple of one row to another, R_i ← R_i + βR_j. These operations are equivalent to premultiplying by elementary matrices and preserve the solution set of Ax = b when applied to the augmented matrix [A | b].

Example: For a square nonsingular A, elimination yields PA = LU, where P is a permutation matrix encoding the row swaps, L is unit lower-triangular (ones on the diagonal) capturing the multipliers, and U is upper-triangular. Back substitution solves Ux = y, where Ly = Pb. The determinant satisfies det(A) = det(P) det(L) det(U) = (−1)^{#swaps} ∏_{i=1}^n u_{ii}. Rank equals the number of pivots. If some pivot is zero and cannot be replaced by a lower nonzero entry, the system is rank-deficient, indicating either no solution (inconsistent) or infinitely many solutions (underdetermined).
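For concreteness, the three elementary row operations on a 3×3 matrix correspond to premultiplying by matrices of the following shapes (the row indices and symbols α, β here are chosen purely for illustration):

```latex
% Swap R_1 \leftrightarrow R_2, scale R_2 \leftarrow \alpha R_2, add R_3 \leftarrow R_3 + \beta R_1
E_{\text{swap}} =
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
E_{\text{scale}} =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
E_{\text{add}} =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \beta & 0 & 1 \end{pmatrix}
```

Applying an operation to A computes EA; collecting all such matrices produced during elimination with pivoting is exactly what yields the factorization PA = LU.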

04 When to Use

Hook: Use the right tool for the mess you have.

Concept: Gaussian elimination is the default direct method for solving small-to-moderate linear systems and for extracting structural information from a matrix. Use it when you need a deterministic, finite-step answer with predictable cost.

Example use cases: (1) Solve Ax = b for dense, moderately sized n (e.g., n ≤ a few thousands) where O(n^3) time is acceptable. (2) Compute det(A) robustly using LU/GE with pivoting rather than cofactor expansion (which is exponential). (3) Determine rank(A) and diagnose system consistency by comparing rank(A) to rank([A | b]). (4) Invert matrices via Gauss–Jordan when you truly need A^{-1} (e.g., pre/post-processing), though in practice solving Ax = b is usually better than forming A^{-1}. (5) Handle multiple right-hand sides efficiently: factor once (PA = LU) and solve Ly = Pb, Ux = y for each b.

Avoid when: matrices are huge and sparse with special structure—then specialized sparse or iterative methods (e.g., Conjugate Gradient for SPD, GMRES) are typically superior. For nearly singular or ill-conditioned matrices, prefer pivoting and possibly iterative refinement to improve accuracy.

⚠️ Common Mistakes

Hook: Most errors come from mixing up safe moves with risky arithmetic.

Concept: Common pitfalls include dividing by tiny pivots (amplifies rounding errors), forgetting to swap the right-hand side when pivoting, assuming a unique solution without checking rank, and computing determinants by naive cofactor expansion.

Example fixes: (1) Use partial pivoting—choose the largest magnitude pivot in the current column; if |pivot| < ε, treat the system as singular/ill-conditioned. (2) Always apply the same row swaps and operations to b (or to the right block when inverting) as you do to A; otherwise, you change the problem. (3) Detect singularity and solution type: compute rank(A) and rank([A | b]); if ranks differ, no solution; if equal but < n, infinite solutions. (4) Do not explicitly compute A^{-1} just to solve Ax = b; factor once and back substitute—it’s faster and more stable. (5) Mind floating-point tolerances: comparisons to zero should use a small ε and consistent scaling. (6) Beware of overwriting needed values; structure your loops so updated entries aren’t reused incorrectly. (7) Keep track of row swap parity when computing det(A); forgetting the sign yields incorrect determinants.

Key Formulas

LU Factorization with Pivoting

PA = LU

Explanation: Any nonsingular matrix A can be row-permuted (by P) so that it factors into L (unit lower-triangular) and U (upper-triangular). This models Gaussian elimination with partial pivoting.

Determinant from U

det(A) = (−1)^s · ∏_{i=1}^{n} u_{ii}

Explanation: After elimination to U with s row swaps, the determinant of A equals the product of U’s diagonal times the sign from swaps. A zero pivot makes the determinant zero.

Rank from Pivots

rank(A) = number of pivots in REF(A)

Explanation: The number of pivots encountered during elimination equals the matrix rank.

Forward Elimination Cost

T(n) ≈ (2/3)n³

Explanation: The dominant operation count for eliminating an n×n dense matrix is about (2/3)n³ arithmetic operations; lower-order terms are usually ignored.

Back Substitution Cost

n(n−1)/2 ≈ n²/2 multiply–add operations

Explanation: Solving the triangular system takes on the order of n² operations, much less than the elimination phase.

Matrix Inverse via Augmentation

[A | I] → [I | A^{-1}]

Explanation: Row-reducing the augmented matrix replaces A with I and reveals A^{-1} on the right, if and only if A is invertible.

Condition Number

κ(A) = ‖A‖ · ‖A^{-1}‖

Explanation: The condition number measures sensitivity of the solution to perturbations; large κ(A) implies potential amplification of numerical errors.

Residual

r = b − Ax

Explanation: The residual vector measures how well x satisfies Ax = b; small residual is necessary but not sufficient for small solution error if κ(A) is large.

Consistency Check

rank(A) = rank([A | b]) ⇔ system is consistent

Explanation: Comparing ranks of A and the augmented matrix detects inconsistency; unequal ranks mean the system cannot be satisfied.

Uniqueness Condition

rank(A) = n (equivalently det(A) ≠ 0) ⇔ unique solution

Explanation: A square system has a unique solution when A is full rank; otherwise, there are either no solutions or infinitely many.

Complexity Analysis

For an n×n dense system, Gaussian elimination’s dominant cost is forward elimination. At step k, you eliminate entries in roughly (n−k) rows and (n−k) columns, costing about 2(n−k)^2 floating-point operations (one multiply and one subtract per entry). Summing over k from 1 to n−1 yields T(n) = (2n³ − 3n² + n)/3 ≈ (2/3)n³, so the asymptotic time complexity is O(n³). Back substitution is cheaper: it takes about n(n−1)/2 multiply–add operations, i.e., O(n²). Thus total time is O(n³), with the constant dominated by elimination. Partial pivoting adds an O(n) scan per column for the pivot, which is subsumed by the O(n³) term.

Space complexity is O(n²) to store the matrix. In-place implementations overwrite A with its L and U factors (storing multipliers below the diagonal), requiring only O(n) extra space for temporary variables and the right-hand side. For k right-hand sides, reusing a single factorization reduces total work to O(n³ + kn²), which is far better than solving each system from scratch.

Gauss–Jordan inversion roughly doubles the work of elimination because it eliminates both below and above each pivot, yielding about O(n³) operations with a higher constant (roughly 2/3·n³ up to n³ or more, depending on implementation). For determinants, the extra overhead beyond a single elimination is negligible; the determinant is the signed product of the diagonal of U. In rank computation and consistency checks, reducing to REF or RREF has the same O(n³) worst-case time, but early termination or sparsity can lower practical cost. Numerical stability with partial pivoting is generally good for most matrices, though ill-conditioned problems may still require iterative refinement or higher precision.

Code Examples

Solve Ax = b using Gaussian Elimination with Partial Pivoting
#include <bits/stdc++.h>
using namespace std;

// Solve Ax = b via Gaussian elimination with partial pivoting.
// Returns pair(success, solution). If success == false, solution is empty.
pair<bool, vector<double>> gaussianEliminationSolve(vector<vector<double>> A, vector<double> b) {
    const double EPS = 1e-12;
    int n = (int)A.size();
    for (int i = 0; i < n; ++i) {
        // 1) Pivot selection: choose the row with largest |A[row][i]| at/under i
        int pivot = i;
        for (int r = i + 1; r < n; ++r) {
            if (fabs(A[r][i]) > fabs(A[pivot][i])) pivot = r;
        }
        if (fabs(A[pivot][i]) < EPS) {
            // Matrix is singular or ill-conditioned for this RHS
            return {false, {}};
        }
        // 2) Swap rows in A and corresponding entry in b
        if (pivot != i) {
            swap(A[pivot], A[i]);
            swap(b[pivot], b[i]);
        }
        // 3) Eliminate entries below the pivot
        for (int r = i + 1; r < n; ++r) {
            double factor = A[r][i] / A[i][i];
            A[r][i] = 0.0; // explicitly set to zero for cleanliness
            for (int c = i + 1; c < n; ++c) A[r][c] -= factor * A[i][c];
            b[r] -= factor * b[i];
        }
    }
    // 4) Back substitution to get x
    vector<double> x(n, 0.0);
    for (int i = n - 1; i >= 0; --i) {
        double sum = b[i];
        for (int c = i + 1; c < n; ++c) sum -= A[i][c] * x[c];
        if (fabs(A[i][i]) < 1e-18) return {false, {}}; // safety
        x[i] = sum / A[i][i];
    }
    return {true, x};
}

int main() {
    // Example: Solve a 3x3 system Ax = b
    vector<vector<double>> A = {
        { 2, -1, 1},
        { 3,  3, 9},
        { 3,  3, 5}
    };
    vector<double> b = {2, -1, 4};

    auto [ok, x] = gaussianEliminationSolve(A, b);
    if (!ok) {
        cout << "No unique solution (singular/ill-conditioned matrix).\n";
        return 0;
    }
    cout.setf(ios::fixed); cout << setprecision(8);
    cout << "Solution x: ";
    for (double xi : x) cout << xi << ' ';
    cout << "\n";

    // Verify residual norm ||Ax - b||
    vector<double> r(b.size(), 0.0);
    for (int i = 0; i < (int)A.size(); ++i) {
        double s = 0.0;
        for (int j = 0; j < (int)A.size(); ++j) s += A[i][j] * x[j];
        r[i] = s - b[i];
    }
    double rn = 0.0; for (double ri : r) rn = max(rn, fabs(ri));
    cout << "Max residual |Ax - b|_inf = " << rn << "\n";
    return 0;
}

This program solves a dense linear system using Gaussian elimination with partial pivoting. Each column chooses the largest-magnitude pivot to stabilize division, then eliminates entries below the pivot to create an upper-triangular matrix. Back substitution recovers the solution. It reports failure if a pivot is near zero, signaling a singular or ill-conditioned system.

Time: O(n^3) total (O(n^3) elimination + O(n^2) back substitution)
Space: O(n^2) for the matrix, O(n) extra for vectors and temporaries
Determinant via Gaussian Elimination (Partial Pivoting)
#include <bits/stdc++.h>
using namespace std;

// Compute determinant using elimination with partial pivoting.
// Returns det(A). If matrix is (numerically) singular, returns 0.
double determinant(vector<vector<double>> A) {
    const double EPS = 1e-12;
    int n = (int)A.size();
    double det = 1.0;
    int swapCount = 0;

    for (int i = 0; i < n; ++i) {
        int pivot = i;
        for (int r = i + 1; r < n; ++r) if (fabs(A[r][i]) > fabs(A[pivot][i])) pivot = r;
        if (fabs(A[pivot][i]) < EPS) return 0.0; // singular
        if (pivot != i) {
            swap(A[pivot], A[i]);
            swapCount++;
        }
        // Multiply diagonal element into determinant
        det *= A[i][i];
        // Eliminate entries below pivot
        for (int r = i + 1; r < n; ++r) {
            double factor = A[r][i] / A[i][i];
            for (int c = i + 1; c < n; ++c) A[r][c] -= factor * A[i][c];
            // No need to store A[r][i] since not used later
        }
    }
    if (swapCount % 2) det = -det; // adjust sign for row swaps
    return det;
}

int main() {
    vector<vector<double>> A = {
        { 4,  2,  1},
        { 0,  3, -1},
        { 2, -2,  5}
    };
    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "det(A) = " << determinant(A) << "\n";
    return 0;
}

This program computes det(A) by transforming A to upper-triangular form with partial pivoting. The determinant equals the product of diagonal entries of U times (−1) raised to the number of row swaps. If a pivot is numerically zero, the determinant is zero.

Time: O(n^3)
Space: O(n^2) for the matrix (in-place), O(1) extra aside from loop variables
Matrix Inverse via Gauss–Jordan Elimination (Partial Pivoting)
#include <bits/stdc++.h>
using namespace std;

// Compute inverse of A using Gauss–Jordan with partial pivoting.
// Returns pair(success, inverse). If not invertible, success=false.
pair<bool, vector<vector<double>>> invertMatrix(vector<vector<double>> A) {
    const double EPS = 1e-12;
    int n = (int)A.size();
    vector<vector<double>> I(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) I[i][i] = 1.0;

    for (int col = 0; col < n; ++col) {
        // 1) Find pivot row
        int pivot = col;
        for (int r = col + 1; r < n; ++r) if (fabs(A[r][col]) > fabs(A[pivot][col])) pivot = r;
        if (fabs(A[pivot][col]) < EPS) return {false, {}}; // singular
        // 2) Swap rows in both A and I
        if (pivot != col) {
            swap(A[pivot], A[col]);
            swap(I[pivot], I[col]);
        }
        // 3) Normalize pivot row to make pivot = 1
        double pivotVal = A[col][col];
        for (int c = 0; c < n; ++c) {
            A[col][c] /= pivotVal;
            I[col][c] /= pivotVal;
        }
        // 4) Eliminate other rows in this column
        for (int r = 0; r < n; ++r) if (r != col) {
            double factor = A[r][col];
            if (factor == 0.0) continue;
            for (int c = 0; c < n; ++c) {
                A[r][c] -= factor * A[col][c];
                I[r][c] -= factor * I[col][c];
            }
        }
    }
    return {true, I};
}

int main() {
    vector<vector<double>> A = {
        { 4, 7, 2},
        { 3, 6, 1},
        { 2, 5, 3}
    };
    auto [ok, invA] = invertMatrix(A);
    if (!ok) {
        cout << "Matrix is singular; no inverse.\n";
        return 0;
    }
    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "A^{-1}:\n";
    for (auto &row : invA) {
        for (double v : row) cout << setw(12) << v << ' ';
        cout << '\n';
    }
    return 0;
}

This program performs Gauss–Jordan elimination on the augmented matrix [A | I]. After pivoting, it scales the pivot row to make the pivot 1 and eliminates the pivot column from all other rows. When finished, the left block becomes I and the right block becomes A^{-1}. If a pivot is (near) zero, A is singular and not invertible.

Time: O(n^3) with a larger constant than standard elimination
Space: O(n^2) to store A and the augmented identity; in-place storage keeps overhead modest
RREF and Rank of an Augmented System (Detect Unique/None/Infinite Solutions)
#include <bits/stdc++.h>
using namespace std;

struct RREFResult {
    vector<vector<double>> R; // Reduced matrix (augmented)
    int rankA;                // rank of coefficient matrix A
    int rankAug;              // rank of [A|b]
};

RREFResult rrefAugmented(vector<vector<double>> Aug, int nRows, int nColsA) {
    // Aug has size nRows x (nColsA + nRHS), where nRHS >= 1
    const double EPS = 1e-10;
    int nCols = (int)Aug[0].size();
    int row = 0;
    vector<int> pivotCol; pivotCol.reserve(min(nRows, nColsA));

    for (int col = 0; col < nColsA && row < nRows; ++col) {
        // Find pivot in this column
        int pivot = row;
        for (int r = row + 1; r < nRows; ++r)
            if (fabs(Aug[r][col]) > fabs(Aug[pivot][col])) pivot = r;
        if (fabs(Aug[pivot][col]) < EPS) continue; // no pivot in this column
        swap(Aug[pivot], Aug[row]);
        // Normalize pivot row
        double pv = Aug[row][col];
        for (int c = col; c < nCols; ++c) Aug[row][c] /= pv;
        // Eliminate in all other rows
        for (int r = 0; r < nRows; ++r) if (r != row) {
            double f = Aug[r][col];
            if (f == 0.0) continue;
            for (int c = col; c < nCols; ++c) Aug[r][c] -= f * Aug[row][c];
        }
        pivotCol.push_back(col);
        ++row;
    }

    // Compute ranks: count nonzero rows in A-part and in full augmented matrix
    int rankA = 0, rankAug = 0;
    for (int r = 0; r < nRows; ++r) {
        bool anyA = false, anyAug = false;
        for (int c = 0; c < nColsA; ++c) anyA = anyA || fabs(Aug[r][c]) > EPS;
        for (int c = 0; c < nCols; ++c) anyAug = anyAug || fabs(Aug[r][c]) > EPS;
        if (anyA) ++rankA;
        if (anyAug) ++rankAug;
    }
    return {Aug, rankA, rankAug};
}

int main() {
    // Example: 3 equations, 3 unknowns, 1 RHS
    vector<vector<double>> A = {
        { 1, 2, -1},
        { 2, 4, -2}, // dependent on row 1 (singular system)
        { 0, 1,  1}
    };
    vector<double> b = {1, 2, 3};

    int n = (int)A.size();
    int m = (int)A[0].size();

    // Build augmented matrix [A | b]
    vector<vector<double>> Aug(n, vector<double>(m + 1));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) Aug[i][j] = A[i][j];
        Aug[i][m] = b[i];
    }

    auto res = rrefAugmented(Aug, n, m);

    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "RREF([A|b]):\n";
    for (auto &row : res.R) {
        for (double v : row) cout << setw(10) << v << ' ';
        cout << '\n';
    }

    cout << "rank(A) = " << res.rankA << ", rank([A|b]) = " << res.rankAug << "\n";

    if (res.rankA < res.rankAug) {
        cout << "System is inconsistent: no solution.\n";
    } else if (res.rankA == m) {
        cout << "Unique solution. Read the last column from RREF.\n";
    } else {
        cout << "Infinitely many solutions: free variables exist.\n";
    }
    return 0;
}

This program reduces an augmented matrix [A | b] to reduced row-echelon form and computes rank(A) and rank([A | b]). Comparing these ranks diagnoses whether the system has no solution, a unique solution, or infinitely many. The RREF also directly encodes the solution (with free variables if rank < number of unknowns).

Time: O(n^3) in the worst case for dense matrices
Space: O(n^2) to store the augmented matrix plus O(1) extra