Graph Laplacian
Key Points
- The graph Laplacian translates a graph's connectivity into a matrix that measures how much a function varies across edges.
- The unnormalized Laplacian is L = D - A, while the symmetric normalized Laplacian is L_{sym} = I - D^{-1/2} A D^{-1/2}.
- Its quadratic form x^T L x equals the total smoothness penalty across edges, so small values mean x changes slowly over the graph.
- L is symmetric positive semidefinite for undirected graphs with nonnegative weights, so all eigenvalues are nonnegative.
- The number of connected components equals the multiplicity of the zero eigenvalue; the constant vector is always in the nullspace (L 1 = 0).
- Spectral clustering uses the second eigenvector (the Fiedler vector) to partition a graph with small edge cuts and balanced volumes.
- Matrix-vector products with L or its normalized variants are O(m), where m is the number of edges, enabling scalable iterative methods.
- Normalized versions are essential when node degrees vary widely, making comparisons fair across high- and low-degree nodes.
Prerequisites
- Linear algebra (vectors, matrices, eigenvalues) — The Laplacian is a matrix and its key properties are expressed via eigenvalues, eigenvectors, and quadratic forms.
- Graph basics (undirected graphs, adjacency, degree) — Definitions of A, D, and connectivity rely on standard graph concepts.
- Matrix calculus / quadratic forms — Understanding x^T L x as energy is central to intuition and optimization.
- Iterative methods (power iteration) — Eigenvector computations for large graphs use iterative algorithms.
- Sparse data structures — Efficient storage and O(m) matrix-vector products require sparse representations.
- Probability / Markov chains (optional) — Normalized matrices (D^{-1}A) connect to random walks and diffusion interpretations.
- Optimization basics — Spectral relaxations of cut problems and regularization use energy minimization.
- Numerical stability and normalization — Normalization choices (L vs L_sym) and step sizes (τ) affect convergence and correctness.
Detailed Explanation
01 Overview
The graph Laplacian is a core matrix in spectral graph theory that encodes how nodes in a graph connect and interact. For an undirected weighted graph with adjacency matrix A and degree matrix D (diagonal with D_{ii} = sum_j A_{ij}), the unnormalized Laplacian is L = D - A. A commonly used variant is the symmetric normalized Laplacian L_{sym} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}. These matrices are symmetric and positive semidefinite if the graph has nonnegative edge weights. Intuitively, they penalize differences across connected nodes; functions that change a lot across edges have high Laplacian energy.
This property makes the Laplacian central to many algorithms: spectral clustering, graph partitioning, semi-supervised learning, graph signal processing (smoothing and denoising), diffusion processes, and solving certain linear systems. The spectrum (eigenvalues and eigenvectors) of the Laplacian reveals deep structural information. The smallest eigenvalue is 0, with eigenvector 1, and the number of zero eigenvalues equals the number of connected components. The second-smallest eigenvalue (the algebraic connectivity) and its eigenvector (the Fiedler vector) guide balanced graph partitions. Normalized versions are particularly useful when node degrees vary widely, reducing degree bias.
From a computational standpoint, we often avoid forming dense matrices. Instead, we use sparse structures and perform matrix–vector products in O(m) time, where m is the number of edges. Iterative methods like power iteration, Lanczos, or conjugate gradients exploit this structure to scale to large graphs.
02 Intuition & Analogies
Imagine each edge in a graph as a spring connecting two masses at its endpoints, with stiffness equal to the edge weight. If you assign a number x_i to each node (like vertical displacement), the total spring energy is higher when neighboring nodes differ more. The Laplacian quadratic form x^T L x exactly captures this energy: it is large when values on connected nodes differ a lot, and small when values are similar. Thus, minimizing x^T L x tends to make neighboring nodes agree—like a system settling into a low-energy configuration.
Another analogy is a road network with temperatures at intersections. If heat diffuses along roads, the rate of change depends on temperature differences across connected intersections. The Laplacian drives this diffusion: applying (I - τ L) repeatedly smooths the temperatures, just as actual heat flow evens out hot and cold spots.
Consider social networks: if each person holds an opinion x_i, and friends influence each other, then large opinion gaps across friendships are “costly.” The Laplacian measures that cost. If you seek a split of the network into two communities with few cross-edges, the Fiedler vector (an eigenvector of the Laplacian) points to a way to separate the network with relatively few disagreements—similar to cutting the fewest springs or weakest friendships.
Normalization helps when some nodes are social butterflies (high degree) and others are loners (low degree). Without normalization, high-degree nodes dominate the energy and can skew partitions or smoothing. L_sym rescales by degrees, giving each node a fairer influence regardless of how many neighbors it has. In short: Laplacians turn graph connectivity into forces, heat flow, or consensus dynamics that we can compute with linear algebra.
03 Formal Definition
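The definitions can be stated compactly in the notation used throughout this article (a plain restatement of the formulas from the Overview):

```latex
% Undirected weighted graph G = (V, E), |V| = n, weights w_{ij} = A_{ij} \ge 0
\begin{aligned}
D_{ii} &= \sum_{j} A_{ij}, \qquad D_{ij} = 0 \ (i \neq j) && \text{(degree matrix)} \\
L &= D - A && \text{(unnormalized Laplacian)} \\
L_{\mathrm{sym}} &= D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2} && \text{(symmetric normalized)} \\
L_{\mathrm{rw}} &= D^{-1} L = I - D^{-1} A && \text{(random-walk normalized)} \\
x^{\top} L x &= \tfrac{1}{2} \sum_{i,j} w_{ij} (x_i - x_j)^2 \;\ge\; 0 && \text{(Dirichlet energy)}
\end{aligned}
```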
04 When to Use
- Spectral clustering and graph partitioning: Use the second eigenvector (or a few smallest eigenvectors) of L or L_{sym} to find balanced cuts with few crossing edges. Normalized cuts (Ncut) specifically favor L_{sym}.
- Graph signal processing and smoothing: When denoising values on nodes (temperatures, ratings, sensor readings), penalize x^{\top} L x to encourage neighboring nodes to agree. Iterations like x \leftarrow (I - \tau L) x approximate heat diffusion.
- Semi-supervised learning: Add a Laplacian regularization term to fit labels that vary smoothly across similar nodes, helping propagate labels from few labeled nodes to many unlabeled ones.
- Diffusion maps and embeddings: Use functions of L (e.g., e^{-tL}) or its leading eigenvectors to embed nodes into a low-dimensional space that preserves connectivity structure.
- Connectivity analysis: The number of zero eigenvalues tells you how many connected components exist. L \mathbf{1} = 0 is also a quick consistency check when implementing L.
- Preconditioning and linear solvers: Laplacians appear in PDE discretizations and as graph-structured linear systems. Iterative methods exploit sparsity and PSD properties.
- Normalization choice: Prefer L_{sym} (or L_{rw} = I - D^{-1} A) when degrees vary widely or when using normalized cut objectives; use L when absolute degrees are meaningful or for certain PDE-like discretizations.
⚠️ Common Mistakes
- Ignoring graph direction or negative weights: The standard Laplacian theory assumes undirected graphs with nonnegative weights. For directed or signed graphs, definitions and PSD properties differ; using the standard L may break guarantees.
- Mishandling isolated nodes (d_i = 0): In L_{sym}, D^{-1/2} uses zeros for isolated nodes. Forgetting this leads to divisions by zero or NaNs. Special-case nodes with degree 0 in code.
- Forgetting the trivial eigenvector: For L_{sym}, the normalized adjacency S = D^{-1/2} A D^{-1/2} has top eigenvector D^{1/2} \mathbf{1}. When computing the second eigenvector by power iteration, project out this trivial component to avoid converging to it.
- Building dense matrices: Forming dense L for large sparse graphs is memory-inefficient (O(n^2)). Prefer sparse adjacency lists and implement matrix–vector products in O(m).
- Mixing normalization conventions: L, L_{sym}, and L_{rw} behave differently. Using the wrong one can skew clustering or smoothing. Match your choice to the objective (e.g., normalized cut prefers L_{sym}).
- Incorrect scaling in normalized operations: When computing y = L_{sym} x or y = S x, you must scale by 1/\sqrt{d_i d_j}. Missing either factor changes the spectrum and invalidates results.
- Expecting exact eigenvectors from few iterations: Power iteration converges slowly when eigen-gaps are small. Use better initializations, deflation, or more advanced methods (Lanczos) for tight accuracy.
Key Formulas
Unnormalized Laplacian
L = D - A
Explanation: The Laplacian subtracts adjacency from degree, capturing how each node's value contrasts with its neighbors. This is the basic definition for undirected, nonnegative-weight graphs.
Symmetric Normalized Laplacian
L_{sym} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}
Explanation: This rescales by node degrees, reducing bias from high-degree nodes. It is symmetric with eigenvalues in [0, 2].
Nullspace property
L \mathbf{1} = 0
Explanation: Multiplying L by the all-ones vector yields zero, showing that \mathbf{1} is always an eigenvector with eigenvalue 0.
Quadratic Form / Dirichlet Energy
x^{\top} L x = \frac{1}{2} \sum_{i,j} w_{ij} (x_i - x_j)^2
Explanation: This expresses the Laplacian energy as a sum of squared differences across edges. It proves L is positive semidefinite for nonnegative weights.
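The identity can be verified by expanding the quadratic form, using the same symbols as above:

```latex
\begin{aligned}
x^{\top} L x &= x^{\top} D x - x^{\top} A x
  = \sum_i d_i x_i^2 - \sum_{i,j} w_{ij} x_i x_j \\
&= \frac{1}{2} \sum_{i,j} w_{ij} \left( x_i^2 + x_j^2 - 2 x_i x_j \right)
  = \frac{1}{2} \sum_{i,j} w_{ij} (x_i - x_j)^2 \;\ge\; 0,
\end{aligned}
```

where the second line uses \sum_j w_{ij} = d_i and each term of the final sum is nonnegative when all w_{ij} \ge 0.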
Spectrum Ordering
0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_n
Explanation: The eigenvalues of a PSD symmetric matrix are nonnegative and can be ordered. The number of zero eigenvalues equals the number of connected components.
Eigen-relationship
L_{sym} = I - S, \quad S = D^{-1/2} A D^{-1/2}
Explanation: Eigenpairs (\lambda, v) of L_{sym} correspond to eigenpairs (1 - \lambda, v) of the normalized adjacency S. The second-smallest eigenvalue of L_{sym} corresponds to the second-largest eigenvalue of S.
Rayleigh Quotient
R(x) = \frac{x^{\top} L x}{x^{\top} x}
Explanation: For any nonzero x, the quotient lies between the smallest and largest eigenvalues. Minimizing it (with constraints) recovers eigenvectors.
Incidence Factorization
L = B W B^{\top}
Explanation: With B the oriented incidence matrix and W the diagonal matrix of edge weights, this factorization shows PSD-ness. It is useful in optimization and solver design.
Heat Kernel (Matrix Exponential)
H_t = e^{-tL}
Explanation: This operator models diffusion over time t. Applying it to a signal smooths it according to the graph's connectivity.
Normalized Cut Objective
\mathrm{Ncut}(A, B) = \mathrm{cut}(A, B) \left( \frac{1}{\mathrm{vol}(A)} + \frac{1}{\mathrm{vol}(B)} \right)
Explanation: This balances the number (or weight) of crossing edges with the total degree (volume) of each side. Spectral relaxation links it to eigenvectors of L_{sym}.
Complexity Analysis
Storing the graph as adjacency lists takes O(n + m) memory. One matrix-vector product with L, S, or L_{sym} costs O(m), as does evaluating the quadratic form x^T L x. Power iteration with T steps therefore costs O(T·m) time. Forming L as a dense matrix would take O(n^2) memory and should be avoided for large sparse graphs.
Code Examples
```cpp
#include <bits/stdc++.h>
using namespace std;

struct Graph {
    int n;                                // number of nodes
    vector<vector<pair<int,double>>> adj; // adjacency list: (neighbor, weight)
    vector<double> deg;                   // weighted degree of each node
};

Graph build_graph(int n, const vector<tuple<int,int,double>>& edges) {
    Graph G; G.n = n; G.adj.assign(n, {}); G.deg.assign(n, 0.0);
    for (auto [u, v, w] : edges) {
        if (u == v) continue; // ignore self-loops for Laplacian basics
        G.adj[u].push_back({v, w});
        G.adj[v].push_back({u, w});
        G.deg[u] += w; G.deg[v] += w;
    }
    return G;
}

// y = L x where L = D - A (unnormalized). O(m)
void matvec_L(const Graph& G, const vector<double>& x, vector<double>& y) {
    int n = G.n; y.assign(n, 0.0);
    for (int i = 0; i < n; ++i) y[i] += G.deg[i] * x[i];
    for (int i = 0; i < n; ++i)
        for (auto [j, w] : G.adj[i]) y[i] -= w * x[j];
}

// y = S x where S = D^{-1/2} A D^{-1/2}. O(m)
void matvec_S(const Graph& G, const vector<double>& x, vector<double>& y) {
    int n = G.n; y.assign(n, 0.0);
    // Inverse-sqrt degrees; zero for isolated nodes avoids division by zero
    vector<double> invsqrt(n);
    for (int i = 0; i < n; ++i) invsqrt[i] = (G.deg[i] > 0.0) ? 1.0 / sqrt(G.deg[i]) : 0.0;
    for (int i = 0; i < n; ++i) {
        double si = invsqrt[i];
        for (auto [j, w] : G.adj[i]) y[i] += w * si * invsqrt[j] * x[j];
    }
}

// Quadratic form x^T L x = (1/2) sum_{i,j} w_ij (x_i - x_j)^2. O(m)
double quadratic_form_L(const Graph& G, const vector<double>& x) {
    double sum = 0.0;
    for (int i = 0; i < G.n; ++i)
        for (auto [j, w] : G.adj[i]) {
            double d = x[i] - x[j];
            sum += w * d * d; // each undirected edge appears twice in adjacency lists
        }
    return 0.5 * sum; // halve so each edge is counted once
}

int main() {
    // Example graph: a square with a diagonal
    int n = 4;
    vector<tuple<int,int,double>> edges = {
        {0,1,1.0},{1,2,1.0},{2,3,1.0},{3,0,1.0},{0,2,0.5}
    };
    Graph G = build_graph(n, edges);

    // Verify L * 1 = 0
    vector<double> one(n, 1.0), y;
    matvec_L(G, one, y);
    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "L * 1 = [";
    for (int i = 0; i < n; ++i) cout << y[i] << (i + 1 == n ? "]\n" : ", ");

    // A test vector and its Laplacian energy
    vector<double> x = {1.0, 2.0, -1.0, 0.5};
    double energy = quadratic_form_L(G, x);
    cout << "x^T L x = " << energy << "\n";

    // Compare with the Rayleigh quotient via an explicit matvec
    vector<double> Lx; matvec_L(G, x, Lx);
    double num = 0.0, den = 0.0;
    for (int i = 0; i < n; ++i) { num += x[i] * Lx[i]; den += x[i] * x[i]; }
    cout << "Rayleigh quotient (x^T L x / x^T x) = " << (num / den) << "\n";

    // One multiplication by the normalized adjacency S
    vector<double> Sx; matvec_S(G, x, Sx);
    cout << "First 2 entries of Sx: " << Sx[0] << ", " << Sx[1] << "\n";
    return 0;
}
```
This program builds an undirected weighted graph, exposes Laplacian operations as O(m) matrix–vector products, verifies that L times the all-ones vector is zero, and computes the Laplacian quadratic form. It also multiplies by the normalized adjacency S = D^{-1/2} A D^{-1/2}, which is used when working with L_sym = I − S.
```cpp
#include <bits/stdc++.h>
using namespace std;

struct Graph { int n; vector<vector<pair<int,double>>> adj; vector<double> deg; };

Graph build_graph(int n, const vector<tuple<int,int,double>>& edges) {
    Graph G; G.n = n; G.adj.assign(n, {}); G.deg.assign(n, 0.0);
    for (auto [u, v, w] : edges) {
        if (u == v) continue;
        G.adj[u].push_back({v, w});
        G.adj[v].push_back({u, w});
        G.deg[u] += w; G.deg[v] += w;
    }
    return G;
}

// y = S x with S = D^{-1/2} A D^{-1/2}
void matvec_S(const Graph& G, const vector<double>& x, vector<double>& y,
              const vector<double>& invsqrt) {
    int n = G.n; y.assign(n, 0.0);
    for (int i = 0; i < n; ++i) {
        double si = invsqrt[i];
        for (auto [j, w] : G.adj[i]) y[i] += w * si * invsqrt[j] * x[j];
    }
}

// L2-normalize a vector in place
void normalize(vector<double>& x) {
    double nrm = 0.0; for (double v : x) nrm += v * v;
    nrm = sqrt(max(1e-30, nrm));
    for (double& v : x) v /= nrm;
}

double dot(const vector<double>& a, const vector<double>& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Power iteration for the 2nd-largest eigenvector of S, projecting out v1 = D^{1/2} 1
vector<double> second_eigenvector_S(const Graph& G, int iters = 200, double tol = 1e-8) {
    int n = G.n;
    vector<double> invsqrt(n), vsqrt(n);
    for (int i = 0; i < n; ++i) {
        if (G.deg[i] > 0) { invsqrt[i] = 1.0 / sqrt(G.deg[i]); vsqrt[i] = sqrt(G.deg[i]); }
        else              { invsqrt[i] = 0.0; vsqrt[i] = 0.0; }
    }
    // v1 is proportional to D^{1/2} 1, the trivial top eigenvector of S
    vector<double> v1 = vsqrt; normalize(v1);

    // Random initial vector made orthogonal to v1
    mt19937 rng(42); normal_distribution<double> N(0.0, 1.0);
    vector<double> x(n); for (int i = 0; i < n; ++i) x[i] = N(rng);
    double alpha = dot(x, v1);
    for (int i = 0; i < n; ++i) x[i] -= alpha * v1[i];
    normalize(x);

    vector<double> y(n);
    double prev_lambda = 0.0;
    for (int t = 0; t < iters; ++t) {
        matvec_S(G, x, y, invsqrt);
        // Deflate the component along v1 again (numerical drift)
        double beta = dot(y, v1);
        for (int i = 0; i < n; ++i) y[i] -= beta * v1[i];
        // Rayleigh quotient, taken before normalizing y: lambda = x^T S x
        // (valid since x is unit-norm)
        double lambda = dot(x, y);
        normalize(y);
        x.swap(y);
        if (fabs(lambda - prev_lambda) < tol) break;
        prev_lambda = lambda;
    }
    return x; // approximate 2nd eigenvector of S
}

int main() {
    // Two clusters joined by a weak bridge
    int n = 8;
    vector<tuple<int,int,double>> edges;
    auto add_chain = [&](int start) {
        for (int i = start; i < start + 3; ++i) edges.push_back({i, i + 1, 1.0});
    };
    add_chain(0);                  // 0-1-2-3
    add_chain(4);                  // 4-5-6-7
    edges.push_back({3, 4, 0.1});  // weak bridge

    Graph G = build_graph(n, edges);
    vector<double> v2 = second_eigenvector_S(G);

    // Partition by median threshold for better balance
    vector<double> tmp = v2;
    nth_element(tmp.begin(), tmp.begin() + n / 2, tmp.end());
    double thr = tmp[n / 2];
    vector<int> part(n);
    for (int i = 0; i < n; ++i) part[i] = (v2[i] > thr) ? 1 : 0;

    cout.setf(ios::fixed); cout << setprecision(4);
    cout << "Approximate 2nd eigenvector (v2):\n";
    for (int i = 0; i < n; ++i) cout << i << ": " << v2[i] << "\n";
    cout << "Partition labels (0/1):\n";
    for (int i = 0; i < n; ++i) cout << i << ": " << part[i] << "\n";
    return 0;
}
```
This code computes the second-largest eigenvector of S = D^{-1/2} A D^{-1/2} using power iteration with deflation of the trivial eigenvector v1 = D^{1/2} 1. Thresholding this eigenvector produces a bipartition that approximately minimizes a normalized cut. The method runs in O(T·m) time for T iterations and uses O(n + m) memory.
```cpp
#include <bits/stdc++.h>
using namespace std;

struct Graph { int n; vector<vector<pair<int,double>>> adj; vector<double> deg; };

Graph build_graph(int n, const vector<tuple<int,int,double>>& edges) {
    Graph G; G.n = n; G.adj.assign(n, {}); G.deg.assign(n, 0.0);
    for (auto [u, v, w] : edges) {
        if (u == v) continue;
        G.adj[u].push_back({v, w});
        G.adj[v].push_back({u, w});
        G.deg[u] += w; G.deg[v] += w;
    }
    return G;
}

// y = L x = D x - A x
void matvec_L(const Graph& G, const vector<double>& x, vector<double>& y) {
    int n = G.n; y.assign(n, 0.0);
    for (int i = 0; i < n; ++i) y[i] += G.deg[i] * x[i];
    for (int i = 0; i < n; ++i)
        for (auto [j, w] : G.adj[i]) y[i] -= w * x[j];
}

// One explicit diffusion step: x <- x - tau * L x
void diffuse_step(const Graph& G, vector<double>& x, double tau) {
    vector<double> Lx; matvec_L(G, x, Lx);
    for (int i = 0; i < G.n; ++i) x[i] -= tau * Lx[i];
}

int main() {
    // Line graph with a noisy signal
    int n = 10;
    vector<tuple<int,int,double>> edges;
    for (int i = 0; i < n - 1; ++i) edges.push_back({i, i + 1, 1.0});
    Graph G = build_graph(n, edges);

    // Noisy step signal
    vector<double> x(n);
    for (int i = 0; i < n; ++i) x[i] = (i < n / 2 ? 1.0 : -1.0) + 0.3 * sin(3 * i);

    double dmax = 0.0; for (double d : G.deg) dmax = max(dmax, d);
    double tau = 0.45 / max(1.0, dmax); // conservative step size for stability

    cout.setf(ios::fixed); cout << setprecision(4);
    cout << "Initial signal:\n";
    for (int i = 0; i < n; ++i) cout << x[i] << (i + 1 == n ? "\n" : " ");

    int T = 50; // number of diffusion steps
    for (int t = 0; t < T; ++t) diffuse_step(G, x, tau);

    cout << "Smoothed signal after diffusion:\n";
    for (int i = 0; i < n; ++i) cout << x[i] << (i + 1 == n ? "\n" : " ");
    return 0;
}
```
This example performs explicit Euler diffusion steps x ← (I − τ L) x to smooth a noisy signal on a line graph. Each step is O(m), and a sufficiently small τ ensures stability and gradual smoothing that respects graph connectivity.