📚 Theory · Intermediate

Spectral Graph Theory

Key Points

  • Spectral graph theory studies graphs through the eigenvalues and eigenvectors of matrices such as the adjacency matrix A and the Laplacians L and L_norm.
  • The second smallest eigenvalue of the Laplacian, λ_2 (the Fiedler value), quantifies how well a graph is connected and guides graph partitioning.
  • Cheeger’s inequality links λ_2 to the quality of the best graph cut, explaining why spectral methods find good clusters.
  • The normalized Laplacian L_norm = I − D^{-1/2} A D^{-1/2} is scale-invariant and preferred for graphs with irregular degrees.
  • Spectral clustering embeds nodes using Laplacian eigenvectors and then applies k-means to get communities.
  • Random walks on graphs have stationary distribution proportional to node degrees; the spectral gap controls mixing time.
  • Graph neural networks often approximate spectral filters defined by Laplacian eigen-decompositions using fast polynomial filters.
  • In practice, iterative methods that multiply by sparse matrices yield scalable spectral algorithms with O(m) work per iteration.

Prerequisites

  • Linear algebra (eigenvalues, eigenvectors, orthogonality): Spectral methods rely on eigen-decompositions, Rayleigh quotients, and orthogonalization.
  • Graph theory basics: Understanding adjacency, degree, paths, and cuts is necessary to interpret Laplacians and partitions.
  • Probability and Markov chains: Random walks, stationary distributions, and mixing times connect to spectral gaps.
  • Optimization and k-means clustering: Spectral clustering uses k-means on eigenvector embeddings.
  • Numerical methods: Iterative eigen-solvers and convergence criteria are key for scalable implementations.

Detailed Explanation


01 Overview

Hook → If you could listen to a graph, what would it sound like? Spectral graph theory answers this by turning a graph into a musical instrument whose notes are its eigenvalues and whose harmonics are eigenvectors. These frequencies reveal connectivity, bottlenecks, and cluster structure that are hard to see directly from edges. Concept → We represent a graph with matrices: the adjacency matrix A, the degree matrix D, the combinatorial Laplacian L = D − A, and the normalized Laplacian L_norm = I − D^{-1/2} A D^{-1/2}. The eigenvalues and eigenvectors of these matrices encode structural properties. For instance, the second smallest eigenvalue of L (the Fiedler value) measures how well-connected the graph is; its eigenvector guides us to split the graph into two balanced pieces with few crossing edges. Example → In spectral clustering, we compute a few of the smallest eigenvectors of L_norm (or equivalently the largest eigenvectors of D^{-1/2} A D^{-1/2}), embed nodes as points in low-dimensional space, and then run k-means. This approach underlies community detection in networks, image segmentation in vision, and feature learning in graph neural networks.
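To make the matrix definitions concrete, here is a minimal standalone sketch (not part of the lesson's main examples) that builds A, D, L = D − A, and L_norm for a 3-node path graph and prints them; the tiny graph is chosen only so the matrices are easy to read.

#include <cstdio>
#include <vector>
#include <cmath>

int main() {
    // Tiny path graph 0 - 1 - 2.
    int n = 3;
    std::vector<std::vector<double>> A(n, std::vector<double>(n, 0.0));
    A[0][1] = A[1][0] = 1.0;
    A[1][2] = A[2][1] = 1.0;

    // Degrees d_i = row sums of A; D is the diagonal matrix of degrees.
    std::vector<double> d(n, 0.0);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) d[i] += A[i][j];

    auto print = [&](const char *name, const std::vector<std::vector<double>> &M) {
        std::printf("%s:\n", name);
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < n; ++j) std::printf("%7.3f ", M[i][j]);
            std::printf("\n");
        }
    };

    // L = D - A and L_norm = I - D^{-1/2} A D^{-1/2}.
    std::vector<std::vector<double>> L(n, std::vector<double>(n, 0.0)), Ln(n, std::vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            L[i][j]  = (i == j ? d[i] : 0.0) - A[i][j];
            Ln[i][j] = (i == j ? 1.0  : 0.0) - A[i][j] / std::sqrt(d[i] * d[j]);
        }
    print("L = D - A", L);
    print("L_norm = I - D^{-1/2} A D^{-1/2}", Ln);
    return 0;
}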

02 Intuition & Analogies

Hook → Imagine a network of masses and springs: nodes are masses, edges are springs. When you pluck this system, it vibrates in certain modes at certain frequencies. Low-frequency modes deform the system smoothly, revealing large-scale structure; high-frequency modes wiggle locally. Concept → Graph Laplacians are the mathematical version of this mass–spring system. The Laplacian energy x^T L x measures how much the signal x changes across edges. Eigenvectors minimize this energy subject to constraints; the smallest-energy nontrivial eigenvector (Fiedler vector) is the smoothest way to vary over the graph while not being constant. Where this vector changes sign is where the graph “wants to split,” revealing a natural cut. Example → Suppose a social network has two friend groups with a few cross-group ties. If you color each node by the value of the Fiedler vector, nodes in the same community have similar colors, and there’s a sharp transition across the sparse bridge. Sorting nodes by this value and cutting at the transition yields a clean partition. For k communities, multiple low-frequency eigenvectors serve as coordinates; points from the same community cluster together, and k-means separates them.
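A tiny sketch of the energy view, assuming an unweighted toy graph of two triangles joined by one bridge edge: the signal that is constant within each group has low Laplacian energy x^T L x (only the bridge disagrees), while a rapidly alternating signal has high energy.

#include <cstdio>
#include <vector>
#include <utility>

// Laplacian energy x^T L x computed directly as the sum of squared
// differences across edges, without forming L explicitly.
double laplacian_energy(const std::vector<std::pair<int,int>> &edges, const std::vector<double> &x) {
    double e = 0.0;
    for (auto [u, v] : edges) {
        double diff = x[u] - x[v];
        e += diff * diff;
    }
    return e;
}

int main() {
    // Two triangles {0,1,2} and {3,4,5} joined by the single bridge edge (2,3).
    std::vector<std::pair<int,int>> edges = {{0,1},{1,2},{0,2},{3,4},{4,5},{3,5},{2,3}};

    std::vector<double> smooth = {-1,-1,-1, 1, 1, 1};   // constant within each group
    std::vector<double> wiggly = {-1, 1,-1, 1,-1, 1};   // alternates across many edges

    std::printf("energy(smooth) = %.1f\n", laplacian_energy(edges, smooth)); // only the bridge differs
    std::printf("energy(wiggly) = %.1f\n", laplacian_energy(edges, wiggly)); // many edges differ
    return 0;
}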

03 Formal Definition

Hook → To formalize the vibration picture, we translate edges into quadratic forms that penalize disagreement across links. Concept → For an undirected graph G = (V, E) with weights w_{uv}, the matrices are: the adjacency matrix A, the degree matrix D, L = D − A, and L_norm = I − D^{-1/2} A D^{-1/2}. L is symmetric positive semi-definite, so it has real eigenvalues 0 = λ_1 ≤ λ_2 ≤ … ≤ λ_n. The trivial eigenvector for L is the all-ones vector 1. For L_norm, the trivial eigenvector is D^{1/2} 1 with eigenvalue 0. The Rayleigh quotient R(x) = (x^T L x)/(x^T x) equals the sum of w_{uv} (x_u − x_v)^2 over edges divided by the sum of x_u^2, quantifying smoothness over edges. By the Courant–Fischer theorem, λ_2 = min over x ⊥ 1 of R(x), making the Fiedler vector any minimizer of this constrained problem. Example → The random-walk matrix P = D^{-1} A has largest eigenvalue 1 with stationary vector π proportional to degrees. The symmetric normalization S = D^{-1/2} A D^{-1/2} is similar to P and shares its eigenvalues; L_norm = I − S, so the smallest eigenvalues of L_norm correspond to the largest eigenvalues of S. This equivalence lets us find clusters by computing dominant eigenvectors of S and then deflating the trivial one.
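For reference, the same objects written out compactly in LaTeX (matching the definitions above):

\begin{aligned}
L &= D - A, \qquad L_{\mathrm{norm}} = I - D^{-1/2} A D^{-1/2} = I - S,\\
x^{\top} L x &= \sum_{(u,v)\in E} w_{uv}\,(x_u - x_v)^2, \qquad
\mathcal{R}(x) = \frac{x^{\top} L x}{x^{\top} x},\\
\lambda_2(L) &= \min_{x \perp \mathbf{1},\; x \neq 0} \mathcal{R}(x) \quad \text{(Courant--Fischer)},\\
P &= D^{-1} A, \qquad S = D^{-1/2} A D^{-1/2}, \qquad \pi_i = \frac{d_i}{\sum_j d_j}.
\end{aligned}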

04 When to Use

Hook → If you need the big picture of a large, noisy graph—Who belongs with whom? Where are the bottlenecks?—spectral methods are a strong first choice. Concept → Use spectral graph theory when you want global structure from local edges. It excels for: (1) clustering/community detection via spectral embeddings (e.g., normalized cut minimization), (2) graph partitioning for load balancing in parallel computing, (3) semi-supervised learning on graphs (label propagation smoothness priors), (4) analyzing random walks and Markov chains (mixing time, PageRank variants), and (5) graph neural networks where filters are functions of L (e.g., Chebyshev or GCN filters). Example → In image segmentation, build a pixel graph with weights based on color/position similarity, compute the first few eigenvectors of L_norm, embed pixels into 2–10 dimensions, and run k-means to separate regions. In recommendation systems, user–item bipartite graphs can be embedded via spectral methods to reveal communities. For GNNs on citation networks, approximate spectral filters g(L) with polynomials in L to get localized convolutions that are efficient and stable.
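As a hedged sketch of point (5): a spectral filter on L_norm approximated by a degree-K Chebyshev polynomial and applied with sparse multiplies only. The coefficients theta are illustrative placeholders, and lambda_max = 2 is used as the standard upper bound on the spectrum of L_norm.

#include <cstdio>
#include <cmath>
#include <vector>

// y = L_norm x for an unweighted graph stored as an adjacency list.
// Assumes no isolated nodes, purely to keep the sketch short.
void multiply_Lnorm(const std::vector<std::vector<int>> &adj,
                    const std::vector<double> &deg,
                    const std::vector<double> &x, std::vector<double> &y) {
    int n = (int)adj.size();
    for (int u = 0; u < n; ++u) {
        double s = 0.0;
        for (int v : adj[u]) s += x[v] / std::sqrt(deg[u] * deg[v]);
        y[u] = x[u] - s;   // (I - D^{-1/2} A D^{-1/2}) x
    }
}

// Apply sum_k theta_k T_k(L_scaled) x, where L_scaled = (2/lambda_max) L_norm - I
// maps the spectrum into [-1, 1] and T_k are Chebyshev polynomials.
std::vector<double> chebyshev_filter(const std::vector<std::vector<int>> &adj,
                                     const std::vector<double> &deg,
                                     const std::vector<double> &x,
                                     const std::vector<double> &theta,
                                     double lambda_max = 2.0) {
    int n = (int)x.size();
    auto scaled_mult = [&](const std::vector<double> &v, std::vector<double> &out) {
        multiply_Lnorm(adj, deg, v, out);
        for (int i = 0; i < n; ++i) out[i] = (2.0 / lambda_max) * out[i] - v[i];
    };
    std::vector<double> Tprev = x, Tcur(n), Tnext(n), out(n, 0.0);
    for (int i = 0; i < n; ++i) out[i] += theta[0] * Tprev[i];          // T_0 = x
    if (theta.size() == 1) return out;
    scaled_mult(Tprev, Tcur);                                           // T_1 = L_scaled x
    for (int i = 0; i < n; ++i) out[i] += theta[1] * Tcur[i];
    for (size_t k = 2; k < theta.size(); ++k) {
        scaled_mult(Tcur, Tnext);                                       // L_scaled T_{k-1}
        for (int i = 0; i < n; ++i) Tnext[i] = 2.0 * Tnext[i] - Tprev[i];   // T_k = 2 L_scaled T_{k-1} - T_{k-2}
        for (int i = 0; i < n; ++i) out[i] += theta[k] * Tnext[i];
        Tprev.swap(Tcur);
        Tcur.swap(Tnext);
    }
    return out;
}

int main() {
    // Toy graph: a triangle {0,1,2} with a pendant node 3 attached to node 2.
    std::vector<std::vector<int>> adj = {{1, 2}, {0, 2}, {0, 1, 3}, {2}};
    std::vector<double> deg = {2.0, 2.0, 3.0, 1.0};

    std::vector<double> x = {1.0, 0.0, 0.0, 0.0};   // impulse signal at node 0
    std::vector<double> theta = {0.5, -0.3, 0.1};   // illustrative filter coefficients

    std::vector<double> y = chebyshev_filter(adj, deg, x, theta);
    for (size_t i = 0; i < y.size(); ++i) std::printf("y[%zu] = %.4f\n", i, y[i]);
    return 0;
}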

⚠️ Common Mistakes

Hook → Spectral tools are powerful, but subtle choices can break them or hide their benefits. Concept → Frequent pitfalls include: (1) using the unnormalized Laplacian on graphs with highly variable degrees, which biases partitions; (2) forgetting to remove isolated nodes or handling them incorrectly; (3) extracting the wrong eigenvector (confusing the top eigenvector of S with the second one, or not deflating the trivial eigenvector); (4) interpreting eigen-gaps without checking for near-multiplicities; (5) running k-means on unnormalized eigenvector rows, which can distort clusters; and (6) treating numerical eigen-solvers as black boxes without monitoring convergence. Example → If you compute the largest eigenvector of S = D^{-1/2} A D^{-1/2} and cluster by its sign, you’ll often get a trivial split aligned with degree because the top eigenvector is proportional to D^{1/2} 1. The fix is to project out this trivial direction and use the next eigenvector(s). Similarly, when doing spectral clustering, always row-normalize the eigenvector matrix before k-means (Ng–Jordan–Weiss variant), and for graphs with isolated nodes, either remove them or assign them as singleton clusters to avoid numerical issues.
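A quick numerical check of pitfall (3), using a small star graph: the top eigenvector of S is exactly sqrt(deg), so clustering on it alone reflects only degree; it must be projected out before looking for structure, as the full examples below do.

#include <cstdio>
#include <cmath>
#include <vector>

int main() {
    // Star graph: node 0 connected to nodes 1..4 (degrees 4, 1, 1, 1, 1).
    std::vector<std::vector<int>> adj = {{1, 2, 3, 4}, {0}, {0}, {0}, {0}};
    std::vector<double> deg = {4, 1, 1, 1, 1};
    int n = 5;

    // u = sqrt(deg) satisfies S u = u, i.e., it is the top eigenvector of S with eigenvalue 1.
    std::vector<double> u(n), Su(n, 0.0);
    for (int i = 0; i < n; ++i) u[i] = std::sqrt(deg[i]);
    for (int i = 0; i < n; ++i)
        for (int j : adj[i]) Su[i] += u[j] / std::sqrt(deg[i] * deg[j]);

    for (int i = 0; i < n; ++i)
        std::printf("node %d: (S u)_i = %.3f, u_i = %.3f\n", i, Su[i], u[i]);
    return 0;
}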

Key Formulas

Combinatorial Laplacian

L = D − A

Explanation: Defines the Laplacian from the degree matrix D and adjacency matrix A. It penalizes differences across edges and is central to spectral analysis.

Normalized Laplacian

L_norm = I − D^{-1/2} A D^{-1/2}

Explanation: Removes degree scaling, making spectral quantities comparable across nodes with different degrees. Preferred for clustering.

Laplacian eigenvalue ordering

0 = λ_1 ≤ λ_2 ≤ … ≤ λ_n

Explanation: L is positive semi-definite, so eigenvalues are nonnegative and ordered. The first is zero with eigenvector 1.

Rayleigh quotient

R(x) = (x^T M x) / (x^T x)

Explanation: Measures how x aligns with M’s eigenvectors. Minimizing or maximizing it over constraints yields eigenvalues.

Fiedler value (variational)

λ_2 = min over x ⊥ 1, x ≠ 0 of (x^T L x) / (x^T x)

Explanation: The algebraic connectivity is the smallest nontrivial Laplacian Rayleigh quotient. Its minimizer is the Fiedler vector.

Symmetric normalization

S = D^{-1/2} A D^{-1/2} = I − L_norm

Explanation: Relates the normalized Laplacian to a symmetric stochastic-like matrix S. Dominant eigenvectors of S correspond to the smallest eigenvectors of L_norm.

Cheeger inequality (graph)

λ_2 / 2 ≤ h_G ≤ sqrt(2 λ_2)

Explanation: Links the best conductance (Cheeger constant h_G) to the normalized Laplacian eigenvalue λ_2. A small λ_2 implies a sparse cut exists, and vice versa up to constants.

Stationary distribution

π_i = d_i / Σ_j d_j

Explanation: For an undirected graph’s random walk, the stationary probability of node i is proportional to its degree.

Mixing by spectral gap

dist(t) ≤ C (1 − γ)^t

Explanation: Convergence to stationarity is geometric with rate controlled by the spectral gap γ of P (or S). Larger gaps mean faster mixing.

Normalized cut objective

Ncut(A, B) = cut(A, B)/vol(A) + cut(A, B)/vol(B)

Explanation: Measures the fraction of edges leaving each side relative to its volume. Minimization is relaxed by eigenvectors.
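A minimal sketch (with illustrative names) that evaluates the Ncut objective for a given 2-way assignment directly from an edge list, showing that the split along a sparse bridge scores much lower than an arbitrary one.

#include <cstdio>
#include <vector>
#include <utility>
#include <limits>

// Ncut(A, B) = cut(A,B)/vol(A) + cut(A,B)/vol(B) for an unweighted, undirected edge list.
// side[i] is 0 or 1; vol(X) counts edge endpoints in X; cut counts edges crossing the split.
double ncut(const std::vector<std::pair<int,int>> &edges, const std::vector<int> &side) {
    double cut = 0.0, vol[2] = {0.0, 0.0};
    for (auto [u, v] : edges) {
        vol[side[u]] += 1.0;
        vol[side[v]] += 1.0;
        if (side[u] != side[v]) cut += 1.0;
    }
    if (vol[0] == 0.0 || vol[1] == 0.0) return std::numeric_limits<double>::infinity();
    return cut / vol[0] + cut / vol[1];
}

int main() {
    // Two triangles joined by one bridge edge; the natural split has a small Ncut.
    std::vector<std::pair<int,int>> edges = {{0,1},{1,2},{0,2},{3,4},{4,5},{3,5},{2,3}};
    std::vector<int> good = {0,0,0,1,1,1};   // split along the bridge
    std::vector<int> bad  = {0,1,0,1,0,1};   // arbitrary split
    std::printf("Ncut(good) = %.3f\n", ncut(edges, good));
    std::printf("Ncut(bad)  = %.3f\n", ncut(edges, bad));
    return 0;
}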

Spectral filtering

g(L) x = U g(Λ) U^T x

Explanation: Applying a graph filter g(L) corresponds to scaling each eigencomponent by g evaluated at its eigenvalue. Used in spectral GNNs.

Relaxed k-cut

min over Y ∈ R^{n×k}, Y^T Y = I of tr(Y^T L_norm Y)

Explanation: The optimal Y spans the k smallest eigenvectors of L_norm. Row-normalizing Y and running k-means yields spectral clustering.

Complexity Analysis

For sparse graphs with n nodes and m edges, the core cost in spectral methods is multiplying a vector by a graph-derived matrix. Multiplying by A, L, or the symmetric normalization S = D^{-1/2} A D^{-1/2} can be implemented in O(m) time using adjacency lists, since each undirected edge contributes a constant number of operations. Power iteration to obtain one dominant eigenpair takes O(T m) time, where T is the number of iterations needed to converge; T typically scales like O(log(1/ε)/gap), where gap is the eigenvalue separation and ε the desired accuracy. To get the second eigenvector of S, we deflate or orthogonalize against the trivial eigenvector, preserving the O(m) per-iteration cost. For k eigenvectors, block power or Lanczos methods cost O(T m k), with an additional O(n k^2) per iteration for orthonormalization. The k-means step in spectral clustering on an n×k embedding costs O(I n k^2), where I is the number of Lloyd iterations and each distance computation touches k coordinates.

Space-wise, storing a sparse graph is O(n + m). Storing k dense eigenvectors adds O(n k). During iterations we maintain a handful of length-n vectors, so peak memory is O(n + m + n k).

In practice, convergence is dominated by the spectral gap: well-separated clusters (large gaps) converge quickly, while near-degenerate clusters require more iterations or more sophisticated solvers (e.g., Lanczos with reorthogonalization). Dense representations (n×n) would cost O(n^2) memory and O(n^2) per multiply and are only suitable for small graphs; sparse implementations are therefore crucial for scalability.
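As an illustrative sketch of that O(m) multiply (separate from the examples below, with assumed field names), the same operation written against a CSR (compressed sparse row) layout, which is how the per-iteration cost is usually realized on large graphs:

#include <cstdio>
#include <cmath>
#include <vector>

// CSR layout: row_ptr has n+1 entries; col_idx lists neighbors of node u in
// positions [row_ptr[u], row_ptr[u+1]). For an unweighted graph, deg[u] is the row length.
struct CSRGraph {
    int n;
    std::vector<int> row_ptr;
    std::vector<int> col_idx;
    std::vector<double> deg;
};

// y = S x with S = D^{-1/2} A D^{-1/2}; each edge endpoint is touched once, so the cost is O(n + m).
void multiply_S_csr(const CSRGraph &g, const std::vector<double> &x, std::vector<double> &y) {
    for (int u = 0; u < g.n; ++u) {
        double acc = 0.0;
        if (g.deg[u] > 0.0) {
            double su = 1.0 / std::sqrt(g.deg[u]);
            for (int p = g.row_ptr[u]; p < g.row_ptr[u + 1]; ++p) {
                int v = g.col_idx[p];
                if (g.deg[v] > 0.0) acc += su * x[v] / std::sqrt(g.deg[v]);
            }
        }
        y[u] = acc;
    }
}

int main() {
    // Triangle graph 0-1-2 in CSR form.
    CSRGraph g;
    g.n = 3;
    g.row_ptr = {0, 2, 4, 6};
    g.col_idx = {1, 2, 0, 2, 0, 1};
    g.deg = {2.0, 2.0, 2.0};
    std::vector<double> x = {1.0, 0.0, 0.0}, y(3, 0.0);
    multiply_S_csr(g, x, y);
    for (int i = 0; i < 3; ++i) std::printf("y[%d] = %.3f\n", i, y[i]);
    return 0;
}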

Code Examples

Compute the Fiedler vector (2-way spectral cut) via power iteration on S = D^{-1/2} A D^{-1/2}
#include <bits/stdc++.h>
using namespace std;

// Undirected, unweighted graph stored as adjacency list
struct Graph {
    int n;                       // number of nodes
    vector<vector<int>> adj;     // adjacency list
    vector<double> deg;          // degrees
    Graph(int n_) : n(n_), adj(n_), deg(n_, 0.0) {}
    void add_edge(int u, int v) {
        if (u == v) return;      // ignore self-loops for simplicity
        adj[u].push_back(v);
        adj[v].push_back(u);
        deg[u] += 1.0;
        deg[v] += 1.0;
    }
};

// Multiply y = S x where S = D^{-1/2} A D^{-1/2} using adjacency list
static inline void multiply_S(const Graph &G, const vector<double> &x, vector<double> &y) {
    int n = G.n;
    fill(y.begin(), y.end(), 0.0);
    for (int u = 0; u < n; ++u) {
        double du = G.deg[u];
        if (du == 0.0) continue;                     // isolated node contributes nothing
        double scale_u = 1.0 / sqrt(du);
        for (int v : G.adj[u]) {
            double dv = G.deg[v];
            if (dv == 0.0) continue;
            double w = scale_u * (1.0 / sqrt(dv));   // 1/sqrt(d_u d_v)
            y[u] += w * x[v];
        }
    }
}

static inline double dot(const vector<double> &a, const vector<double> &b) {
    double s = 0.0; int n = (int)a.size();
    for (int i = 0; i < n; ++i) s += a[i] * b[i];
    return s;
}

static inline double norm2(const vector<double> &a) {
    return sqrt(max(0.0, dot(a, a)));
}

// Project x to be orthogonal to u: x <- x - (x.u / u.u) u
static inline void project_out(vector<double> &x, const vector<double> &u) {
    double uu = max(1e-18, dot(u, u));
    double alpha = dot(x, u) / uu;
    for (size_t i = 0; i < x.size(); ++i) x[i] -= alpha * u[i];
}

// Compute the second largest eigenpair (mu2, v2) of S by deflating the trivial eigenvector u = sqrt(deg)
// Returns estimated mu2 and fills v2 (unit norm). Also returns lambda2(L_norm) = 1 - mu2.
pair<double,double> second_eigen_S(const Graph &G, vector<double> &v2, int iters = 200, double tol = 1e-9) {
    int n = G.n;
    // Trivial eigenvector for S is u = sqrt(deg)
    vector<double> u(n);
    for (int i = 0; i < n; ++i) u[i] = sqrt(G.deg[i]);

    // Initialize v2 randomly and orthogonalize to u
    mt19937 rng(42);
    normal_distribution<double> nd(0.0, 1.0);
    v2.assign(n, 0.0);
    for (int i = 0; i < n; ++i) v2[i] = nd(rng);
    project_out(v2, u);
    double nrm = norm2(v2);
    if (nrm < 1e-12) {
        // If all isolated, return trivial
        return {0.0, 1.0};
    }
    for (int i = 0; i < n; ++i) v2[i] /= nrm;

    vector<double> y(n, 0.0);
    double mu_old = 0.0;
    for (int it = 0; it < iters; ++it) {
        multiply_S(G, v2, y);
        // Deflate: ensure orthogonal to u (remove trivial component)
        project_out(y, u);
        double nrm_y = norm2(y);
        if (nrm_y < 1e-14) break;
        for (int i = 0; i < n; ++i) v2[i] = y[i] / nrm_y;
        // Rayleigh quotient as eigenvalue estimate
        multiply_S(G, v2, y);
        double mu = dot(v2, y);
        if (fabs(mu - mu_old) < tol * max(1.0, fabs(mu_old))) {
            mu_old = mu;
            break;
        }
        mu_old = mu;
    }
    double mu2 = mu_old;          // second largest eigenvalue of S
    double lambda2 = 1.0 - mu2;   // second smallest eigenvalue of L_norm
    return {mu2, lambda2};
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Example: two 5-node cliques connected by a single bridge (barbell-like)
    int n = 10; Graph G(n);
    // Clique 0..4
    for (int i = 0; i < 5; ++i) for (int j = i+1; j < 5; ++j) G.add_edge(i, j);
    // Clique 5..9
    for (int i = 5; i < 10; ++i) for (int j = i+1; j < 10; ++j) G.add_edge(i, j);
    // Weak bridge
    G.add_edge(4, 5);

    vector<double> v2;
    auto [mu2, lambda2] = second_eigen_S(G, v2, 500, 1e-10);

    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Estimated mu2 (S): " << mu2 << "\n";
    cout << "Estimated lambda2 (L_norm): " << lambda2 << "\n";

    // 2-way spectral cut by sign (median threshold for robustness)
    vector<int> idx(n); iota(idx.begin(), idx.end(), 0);
    vector<double> v2_copy = v2;
    nth_element(idx.begin(), idx.begin()+n/2, idx.end(), [&](int a, int b){ return v2_copy[a] < v2_copy[b]; });
    double median = v2_copy[idx[n/2]];

    vector<int> part(n, 0);
    for (int i = 0; i < n; ++i) part[i] = (v2[i] >= median) ? 1 : 0;

    cout << "Partition (node:part):\n";
    for (int i = 0; i < n; ++i) cout << i << ":" << part[i] << (i+1==n ? '\n' : ' ');

    return 0;
}

We construct an undirected graph, implement a sparse multiply by S = D^{-1/2} A D^{-1/2}, and perform power iteration while projecting out the trivial eigenvector u = sqrt(deg). The iteration converges to the second largest eigenvector of S, which corresponds to the Fiedler vector of L_norm. We estimate the eigenvalue with the Rayleigh quotient and then cut the graph into two parts by thresholding the eigenvector at its median.

Time: O(T m), where T is the number of iterations (typically tens to hundreds) and m is the number of edges.
Space: O(n + m) for the graph plus O(n) for working vectors.
Spectral clustering (k-way) using block power iteration and k-means
#include <bits/stdc++.h>
using namespace std;

struct Graph {
    int n; vector<vector<int>> adj; vector<double> deg;
    Graph(int n_) : n(n_), adj(n_), deg(n_, 0.0) {}
    void add_edge(int u, int v) {
        if (u == v) return;
        adj[u].push_back(v); adj[v].push_back(u);
        deg[u] += 1.0; deg[v] += 1.0;
    }
};

static inline void multiply_S(const Graph &G, const vector<double> &x, vector<double> &y) {
    int n = G.n; fill(y.begin(), y.end(), 0.0);
    for (int u = 0; u < n; ++u) {
        double du = G.deg[u]; if (du == 0.0) continue;
        double su = 1.0 / sqrt(du);
        for (int v : G.adj[u]) { double dv = G.deg[v]; if (dv == 0.0) continue; y[u] += su * (1.0 / sqrt(dv)) * x[v]; }
    }
}

// Orthonormalize columns of X using modified Gram-Schmidt
static void mgs_orthonormalize(vector<vector<double>> &X) {
    int n = (int)X.size(); int k = (int)X[0].size();
    for (int j = 0; j < k; ++j) {
        // subtract projections on previous columns
        for (int i = 0; i < j; ++i) {
            double dot_ij = 0.0; for (int r = 0; r < n; ++r) dot_ij += X[r][j] * X[r][i];
            for (int r = 0; r < n; ++r) X[r][j] -= dot_ij * X[r][i];
        }
        double nrm = 0.0; for (int r = 0; r < n; ++r) nrm += X[r][j] * X[r][j];
        nrm = sqrt(max(1e-18, nrm));
        for (int r = 0; r < n; ++r) X[r][j] /= nrm;
    }
}

// Project each column of X to be orthogonal to u
static void deflate_against(vector<vector<double>> &X, const vector<double> &u) {
    double uu = 0.0; for (double val : u) uu += val * val; uu = max(1e-18, uu);
    int n = (int)X.size(); int k = (int)X[0].size();
    for (int j = 0; j < k; ++j) {
        double alpha = 0.0; for (int r = 0; r < n; ++r) alpha += X[r][j] * u[r]; alpha /= uu;
        for (int r = 0; r < n; ++r) X[r][j] -= alpha * u[r];
    }
}

// Block power iteration to get k dominant eigenvectors of S, deflating the trivial one
vector<vector<double>> topk_eigenvectors_S(const Graph &G, int k, int iters = 200) {
    int n = G.n; k = max(1, k);
    vector<double> u(n); for (int i = 0; i < n; ++i) u[i] = sqrt(G.deg[i]);
    // Initialize X with Gaussian random entries
    mt19937 rng(123);
    normal_distribution<double> nd(0.0, 1.0);
    vector<vector<double>> X(n, vector<double>(k));
    for (int i = 0; i < n; ++i) for (int j = 0; j < k; ++j) X[i][j] = nd(rng);
    deflate_against(X, u);
    mgs_orthonormalize(X);

    vector<vector<double>> Y(n, vector<double>(k, 0.0));
    vector<double> xcol(n), ycol(n);
    for (int it = 0; it < iters; ++it) {
        // Y = S X (column by column multiply)
        for (int j = 0; j < k; ++j) {
            for (int r = 0; r < n; ++r) xcol[r] = X[r][j];
            fill(ycol.begin(), ycol.end(), 0.0);
            multiply_S(G, xcol, ycol);
            for (int r = 0; r < n; ++r) Y[r][j] = ycol[r];
        }
        deflate_against(Y, u);
        mgs_orthonormalize(Y);
        X.swap(Y);
    }
    return X; // columns approximate top-k (excluding trivial)
}

// Row-normalize an n x k matrix
static void row_normalize(vector<vector<double>> &M) {
    int n = (int)M.size(); int k = (int)M[0].size();
    for (int i = 0; i < n; ++i) {
        double nrm = 0.0; for (int j = 0; j < k; ++j) nrm += M[i][j] * M[i][j];
        nrm = sqrt(max(1e-18, nrm));
        for (int j = 0; j < k; ++j) M[i][j] /= nrm;
    }
}

// Simple k-means on rows of data (n x d)
vector<int> kmeans(const vector<vector<double>> &data, int k, int iters = 50, int seed = 7) {
    int n = (int)data.size(); int d = (int)data[0].size();
    mt19937 rng(seed); uniform_int_distribution<int> uni(0, n - 1);
    vector<vector<double>> cent(k, vector<double>(d, 0.0));
    for (int j = 0; j < k; ++j) cent[j] = data[uni(rng)];
    vector<int> label(n, 0);
    for (int it = 0; it < iters; ++it) {
        bool changed = false;
        // assign each point to the nearest centroid
        for (int i = 0; i < n; ++i) {
            int best = 0; double bestd = numeric_limits<double>::infinity();
            for (int j = 0; j < k; ++j) {
                double d2 = 0.0; for (int t = 0; t < d; ++t) { double diff = data[i][t] - cent[j][t]; d2 += diff * diff; }
                if (d2 < bestd) { bestd = d2; best = j; }
            }
            if (label[i] != best) { label[i] = best; changed = true; }
        }
        // update centroids
        vector<vector<double>> newc(k, vector<double>(d, 0.0));
        vector<int> cnt(k, 0);
        for (int i = 0; i < n; ++i) { cnt[label[i]]++; for (int t = 0; t < d; ++t) newc[label[i]][t] += data[i][t]; }
        for (int j = 0; j < k; ++j) {
            if (cnt[j] == 0) { newc[j] = data[uni(rng)]; cnt[j] = 1; }   // re-seed empty clusters
            for (int t = 0; t < d; ++t) newc[j][t] /= cnt[j];
        }
        cent.swap(newc);
        if (!changed) break;
    }
    return label;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Build a toy graph with 3 clusters (two cliques and one ring)
    int n = 18; Graph G(n);
    // Clique 0..5
    for (int i = 0; i < 6; ++i) for (int j = i+1; j < 6; ++j) G.add_edge(i, j);
    // Clique 6..11
    for (int i = 6; i < 12; ++i) for (int j = i+1; j < 12; ++j) G.add_edge(i, j);
    // Ring 12..17
    for (int i = 12; i < 18; ++i) G.add_edge(i, 12 + (i-12+1) % 6);
    // Sparse inter-cluster edges
    G.add_edge(2, 7); G.add_edge(10, 14); G.add_edge(4, 12);

    int k = 3;
    auto U = topk_eigenvectors_S(G, k, 200); // n x k columns are eigenvectors (dominant, excluding trivial)

    // Convert to row-major embedding (n x k)
    int nrows = G.n; int kcols = k;
    vector<vector<double>> embed(nrows, vector<double>(kcols, 0.0));
    for (int i = 0; i < nrows; ++i) for (int j = 0; j < kcols; ++j) embed[i][j] = U[i][j];

    // Row-normalize (Ng–Jordan–Weiss)
    row_normalize(embed);

    // Cluster rows with k-means
    vector<int> labels = kmeans(embed, k, 100, 11);

    cout << "Cluster labels (node:cluster):\n";
    for (int i = 0; i < nrows; ++i) cout << i << ":" << labels[i] << (i+1==nrows ? '\n' : ' ');

    return 0;
}

This program finds k dominant eigenvectors of S = D^{-1/2} A D^{-1/2} via block power iteration while deflating the trivial eigenvector. It row-normalizes the eigenvector matrix to form a spectral embedding and then applies k-means to obtain k clusters. This is the normalized spectral clustering algorithm (Ng–Jordan–Weiss variant) implemented with sparse O(m) multiplies.

Time: O(T m k + T n k^2) for eigenvectors (T iterations) plus O(I n k^2) for k-means with I iterations.
Space: O(n + m + n k) for the graph and embedding, plus O(n k) working memory.