Spectral Convolution on Graphs
Key Points
- Spectral convolution on graphs generalizes the classical notion of convolution using the graph's Laplacian eigenvectors as "Fourier" basis functions.
- The Graph Fourier Transform (GFT) projects a node signal onto Laplacian eigenvectors, and filtering multiplies each Fourier coefficient by a frequency response g(λ).
- Exact spectral convolution y = U g(Λ) U^T x is accurate but can be expensive because it needs an eigen-decomposition of the Laplacian.
- Polynomial and Chebyshev approximations implement spectral filters without eigen-decomposition using repeated sparse matrix–vector products.
- Graph smoothing, denoising, diffusion, and semi-supervised learning benefit from low-pass spectral filters that attenuate high Laplacian eigenvalues.
- For large graphs, use localized polynomial filters (e.g., Chebyshev or GCN first-order) to keep time O(Km), where K is the filter order and m the number of edges.
- Normalization choices (unnormalized, symmetric, or random-walk Laplacian) affect numerical stability and the meaning of "frequency."
- Always sort eigenpairs, scale the Laplacian correctly into [−1, 1] for Chebyshev filtering, and prefer sparse operations on large graphs to avoid O(n^2) costs.
Prerequisites
- Linear algebra (eigenvalues, eigenvectors, orthogonality) — Spectral convolution relies on the Laplacian eigen-decomposition and orthonormal bases.
- Graph basics (adjacency, degree, Laplacian) — Understanding how the Laplacian encodes connectivity is essential for defining graph frequencies.
- Signal processing (Fourier transform, filtering) — Spectral filtering on graphs generalizes classical frequency-domain multiplication.
- Numerical methods (iterative solvers) — Estimating λ_max and scalable filtering use power iteration and sparse operations.
- Polynomial approximation (Chebyshev polynomials) — Approximates spectral filters efficiently without eigen-decomposition.
- Sparse matrix–vector multiplication — Efficient implementation on large graphs requires O(m) SpMV operations.
Detailed Explanation
01 Overview
Spectral convolution on graphs extends the idea of convolution from regular domains (like time or 2D images) to irregular graph domains. On a line or a grid, convolution is defined via the Fourier transform: you convert a signal to frequencies, multiply by a frequency response (a filter), and transform back. On graphs, there is no natural shift operator, but the graph Laplacian—a matrix derived from the adjacency structure—plays an analogous role. Its eigenvectors form an orthonormal basis that acts like “graph Fourier modes,” and its eigenvalues behave like frequencies: small eigenvalues correspond to smooth variations over the graph, while large eigenvalues capture rapid changes across edges. The Graph Fourier Transform (GFT) uses Laplacian eigenvectors to map a node signal into the spectral domain. Spectral convolution then multiplies each spectral component by a function g(λ) and transforms back, yielding y = U g(Λ) U^T x, where L = UΛU^T is the Laplacian eigen-decomposition. While exact spectral methods require computing eigenvectors (costly for large graphs), practical implementations approximate g(·) using polynomials (e.g., Chebyshev), enabling fast filtering through repeated sparse matrix–vector multiplications. This approach underpins efficient graph neural networks like ChebNet and the first-order GCN layer, which can be interpreted as specific low-order spectral filters. The result is a principled, flexible framework for processing signals on graphs, with applications in denoising, diffusion, semi-supervised learning, and beyond.
02 Intuition & Analogies
Imagine plucking a guitar string: the shape of the vibration can be decomposed into pure tones (sine waves) with different frequencies. Filtering (like a tone control) adjusts how strong each frequency is. On a graph, we don’t have a straight string—nodes connect in an irregular web—so sine waves don’t fit. Instead, the graph’s Laplacian matrix encodes how connected each pair of nodes is. Its eigenvectors are like the “standing waves” that fit this web. When a signal lives on nodes (e.g., a temperature at each city), projecting it onto these eigenvectors gives us its “graph frequencies.” If neighboring nodes have similar values, the signal is smooth and mostly uses low frequencies (small eigenvalues). Sharp changes across edges (like an abrupt boundary) require high frequencies (large eigenvalues). Filtering on the graph is then like turning a frequency knob: we pick a response g(λ) that says how much to keep or suppress each frequency. A low-pass filter keeps smooth trends and removes noise; a high-pass filter highlights boundaries. The exact way to do this is to transform to the spectral domain, multiply, and transform back—just like with audio signals. However, computing all the “standing waves” (eigenvectors) can be slow for big graphs. Fortunately, many useful filters can be written as polynomials of the Laplacian. That’s like saying, instead of opening the device and tweaking frequencies directly, we can simulate the effect by repeatedly passing information along edges a small number of times (the polynomial’s degree). This keeps operations local and fast, which is why modern graph neural networks often use polynomial (or first-order) approximations.
03 Formal Definition
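Stated compactly, for an undirected graph with degree matrix D, adjacency matrix A, and a signal x ∈ R^n on the nodes:

```latex
% Laplacian and its eigen-decomposition (orthonormal U, nonnegative frequencies)
L = D - A, \qquad L = U \Lambda U^{\top}, \qquad
\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n), \quad
0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n

% Graph Fourier Transform and its inverse
\hat{x} = U^{\top} x, \qquad x = U \hat{x}

% Spectral convolution with frequency response g
y = g(L)\, x = U\, g(\Lambda)\, U^{\top} x, \qquad
g(\Lambda) = \operatorname{diag}\!\big(g(\lambda_1), \dots, g(\lambda_n)\big)
```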
04 When to Use
Use exact spectral convolution (via eigen-decomposition) when the graph is small to medium (hundreds to a few thousands of nodes), when precise frequency control is needed, or for teaching and diagnostics. It is also useful when you need explicit spectral quantities such as eigenvalues for interpretability or for designing filters like heat kernels g(λ) = e^{−tλ}. Use polynomial or Chebyshev approximations for large, sparse graphs where full eigen-decomposition is infeasible: these achieve near-linear time in the number of edges for moderate polynomial degree K and localize computation to K-hop neighborhoods. For semi-supervised node classification, node regression, smoothing/denoising, and diffusion-based tasks, low-pass filters are effective because they encourage label/feature smoothness across edges. Graph signal reconstruction and anomaly detection can benefit from band-pass or high-pass filters depending on whether you want to isolate communities or sharp discontinuities. In graph neural networks, use first-order approximations (e.g., GCN) for scalability and simplicity, or higher-order Chebyshev filters (ChebNet) for greater spectral flexibility and larger receptive fields without overfitting to global eigenvectors.
⚠️ Common Mistakes
- Mixing Laplacian variants without adjusting formulas. L = D − A, L_{sym} = I − D^{−1/2} A D^{−1/2}, and L_{rw} = I − D^{−1} A behave differently; using the wrong one can skew frequency interpretations or break symmetry assumptions.
- Forgetting to sort eigenpairs. Eigenvalues/eigenvectors returned by numerical routines are often unordered; applying g(λ) to mismatched eigenvectors yields incorrect filtering.
- Ignoring scaling for Chebyshev. Chebyshev polynomials require eigenvalues scaled into [−1, 1]; failing to rescale L by λ_{max} causes divergence or artifacts.
- Using dense matrices on large sparse graphs. Building dense A or L inflates time and memory to O(n^2); prefer sparse adjacency with O(m) storage and SpMV operations.
- Over-aggressive filters. Very sharp ideal filters can amplify numerical noise; smooth frequency responses (e.g., heat kernels) are more stable.
- Not adding self-loops in GCN-style models. The common renormalization with Â = A + I and D̂^{−1/2} Â D̂^{−1/2} is crucial for stability and including self-information.
- Applying symmetric formulas to directed graphs. For directed graphs, L may be nonsymmetric; the standard orthonormal eigenbasis assumption breaks and requires different tools (e.g., Hermitian symmetrization or magnetic Laplacians).
- Misinterpreting smoothness. The quadratic form x^T L x measures variation across edges; small values imply smoothness relative to weights, not necessarily globally flat signals.
Key Formulas
Unnormalized Laplacian
L = D − A
Explanation: The Laplacian is degree minus adjacency. It measures how a signal deviates from its neighbors and is symmetric positive semidefinite for undirected graphs.
Symmetric Normalized Laplacian
L_sym = I − D^{−1/2} A D^{−1/2}
Explanation: This normalization balances degrees and keeps the operator symmetric. It is numerically stable and widely used in learning on graphs.
Random-Walk Laplacian
L_rw = I − D^{−1} A
Explanation: This form relates directly to random-walk dynamics. It is not symmetric, but it is similar (in the matrix sense) to L_sym and is often used for diffusion analysis.
Spectral Decomposition
L = U Λ U^T, with U^T U = I
Explanation: A symmetric Laplacian diagonalizes via an orthonormal eigenbasis. The diagonal entries of Λ are nonnegative and represent graph frequencies.
Graph Fourier Transform
x̂ = U^T x, x = U x̂
Explanation: Project a vertex-domain signal x onto eigenvectors to obtain spectral coefficients, then reconstruct by a linear combination of eigenvectors.
Spectral Convolution
y = U g(Λ) U^T x
Explanation: Filtering multiplies each spectral component by g(λ). This generalizes the convolution theorem from classical signal processing to graphs.
Dirichlet Energy
x^T L x = Σ_{(i,j)∈E} w_ij (x_i − x_j)^2 = Σ_i λ_i x̂_i^2
Explanation: The Laplacian quadratic form measures signal variation across edges. In the spectral domain, energy weights coefficients by their frequencies.
Heat Kernel / Diffusion
g(λ) = e^{−tλ}
Explanation: The heat filter smooths signals by diffusing values along edges for time t. It suppresses high frequencies exponentially.
Chebyshev Recurrence
T_0(s) = 1, T_1(s) = s, T_k(s) = 2s T_{k−1}(s) − T_{k−2}(s)
Explanation: Chebyshev polynomials form an efficient basis for approximating functions on [−1, 1]. They enable fast computation of g(L)x via recurrence.
Chebyshev Approximation of Filters
g(L) x ≈ Σ_{k=0}^{K} c_k T_k(L̃) x, where L̃ = (2/λ_max) L − I
Explanation: Scale L so its spectrum lies in [−1, 1], then approximate g using Chebyshev polynomials. This avoids eigen-decomposition and uses only SpMV operations.
Power Iteration (Largest Eigenvalue)
v_{t+1} = L v_t / ‖L v_t‖, λ_max ≈ v^T L v
Explanation: Iteratively applying L and normalizing converges to the dominant eigenvector. The Rayleigh quotient estimates λ_max.
GCN First-Order Propagation
H′ = D̂^{−1/2} Â D̂^{−1/2} H W, with Â = A + I
Explanation: Adding self-loops and symmetric normalization yields a stable first-order spectral filter. Each layer mixes a node with its neighbors and itself.
Complexity Analysis
Exact spectral convolution requires a full eigen-decomposition, which costs O(n^3) time and O(n^2) memory for n nodes, so it is practical only for small to medium graphs. A degree-K polynomial (Chebyshev) filter needs only K sparse matrix–vector products, costing O(Km) time and O(n + m) memory for m edges; the GCN first-order layer is the K = 1 special case.
Code Examples
```cpp
#include <bits/stdc++.h>
using namespace std;

struct WeightedEdge { int u, v; double w; };

// Build dense adjacency and unnormalized Laplacian L = D - A (symmetric, undirected)
vector<vector<double>> buildLaplacian(int n, const vector<WeightedEdge>& edges) {
    vector<vector<double>> A(n, vector<double>(n, 0.0));
    vector<double> deg(n, 0.0);
    for (auto &e : edges) {
        if (e.u == e.v) continue; // ignore self-loops here
        A[e.u][e.v] += e.w;
        A[e.v][e.u] += e.w;
        deg[e.u] += e.w;
        deg[e.v] += e.w;
    }
    vector<vector<double>> L(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) {
        L[i][i] = deg[i];
        for (int j = 0; j < n; ++j) if (i != j) L[i][j] = -A[i][j];
    }
    return L;
}

// Jacobi eigen-decomposition for symmetric matrices: A = V * diag(d) * V^T
// Returns eigenvalues d (ascending) and eigenvectors V (columns)
void jacobiEigenDecomposition(vector<vector<double>> A, vector<double>& d, vector<vector<double>>& V,
                              int maxSweeps = 100, double tol = 1e-10) {
    int n = (int)A.size();
    V.assign(n, vector<double>(n, 0.0));
    for (int i = 0; i < n; ++i) V[i][i] = 1.0;

    auto maxOffDiag = [&](int &p, int &q) {
        double m = 0.0; p = 0; q = 1;
        for (int i = 0; i < n; ++i)
            for (int j = i + 1; j < n; ++j)
                if (fabs(A[i][j]) > m) { m = fabs(A[i][j]); p = i; q = j; }
        return m;
    };

    for (int sweep = 0; sweep < maxSweeps; ++sweep) {
        int p, q; double m = maxOffDiag(p, q);
        if (m < tol) break;
        double app = A[p][p], aqq = A[q][q], apq = A[p][q];
        double tau = (aqq - app) / (2.0 * apq);
        double t = ((tau >= 0) ? 1.0 : -1.0) / (fabs(tau) + sqrt(1.0 + tau * tau));
        double c = 1.0 / sqrt(1.0 + t * t);
        double s = t * c;
        // Rotate A
        for (int k = 0; k < n; ++k) if (k != p && k != q) {
            double aik = A[p][k], akq = A[q][k];
            A[p][k] = A[k][p] = c * aik - s * akq;
            A[q][k] = A[k][q] = s * aik + c * akq;
        }
        double app_new = c*c*app - 2*s*c*apq + s*s*aqq;
        double aqq_new = s*s*app + 2*s*c*apq + c*c*aqq;
        A[p][p] = app_new; A[q][q] = aqq_new; A[p][q] = A[q][p] = 0.0;
        // Rotate V
        for (int k = 0; k < n; ++k) {
            double vip = V[k][p], viq = V[k][q];
            V[k][p] = c*vip - s*viq;
            V[k][q] = s*vip + c*viq;
        }
    }
    // Extract eigenvalues and sort ascending
    d.resize(n);
    for (int i = 0; i < n; ++i) d[i] = A[i][i];
    vector<int> idx(n); iota(idx.begin(), idx.end(), 0);
    sort(idx.begin(), idx.end(), [&](int i, int j){ return d[i] < d[j]; });
    vector<double> d_sorted(n);
    vector<vector<double>> V_sorted(n, vector<double>(n));
    for (int r = 0; r < n; ++r) {
        d_sorted[r] = d[idx[r]];
        for (int i = 0; i < n; ++i) V_sorted[i][r] = V[i][idx[r]];
    }
    d.swap(d_sorted); V.swap(V_sorted);
}

// Exact spectral filtering: y = U g(Lambda) U^T x
vector<double> multiplyByUGLambdaUTx(const vector<vector<double>>& U, const vector<double>& lambda,
                                     const vector<double>& x, function<double(double)> g) {
    int n = (int)U.size();
    // x_hat = U^T x
    vector<double> xhat(n, 0.0);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) xhat[i] += U[j][i] * x[j];
    // Apply filter in the spectral domain
    for (int i = 0; i < n; ++i) xhat[i] *= g(lambda[i]);
    // y = U xhat
    vector<double> y(n, 0.0);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) y[i] += U[i][j] * xhat[j];
    return y;
}

int main() {
    // Small demo graph (path of 5 nodes)
    int n = 5;
    vector<WeightedEdge> edges;
    auto add = [&](int u, int v){ edges.push_back({u, v, 1.0}); };
    add(0,1); add(1,2); add(2,3); add(3,4);

    auto L = buildLaplacian(n, edges);

    // Eigen-decomposition of L
    vector<double> lambda; vector<vector<double>> U;
    jacobiEigenDecomposition(L, lambda, U);

    // Define a heat kernel filter g(lambda) = exp(-t * lambda)
    double t = 1.0;
    auto g = [&](double lam){ return exp(-t * lam); };

    // A noisy signal on nodes
    vector<double> x = {1.0, 0.0, 1.0, 0.0, 1.0};

    // Filter via spectral convolution y = U g(Lambda) U^T x
    vector<double> y = multiplyByUGLambdaUTx(U, lambda, x, g);

    cout << fixed << setprecision(6);
    cout << "Eigenvalues (ascending):\n";
    for (double lam : lambda) cout << lam << " "; cout << "\n\n";

    cout << "Input x:\n"; for (double xi : x) cout << xi << " "; cout << "\n";
    cout << "Filtered y (heat kernel t=1):\n"; for (double yi : y) cout << yi << " "; cout << "\n";

    return 0;
}
```
This program constructs the unnormalized Laplacian of a small graph, computes its full eigen-decomposition with the Jacobi method, and performs exact spectral filtering y = U g(Λ) U^T x using a heat kernel g(λ) = exp(−tλ). The eigenvalues are printed (interpretable as graph frequencies), and the input signal is smoothed as high frequencies are attenuated. The Jacobi algorithm is simple and robust for small symmetric matrices, making it suitable for demonstration but not for large graphs.
```cpp
#include <bits/stdc++.h>
using namespace std;

struct SparseGraph {
    int n; vector<vector<pair<int,double>>> adj; // undirected
    explicit SparseGraph(int n): n(n), adj(n) {}
    void addEdge(int u, int v, double w = 1.0) {
        if (u == v) { adj[u].push_back({v, w}); return; }
        adj[u].push_back({v, w}); adj[v].push_back({u, w});
    }
};

// Multiply by unnormalized Laplacian: y = L x = D x - A x (O(m))
vector<double> laplacianTimesVec(const SparseGraph& G, const vector<double>& x) {
    int n = G.n; vector<double> y(n, 0.0); vector<double> deg(n, 0.0);
    for (int u = 0; u < n; ++u) for (auto [v, w] : G.adj[u]) deg[u] += w;
    for (int i = 0; i < n; ++i) y[i] = deg[i] * x[i];
    for (int u = 0; u < n; ++u) for (auto [v, w] : G.adj[u]) y[u] -= w * x[v];
    return y;
}

// Power iteration to estimate the largest eigenvalue of L (lambda_max)
double estimateLambdaMax(const SparseGraph& G, int iters = 50) {
    int n = G.n; vector<double> v(n), y(n);
    mt19937 rng(42); uniform_real_distribution<double> dist(0.0, 1.0);
    for (int i = 0; i < n; ++i) v[i] = dist(rng);
    auto norm = [&](const vector<double>& a){ double s = 0; for (double z : a) s += z*z; return sqrt(max(s, 1e-30)); };
    double ray = 0.0;
    for (int t = 0; t < iters; ++t) {
        y = laplacianTimesVec(G, v);
        double nv = norm(y);
        for (int i = 0; i < n; ++i) v[i] = y[i] / nv;
        // Rayleigh quotient v^T L v / v^T v (v is normalized)
        vector<double> Lv = laplacianTimesVec(G, v);
        double num = 0.0, den = 0.0;
        for (int i = 0; i < n; ++i) { num += v[i]*Lv[i]; den += v[i]*v[i]; }
        ray = num / max(den, 1e-30);
    }
    return max(ray, 1e-12);
}

// Multiply by the scaled operator: \tilde{L} = (2/lmax) L - I
vector<double> scaledLaplacianTimesVec(const SparseGraph& G, const vector<double>& x, double lmax) {
    vector<double> Lx = laplacianTimesVec(G, x);
    int n = (int)x.size();
    vector<double> y(n);
    for (int i = 0; i < n; ++i) y[i] = (2.0 / lmax) * Lx[i] - x[i];
    return y;
}

// Chebyshev filter: y = sum_{k=0}^K c_k T_k(\tilde{L}) x
vector<double> chebyshevFilter(const SparseGraph& G, const vector<double>& x, const vector<double>& c) {
    int n = G.n; int K = (int)c.size() - 1;
    double lmax = estimateLambdaMax(G);
    vector<double> t0 = x; // T_0(\tilde{L}) x = x
    vector<double> y(n, 0.0);
    for (int i = 0; i < n; ++i) y[i] += c[0] * t0[i];
    if (K == 0) return y;
    vector<double> t1 = scaledLaplacianTimesVec(G, x, lmax); // T_1(\tilde{L}) x = \tilde{L} x
    for (int i = 0; i < n; ++i) y[i] += c[1] * t1[i];
    for (int k = 2; k <= K; ++k) {
        // T_k = 2 \tilde{L} T_{k-1} - T_{k-2}
        vector<double> Lt1 = scaledLaplacianTimesVec(G, t1, lmax);
        vector<double> tk(n);
        for (int i = 0; i < n; ++i) tk[i] = 2.0 * Lt1[i] - t0[i];
        for (int i = 0; i < n; ++i) y[i] += c[k] * tk[i];
        t0.swap(t1); t1.swap(tk);
    }
    return y;
}

int main() {
    // Build a sparse ring graph of 8 nodes
    int n = 8; SparseGraph G(n);
    for (int i = 0; i < n; ++i) {
        int j = (i + 1) % n; G.addEdge(i, j, 1.0);
    }
    // Sample signal with alternating noise
    vector<double> x(n);
    for (int i = 0; i < n; ++i) x[i] = (i % 2 == 0 ? 1.0 : 0.0) + 0.2 * ((i % 3) - 1);

    // Choose Chebyshev coefficients (low-pass-ish smoothing)
    // Example small-degree filter; in practice, compute c_k from the desired g(λ)
    vector<double> c = {0.8, -0.5, 0.2, -0.05}; // K = 3

    vector<double> y = chebyshevFilter(G, x, c);

    cout << fixed << setprecision(6);
    cout << "Input x:\n"; for (double v : x) cout << v << ' '; cout << "\n";
    cout << "Filtered y (Chebyshev K=3):\n"; for (double v : y) cout << v << ' '; cout << "\n";
    return 0;
}
```
This example implements a fast Chebyshev polynomial filter that approximates y ≈ g(L)x without computing eigenvectors. It estimates λ_max via power iteration, scales L so its spectrum is in [−1,1], and then uses the Chebyshev recurrence. The filter coefficients c_k can be chosen to approximate a desired response (e.g., heat kernel). The runtime is linear in the number of edges times the polynomial order, which is practical for large sparse graphs.
```cpp
#include <bits/stdc++.h>
using namespace std;

struct Graph { int n; vector<vector<int>> adj; explicit Graph(int n): n(n), adj(n) {} };

void addUndirectedEdge(Graph& G, int u, int v) {
    if (u == v) return;
    G.adj[u].push_back(v); G.adj[v].push_back(u);
}

// Single GCN propagation: H = \hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2} X, then Z = H W
vector<vector<double>> gcnLayer(const Graph& G, const vector<vector<double>>& X, const vector<vector<double>>& W) {
    int n = G.n; int F = (int)X[0].size(); int Fout = (int)W[0].size();
    // Build adjacency with self-loops: \hat{A} = A + I
    vector<vector<int>> adj = G.adj;
    for (int i = 0; i < n; ++i) adj[i].push_back(i);
    vector<double> deg(n, 0.0);
    for (int i = 0; i < n; ++i) deg[i] = (double)adj[i].size();
    // H = \hat{D}^{-1/2} \hat{A} \hat{D}^{-1/2} X
    vector<vector<double>> H(n, vector<double>(F, 0.0));
    for (int u = 0; u < n; ++u) {
        for (int v : adj[u]) {
            double w = 1.0 / sqrt(deg[u] * deg[v]);
            for (int f = 0; f < F; ++f) H[u][f] += w * X[v][f];
        }
    }
    // Z = H W
    vector<vector<double>> Z(n, vector<double>(Fout, 0.0));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < Fout; ++j) {
            double s = 0.0;
            for (int f = 0; f < F; ++f) s += H[i][f] * W[f][j];
            Z[i][j] = s; // typically followed by a nonlinearity
        }
    }
    return Z;
}

int main() {
    // Small graph (triangle + tail)
    Graph G(5);
    addUndirectedEdge(G,0,1); addUndirectedEdge(G,1,2); addUndirectedEdge(G,2,0); // triangle
    addUndirectedEdge(G,2,3); addUndirectedEdge(G,3,4); // tail

    // Node features X (n x F)
    vector<vector<double>> X = {
        {1.0, 0.0}, {0.5, 0.2}, {0.0, 1.0}, {0.2, 0.1}, {0.0, 0.5}
    };

    // Weight matrix W (F x Fout)
    vector<vector<double>> W = {
        {0.7, -0.3},
        {0.2, 0.9}
    };

    auto Z = gcnLayer(G, X, W);

    cout << fixed << setprecision(6);
    cout << "Z after one GCN layer:\n";
    for (auto &row : Z) {
        for (double v : row) cout << v << ' ';
        cout << '\n';
    }
    return 0;
}
```
This code implements a single GCN-style propagation step, which is a first-order spectral approximation using symmetric normalization with added self-loops. It mixes each node’s features with its neighbors’ features and then applies a learnable linear transformation W. This corresponds to a low-pass spectral filter and is efficient for large sparse graphs.