Wigner Semicircle Law
Key Points
- The Wigner Semicircle Law says that the histogram of eigenvalues of large random symmetric matrices converges to a semicircle-shaped curve.
- To see the law numerically, you must scale entries by 1/√n; without this scaling the eigenvalues blow up and the limit fails.
- The limiting density on [−2, 2] is ρ(x) = (1/(2π))√(4 − x²), which integrates to 1 and has mean 0.
- All odd moments of the limit are 0, and the 2k-th moment equals the k-th Catalan number C_k = (1/(k+1))·(2k choose k); this can be checked via traces of powers.
- The result is universal: it holds for many entry distributions (e.g., Gaussian or Rademacher) as long as entries are independent, centered, and have the right variance.
- A simple Jacobi eigenvalue solver (roughly O(n³)) lets you simulate the histogram in C++ for moderate n (e.g., n ≈ 50–200).
- You can also verify the law by computing empirical moments (1/n)·Tr(W^k) and comparing them to Catalan numbers.
- Finite-size effects are real: for small n you need multiple trials and careful binning to see the semicircle shape emerge.
Prerequisites
- Linear algebra basics (eigenvalues, eigenvectors, trace) — The law describes distributions of eigenvalues and uses trace identities like Tr(A^k).
- Probability theory (expectation, variance, law of large numbers) — Understanding independence, variance scaling, and convergence in distribution is crucial.
- Measure-theoretic convergence (weak convergence) — Formal statement uses weak convergence of empirical spectral distributions.
- Combinatorics (Catalan numbers, binomial coefficients) — Moment computations and pairings are counted by Catalan numbers.
- Numerical linear algebra (eigenvalue algorithms) — Simulations require computing eigenvalues efficiently and stably.
- Big-O notation and algorithm analysis — To assess the runtime of eigenvalue and matrix power computations.
- C++ programming (vectors, random numbers) — Implementing simulations, random sampling, and basic matrix operations.
Detailed Explanation
01 Overview
The Wigner Semicircle Law is a cornerstone result in random matrix theory describing the global distribution of eigenvalues of large symmetric (or Hermitian) random matrices. Consider an n×n matrix W_n with independent, identically distributed entries above the diagonal, mirrored to be symmetric, mean zero, and variance scaled like 1/n. As n grows, the empirical histogram of eigenvalues stabilizes and approaches a deterministic curved shape: a semicircle supported on a finite interval. Specifically, if off-diagonal entries have variance 1/n, the eigenvalues concentrate on [−2, 2] and their density approaches (1/(2π))√(4 − x²). This convergence is robust: it does not depend on the exact distribution of the entries (Gaussian, ±1, etc.), provided some mild moment conditions hold; this is called universality.
Practically, the law explains why spectra of many large complex systems (networks, noise matrices, coupled oscillators) have a predictable bulk shape. It also provides asymptotic formulas for traces of powers, spectral norms, and resolvents that are useful in statistics, physics, and theoretical computer science. You can verify the law computationally by generating random symmetric matrices with entries scaled by 1/√n, computing their eigenvalues, and making a histogram; even for moderate sizes (n ≈ 50–200), the semicircle becomes apparent after a few trials.
02 Intuition & Analogies
Imagine placing many tiny weights (eigenvalues) on a line, where their positions come from a complex balancing act of all the random entries in the matrix. If the entries are not scaled, the total effect grows with n and the weights spread out indefinitely. But if we scale each entry by 1/√n, the collective influence of n terms on each row/column remains O(1), keeping the weights in a bounded region. Now think of repeatedly combining many small, independent, centered effects: by the law of large numbers and central limit–type phenomena, their aggregate behaves predictably. For symmetric matrices, the way these effects intertwine across rows and columns creates strong cancellations and symmetries in powers of the matrix.
A concrete analogy: picture a large crowd shuffling left and right on a stage. Each person’s step is random but small, and because steps are symmetric and scaled, the crowd never drifts off-stage; instead, they settle into a consistent shape when viewed from afar. That shape is the semicircle. The crowd’s tallest point is at the center (x = 0), reflecting that most eigenvalues cluster near zero; near the edges (±2), fewer eigenvalues appear, so the density tapers to zero like a square root.
Another lens is via moments: the average of x^{k} across all eigenvalues equals (1/n) Tr(W^{k}). When you expand Tr(W^{k}), only certain index “pairings” survive in expectation as n grows, and the number of these valid pairings equals the Catalan numbers. This elegant combinatorial count is why the even moments match Catalan numbers and the odd moments vanish, producing exactly the semicircle law.
03 Formal Definition
Let W_n be an n×n real symmetric random matrix whose entries {W_n(i, j) : i ≤ j} are independent and centered, with E[W_n(i, j)²] = σ²/n for i < j and diagonal variance O(1/n). Write λ_1 ≤ … ≤ λ_n for its eigenvalues and let μ_n = (1/n)·Σᵢ δ_{λ_i} be the empirical spectral distribution. The Wigner Semicircle Law states that, almost surely, μ_n converges weakly as n → ∞ to the deterministic measure with density ρ_σ(x) = (1/(2πσ²))·√(4σ² − x²) supported on [−2σ, 2σ]. Mild uniform moment conditions on the entries (e.g., bounded fourth moments) suffice; Gaussian and Rademacher entries are special cases.
04 When to Use
- Modeling “pure noise” symmetric interactions: If your matrix entries are independent, centered, and scaled like 1/\sqrt{n}, the bulk eigenvalue distribution is predicted by the semicircle law. Examples include GOE (Gaussian Orthogonal Ensemble), random graph adjacency matrices after centering and scaling, and noise in signal processing.
- Sanity checks and baselines: In high-dimensional data analysis, compare an observed eigenvalue histogram to the semicircle to detect structure; strong deviations suggest signal beyond pure noise.
- Algorithm testing: Use Wigner matrices to benchmark eigen-solvers; their spectra have a known limiting shape and edge behavior, which makes them useful regression tests.
- Theoretical derivations: When you need asymptotics of traces, resolvents, or norms of random symmetric matrices (e.g., estimating spectral radius |W_n| \to 2\sigma), the law provides the leading-order behavior.
- Universality exploration: To argue that certain spectral statistics don’t depend on the exact entry distribution, start from the semicircle law as the global limiting distribution and then consider refinements (local laws, edge universality) when needed.
⚠️ Common Mistakes
- Forgetting 1/\sqrt{n} scaling: Without this normalization, eigenvalues grow like \sqrt{n} and the histogram does not stabilize; you will not see a semicircle.
- Not symmetrizing the matrix: Generating an i.i.d. matrix and skipping A = (A + A^{T})/2 (or mirroring entries) breaks the assumptions; the spectrum then follows different laws (e.g., circular law for non-Hermitian matrices).
- Overfitting small n: For n ≤ 30, a single histogram can look noisy; run multiple trials, average histograms, or increase n to reduce variance.
- Diagonal variance worries: The exact variance of diagonal entries is asymptotically irrelevant (as long as it’s O(1/n)), but for small n using a wildly different scale on the diagonal can distort the histogram.
- Numerical issues in eigen-solvers: Homegrown eigenvalue algorithms (e.g., Jacobi) can be slow (O(n^{3})) and sensitive to tolerances; for large n, use well-tested libraries (Eigen, LAPACK) and avoid tight tolerances that inflate runtime.
- Misinterpreting moments: Comparing (1/n) Tr(W^{k}) to Catalan numbers requires the correct scaling and averaging; odd moments should be near 0 but won’t be exactly 0 for finite samples.
Key Formulas
Semicircle Density
ρ_σ(x) = (1/(2πσ²))·√(4σ² − x²) for |x| ≤ 2σ, and 0 otherwise
Explanation: This is the limiting probability density of eigenvalues for Wigner matrices with variance parameter σ²/n. For σ = 1 it is supported on [−2, 2] and integrates to 1.
Empirical Spectral Distribution
μ_n = (1/n)·Σᵢ δ_{λ_i(W_n)}
Explanation: The ESD places equal mass 1/n at each eigenvalue. As n grows, this measure converges weakly to the semicircle distribution.
Stieltjes Transform of Semicircle
m(z) = ∫ ρ(x)/(x − z) dx = (−z + √(z² − 4))/2 (for σ = 1)
Explanation: The Stieltjes transform characterizes the measure and solves a quadratic equation. The branch of the square root is chosen so that Im m(z) > 0 when Im z > 0.
Self-Consistent Equation
m(z) = 1/(−z − m(z)), equivalently m(z)² + z·m(z) + 1 = 0
Explanation: The Stieltjes transform of the limiting ESD satisfies this fixed-point equation. It is central to the proof via resolvents.
Moments of Semicircle
m_{2k} = C_k·σ^{2k}, m_{2k+1} = 0
Explanation: Even moments equal Catalan numbers times σ^{2k}, and all odd moments vanish due to symmetry. This follows from the moment method.
Catalan Numbers
C_k = (1/(k+1))·(2k choose k), giving 1, 1, 2, 5, 14, 42, … for k = 0, 1, 2, …
Explanation: Catalan numbers count noncrossing pairings, which are exactly the terms that survive in the trace expansion of Wigner matrix powers.
Moment Convergence
(1/n)·Tr(W_n^k) → m_k almost surely as n → ∞
Explanation: Normalized traces of powers converge almost surely to the moments of the semicircle law. This implies weak convergence of the ESD.
Spectral Edge
‖W_n‖ = maxᵢ |λ_i(W_n)| → 2σ
Explanation: The largest absolute eigenvalue (spectral norm) converges in probability to 2σ (= 2 for σ = 1), matching the support of the semicircle density.
Wigner Scaling
W_n(i, j) = X_{ij}/√n, with E[X_{ij}] = 0 and E[X_{ij}²] = σ²
Explanation: Entries are centered and scaled by 1/√n so that row/column variances remain O(1) and the spectrum has bounded support.
Complexity Analysis
- Jacobi eigenvalue solver: each Givens rotation updates two rows/columns in O(n), but the max-pivot search costs O(n²) per rotation and roughly O(n²) rotations are needed, so the simple implementation below behaves like O(n³)–O(n⁴); this is why it suits only moderate n (library solvers achieve O(n³) with far smaller constants).
- Moment method: forming W^k by naive multiplication costs O(n³) per power, so estimating moments up to Kmax over T trials costs O(T·Kmax·n³).
- Memory: storing a dense n×n matrix takes O(n²), which dominates the space usage of both programs.
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Simple symmetric Jacobi eigenvalue solver (eigenvalues only)
struct SymmetricMatrix {
    int n;
    vector<double> a; // row-major
    SymmetricMatrix(int n_ = 0) : n(n_), a((size_t)n_ * n_, 0.0) {}
    double& operator()(int i, int j) { return a[(size_t)i * n + j]; }
    double operator()(int i, int j) const { return a[(size_t)i * n + j]; }
};

static inline double sign_double(double x) { return (x >= 0.0 ? 1.0 : -1.0); }

// Classical Jacobi: each rotation zeroes the currently largest off-diagonal
// entry. Roughly O(n^2) rotations are needed, so the iteration cap scales
// with n^2 (a small fixed cap would stop long before convergence).
vector<double> jacobiEigenvalues(SymmetricMatrix A, double tol = 1e-10) {
    int n = A.n;
    long max_rotations = 20L * n * n;
    for (long iter = 0; iter < max_rotations; ++iter) {
        // Find largest off-diagonal element in absolute value
        int p = 0, q = 1;
        double maxOff = 0.0;
        for (int i = 0; i < n; ++i) {
            for (int j = i + 1; j < n; ++j) {
                double v = fabs(A(i, j));
                if (v > maxOff) { maxOff = v; p = i; q = j; }
            }
        }
        if (maxOff < tol) break; // converged

        double app = A(p, p), aqq = A(q, q), apq = A(p, q);
        double tau = (aqq - app) / (2.0 * apq);
        double t = sign_double(tau) / (fabs(tau) + sqrt(1.0 + tau * tau));
        double c = 1.0 / sqrt(1.0 + t * t);
        double s = t * c;
        A(p, p) = app - t * apq;
        A(q, q) = aqq + t * apq;
        A(p, q) = A(q, p) = 0.0;
        // Update other entries in rows/cols p and q
        for (int r = 0; r < n; ++r) if (r != p && r != q) {
            double arp = A(r, p), arq = A(r, q);
            A(r, p) = A(p, r) = c * arp - s * arq;
            A(r, q) = A(q, r) = s * arp + c * arq;
        }
    }
    vector<double> d(n);
    for (int i = 0; i < n; ++i) d[i] = A(i, i);
    sort(d.begin(), d.end());
    return d;
}

// Build a Wigner matrix with variance sigma^2/n. Two entry types: gaussian or rademacher.
SymmetricMatrix makeWigner(int n, double sigma = 1.0, const string& type = "gaussian", uint64_t seed = 42) {
    SymmetricMatrix A(n);
    mt19937_64 rng(seed);
    normal_distribution<double> normal(0.0, 1.0);
    uniform_real_distribution<double> uni(0.0, 1.0);
    auto draw_entry = [&](bool diagonal) {
        if (type == "gaussian") {
            // GOE-style: off-diagonal ~ N(0, sigma^2/n), diagonal ~ N(0, 2 sigma^2/n)
            double s = diagonal ? sigma * sqrt(2.0) : sigma;
            return (s * normal(rng)) / sqrt((double)n);
        } else { // rademacher +-1 with appropriate scaling, diagonal has sqrt(2) factor
            double r = (uni(rng) < 0.5 ? -1.0 : 1.0);
            double s = diagonal ? sigma * sqrt(2.0) : sigma;
            return (s * r) / sqrt((double)n);
        }
    };
    for (int i = 0; i < n; ++i) {
        A(i, i) = draw_entry(true);
        for (int j = i + 1; j < n; ++j) {
            double v = draw_entry(false);
            A(i, j) = A(j, i) = v;
        }
    }
    return A;
}

// Theoretical semicircle density for sigma=1 (general sigma handled by scaling argument)
double semicircle_density(double x, double sigma = 1.0) {
    double R = 2.0 * sigma;
    if (x < -R || x > R) return 0.0;
    return (1.0 / (2.0 * M_PI * sigma * sigma)) * sqrt(max(0.0, R * R - x * x));
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 80;               // matrix size
    int bins = 40;            // histogram bins
    string dist = "gaussian"; // or "rademacher"
    uint64_t seed = 12345;    // change for different trials

    SymmetricMatrix A = makeWigner(n, 1.0, dist, seed);
    vector<double> evals = jacobiEigenvalues(A, 1e-10);

    // Build histogram on [-2.5, 2.5]
    double L = -2.5, U = 2.5;
    double bw = (U - L) / bins;
    vector<int> counts(bins, 0);
    for (double lam : evals) {
        if (lam < L || lam >= U) continue;
        int b = (int)floor((lam - L) / bw);
        if (0 <= b && b < bins) counts[b]++;
    }

    cout << fixed << setprecision(4);
    cout << "BinCenter EmpProb TheoProb (approx)\n";
    for (int b = 0; b < bins; ++b) {
        double a = L + b * bw;
        double c = a + 0.5 * bw; // midpoint
        double emp_prob = (double)counts[b] / (double)n;
        double theo_prob = semicircle_density(c) * bw; // midpoint rule
        cout << setw(8) << c << " " << setw(7) << emp_prob << " " << setw(7) << theo_prob << "\n";
    }

    // Compute L1 distance between histogram and theory (rough measure)
    double L1 = 0.0;
    for (int b = 0; b < bins; ++b) {
        double c = L + (b + 0.5) * bw;
        double emp_prob = (double)counts[b] / (double)n;
        double theo_prob = semicircle_density(c) * bw;
        L1 += fabs(emp_prob - theo_prob);
    }
    cout << "\nApproximate L1 distance between histogram and semicircle: " << L1 << "\n";
    return 0;
}
This program generates a Wigner matrix with entries scaled by 1/\sqrt{n} (Gaussian or Rademacher), computes all eigenvalues using the Jacobi method, and builds a histogram on [−2.5, 2.5]. It prints the empirical probability per bin and the theoretical semicircle probability mass estimated by the midpoint rule. The rough L1 distance summarizes how close the histogram is to the semicircle. For n≈80 and a few different seeds, you should see the bulk match the semicircle and fall to zero near ±2.
#include <bits/stdc++.h>
using namespace std;

struct Matrix {
    int n;
    vector<long double> a; // row-major
    Matrix(int n_ = 0, long double val = 0) : n(n_), a((size_t)n_ * n_, val) {}
    long double& operator()(int i, int j) { return a[(size_t)i * n + j]; }
    long double operator()(int i, int j) const { return a[(size_t)i * n + j]; }
};

Matrix multiply(const Matrix& A, const Matrix& B) {
    int n = A.n;
    Matrix C(n, 0.0L);
    for (int i = 0; i < n; ++i) {
        for (int k = 0; k < n; ++k) {
            long double aik = A(i, k);
            if (aik == 0.0L) continue;
            for (int j = 0; j < n; ++j) {
                C(i, j) += aik * B(k, j);
            }
        }
    }
    return C;
}

long double trace(const Matrix& A) {
    long double s = 0.0L;
    int n = A.n;
    for (int i = 0; i < n; ++i) s += A(i, i);
    return s;
}

Matrix makeWignerLD(int n, long double sigma, const string& type, uint64_t seed) {
    Matrix A(n, 0.0L);
    mt19937_64 rng(seed);
    normal_distribution<long double> normal(0.0L, 1.0L);
    uniform_real_distribution<long double> uni(0.0L, 1.0L);
    auto draw_entry = [&](bool diagonal) {
        long double s = diagonal ? sigma * sqrtl(2.0L) : sigma;
        if (type == "gaussian") {
            return s * normal(rng) / sqrtl((long double)n);
        } else {
            long double r = (uni(rng) < 0.5L ? -1.0L : 1.0L);
            return s * r / sqrtl((long double)n);
        }
    };
    for (int i = 0; i < n; ++i) {
        A(i, i) = draw_entry(true);
        for (int j = i + 1; j < n; ++j) {
            long double v = draw_entry(false);
            A(i, j) = A(j, i) = v;
        }
    }
    return A;
}

long double catalan(int k) { // returns C_k as long double
    // Stable multiplicative formula: C_k = prod_{i=2..k} (k+i)/i
    if (k == 0) return 1.0L;
    long double c = 1.0L;
    for (int i = 2; i <= k; ++i) c *= (long double)(k + i) / (long double)i;
    return c;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 50;                 // matrix size
    int T = 30;                 // number of trials to average
    int Kmax = 8;               // highest power to test
    string dist = "rademacher"; // try "gaussian" as well

    vector<long double> avg(Kmax + 1, 0.0L);

    for (int t = 0; t < T; ++t) {
        uint64_t seed = 777 + 911ULL * t;
        Matrix W = makeWignerLD(n, 1.0L, dist, seed);
        Matrix P(n, 0.0L); // P = I
        for (int i = 0; i < n; ++i) P(i, i) = 1.0L;
        // k=0 moment is 1 by definition (trace(I)/n)
        avg[0] += 1.0L;
        for (int k = 1; k <= Kmax; ++k) {
            P = multiply(P, W); // P = W^k
            long double mk = trace(P) / (long double)n;
            avg[k] += mk;
        }
    }
    for (int k = 0; k <= Kmax; ++k) avg[k] /= (long double)T;

    cout << fixed << setprecision(6);
    cout << "k Empirical_mk Theoretical_mk\n";
    for (int k = 0; k <= Kmax; ++k) {
        long double theo = (k % 2 == 1 ? 0.0L : catalan(k / 2));
        cout << setw(2) << k << " " << setw(14) << (double)avg[k] << " " << setw(14) << (double)theo << "\n";
    }
    return 0;
}
This program estimates moments m_k = (1/n) Tr(W^k) by averaging over T Wigner matrices, using naive matrix multiplication to form powers. For \sigma=1, odd moments should be near 0 and even moments near the Catalan numbers (1, 1, 2, 5, 14, …). With n≈50, Kmax=8, and T≈30, you should see reasonable agreement despite finite-sample fluctuations.