How I Study AI - Learn AI Papers & Lectures the Easy Way
Theory · Advanced

Random Matrix Theory in High-Dimensional Statistics

Key Points

  • Random Matrix Theory (RMT) explains how eigenvalues of large random matrices behave when the dimension p is comparable to the sample size n.
  • Classical statistical results (like consistent covariance eigenvalues) often break down when p and n grow together with p/n → γ ∈ (0, ∞).
  • The Marchenko–Pastur law describes the limiting distribution of eigenvalues of sample covariance matrices when p/n → γ.
  • The Wigner semicircle law describes eigenvalues of large symmetric random matrices with independent zero-mean entries.
  • In the spiked covariance model, a strong enough signal creates an outlier eigenvalue that separates from the bulk (the BBP phase transition).
  • Extreme eigenvalues fluctuate at the n^{-2/3} scale and follow Tracy–Widom laws instead of Gaussian limits.
  • RMT guides robust methods such as covariance shrinkage and high-dimensional PCA, and helps explain overfitting in machine learning.
  • Simple C++ simulations can reproduce the bulk behavior (support and histograms) and detect outliers predicted by RMT.

Prerequisites

  • Linear algebra fundamentals — Eigenvalues, eigenvectors, singular values, and symmetric matrices underpin RMT.
  • Probability and random variables — RMT assumes knowledge of distributions, expectations, and independence.
  • Matrix operations and norms — Covariance formation, spectral norms, and conditioning are central.
  • Asymptotic analysis — Understanding limits as n, p → ∞ with p/n → γ is core to RMT.
  • Classical statistical inference — Appreciating what breaks in high dimensions requires the classical baseline.
  • Principal Component Analysis (PCA) — RMT explains the behavior of PCA in high dimensions, distinguishing bulk from outliers.
  • Numerical linear algebra — Power iteration, eigen-solvers, and stability affect simulations and practice.
  • Measure/integration basics — Needed for spectral distributions and densities like MP and semicircle.
  • Gaussian random vectors and Wishart distributions — Many canonical RMT results are derived first in the Gaussian case.
  • Regularization and convex optimization — High-dimensional covariance inversion often requires shrinkage or penalties.

Detailed Explanation


01Overview

Random Matrix Theory (RMT) studies the behavior of matrices with random entries as their size grows. In high-dimensional statistics, the most relevant setting is when the number of variables p is of the same order as the number of observations n (p/n → γ, a positive constant). In this regime, classical statistical intuition—built for fixed p and large n—can fail dramatically: sample covariance eigenvalues spread out, sample principal components become noisy, and naive matrix inversions are unstable or impossible. RMT provides precise laws for these phenomena. The Marchenko–Pastur (MP) law gives the limiting distribution of eigenvalues of sample covariance matrices (also called Wishart matrices), showing that they fill a continuous interval whose endpoints depend on γ. For symmetric noise matrices, the Wigner semicircle law describes how eigenvalues concentrate on a fixed interval. Beyond the bulk, RMT also predicts edge behavior: the largest eigenvalue concentrates near the upper edge and its fluctuations follow the Tracy–Widom distribution. These results are not just mathematical curiosities—they motivate practical tools. For example, in PCA we should not interpret every large sample eigenvalue as meaningful: many are explained by the MP bulk. In covariance estimation, we often shrink eigenvalues toward their average to counteract high-dimensional noise. This overview connects theory to computation with C++ simulations that reproduce the main RMT signatures.

02Intuition & Analogies

Imagine listening to a crowded room: many people talk at once (high p), and you only catch a short recording (finite n). With just a few seconds of audio, you cannot perfectly separate voices. The mixture you hear has structure—but also unavoidable overlap. In data, each variable is like a voice and the sample covariance matrix is like measuring how voices co-vary. When p is large relative to n, random overlap dominates, and the raw measurements blur together. Random Matrix Theory says this blur is not arbitrary—it’s predictable. Think of throwing many pebbles (randomness) into a pond (the matrix). The ripples (eigenvalues) don’t form a sharp spike at the true variance; instead they spread across a band. The width of this band depends on how many pebbles you throw per second—analogous to the aspect ratio γ = p/n. The Marchenko–Pastur law quantifies exactly how wide and how dense this band is. If there’s a strong, single drummer in the room (a low-rank signal or "spike"), their beat can stand out above the ambient chatter. But if the drummer is too soft, their rhythm drowns in the crowd—this is the BBP phase transition where the signal either emerges as an outlier eigenvalue or gets absorbed into the bulk. For symmetric noise matrices (Wigner), think of random bumps on a long road. While each bump is random, the distribution of road heights across long stretches settles into a characteristic hill shape (the semicircle). These intuitive pictures explain why, in high dimensions, noise patterns are structured and why naive interpretations (like treating every large eigenvalue as a true factor) can mislead.

03Formal Definition

Consider a data matrix X ∈ R^{n×p} with independent entries of mean 0 and variance 1 (or sub-Gaussian tails). The sample covariance is S = (1/n) Xᵀ X ∈ R^{p×p}. Let F_S(λ) be its empirical spectral distribution (ESD): the fraction of eigenvalues ≤ λ. In the high-dimensional regime with p/n → γ ∈ (0, ∞), the ESD converges in probability to the Marchenko–Pastur distribution with support [λ₋, λ₊], where λ± = (1 ± √γ)². Its density is f_γ(λ) = √((λ₊ − λ)(λ − λ₋)) / (2πγλ) for λ ∈ [λ₋, λ₊]. For symmetric random matrices W_n with independent entries of mean 0 and variance σ²/n (Wigner-type), the ESD converges to the semicircle law with density ρ(x) = √(4σ² − x²) / (2πσ²) on [−2σ, 2σ]. Edge fluctuations of the largest eigenvalue are governed by the Tracy–Widom distributions at the n^{-2/3} scale. In the spiked covariance model, the population covariance is Σ = I_p + ∑_{k=1}^{r} θ_k v_k v_kᵀ with orthonormal {v_k}. When a spike strength θ exceeds √γ, the top sample eigenvalue separates from the bulk (BBP transition) and asymptotically equals (1 + θ)(1 + γ/θ); otherwise it sticks to the MP edge λ₊. These limits hold broadly under moment conditions (universality).

04When to Use

Use RMT whenever p and n are of the same order, including:

  • Principal Component Analysis (PCA): To decide which components are signal versus noise, compare eigenvalues to the Marchenko–Pastur bulk and look for outliers above λ₊.
  • Covariance Estimation: When p is large relative to n, the sample covariance is ill-conditioned or singular (if p > n). Apply shrinkage (e.g., Ledoit–Wolf) guided by MP behavior to stabilize inverses.
  • High-Dimensional Regression and Classification: Understand double descent, effective rank, and generalization through spectra of design matrices and kernels; spectral norms follow Bai–Yin/MP predictions.
  • Signal Processing and Wireless Communications: MIMO channel capacity and detection rely on eigenvalue spectra of random matrices; the MP law provides the asymptotics.
  • Finance (Portfolio Optimization): Large asset universes with limited historical data lead to noisy covariance estimates; RMT helps filter noise before optimization.
  • Genomics/Neuroscience: Thousands of features with limited samples; RMT-based thresholds prevent overinterpretation of spurious factors.

Avoid relying on classical chi-square or F-distribution approximations for eigenvalues/variances when p/n is not negligible; instead, base decisions on MP bounds, spectral norms, or simulations informed by RMT.

⚠️Common Mistakes

  • Treating sample eigenvalues as unbiased estimates of population eigenvalues when p is not negligible relative to n. In high dimensions, eigenvalues are spread by noise; compare to the MP support before interpreting.
  • Ignoring scaling: forgetting the 1/n factor in S = (1/n) Xᵀ X changes the spectrum drastically. Always standardize variables and use the correct normalization.
  • Using classical inference (e.g., Bartlett’s test, chi-square intervals) without accounting for p/n. These tests assume fixed p and can be wildly anti-conservative when p/n is large.
  • Inverting a singular or ill-conditioned covariance (especially when p ≥ n). Prefer regularization: shrinkage, ridge penalties, or pseudoinverses.
  • Confusing singular values and eigenvalues: for rectangular X, principal components relate to eigenvalues of S = (1/n) Xᵀ X (the squares of singular values of X/√n).
  • Overfitting signals near the MP edge: sample spikes just above λ₊ at finite n can still be noise; check the BBP threshold and consider cross-validation or permutation tests.
  • Numerical pitfalls: naive eigenvalue solvers for large p are O(p³) and can be unstable. Use power iteration for the top eigenvalue or Lanczos/ARPACK in practice; in code demos, keep p modest.
  • Assuming Gaussianity is required: many RMT limits are universal under finite-moment conditions, but heavy tails may break assumptions; consider truncation or robust covariance.

Key Formulas

MP Support

λ± = (1 ± √γ)²

Explanation: These are the endpoints of the Marchenko–Pastur bulk for sample covariance eigenvalues when p/n → γ. Most eigenvalues lie in [λ−​, λ+​].

Marchenko–Pastur Density

f_γ(λ) = √((λ₊ − λ)(λ − λ₋)) / (2πγλ)   for λ ∈ [λ₋, λ₊]

Explanation: This gives the limiting density of eigenvalues for sample covariance matrices. It predicts a continuous bell-like shape over the MP support that depends on γ.

Wigner Semicircle Density

ρ(x) = √(4σ² − x²) / (2πσ²)   for x ∈ [−2σ, 2σ]

Explanation: Eigenvalues of large symmetric random matrices with entries of variance σ²/n follow this density. The support is a fixed interval whose width scales with σ.

Bai–Yin Law (Spectral Norm)

‖X/√n‖ → 1 + √γ   (almost surely)

Explanation: The largest singular value of a standardized rectangular random matrix converges almost surely to 1 + √γ. This characterizes the high-dimensional magnitude of noise.

BBP Transition (Top Eigenvalue)

λ_max(S) ≈ (1 + √γ)²   if θ ≤ √γ;   λ_max(S) ≈ (1 + θ)(1 + γ/θ)   if θ > √γ

Explanation: In the spiked covariance model with a single spike of strength θ, the largest sample eigenvalue either sticks to the MP edge or separates as an outlier depending on θ relative to √γ.

Stieltjes Transform

m(z) = ∫ dF(λ) / (λ − z)

Explanation: A transform of the spectral distribution used to derive and study limiting laws. For the MP law, m(z) satisfies a quadratic equation in m and z.

Tracy–Widom Fluctuations

(λ_max − μ_{n,p}) / σ_{n,p} →d TW₁

Explanation: After appropriate centering μ_{n,p} and scaling σ_{n,p} (of order n^{-2/3}), the largest eigenvalue converges in distribution to the Tracy–Widom law. This governs edge fluctuations.

Sample Covariance

S = (1/n) Xᵀ X

Explanation: Defines the p × p covariance estimator from an n × p data matrix. Its eigenvalues are the squares of singular values of X/√n and follow MP asymptotics.

Empirical Spectral Distribution

ESD_A(λ) = (1/p) ∑_{i=1}^{p} 1{λ_i(A) ≤ λ}

Explanation: The fraction of eigenvalues below λ. RMT describes how this distribution converges as matrix size grows.

Rank of Sample Covariance

rank(S) ≤ min{n, p}

Explanation: When p > n, S is singular and has at least p − n zero eigenvalues. This explains why naive inversion fails in high dimensions.

Complexity Analysis

Generating an n × p random data matrix X with i.i.d. entries requires O(np) time and O(np) space if stored explicitly. Forming the sample covariance S = (1/n) Xᵀ X costs O(n p²) time because we must compute all p(p+1)/2 pairwise inner products of columns (each requiring O(n) operations); the space to store S is O(p²). If p > n, one can equivalently compute the n × n Gram matrix (1/n) X Xᵀ, whose nonzero eigenvalues coincide with those of S, reducing the time to O(n² p). Full eigen-decomposition of a symmetric p × p matrix using dense methods (e.g., QR or Jacobi) takes O(p³) time and O(p²) space, which becomes the bottleneck for large p. When only the largest eigenvalue is needed (e.g., to detect an outlier spike), power iteration or Lanczos methods reduce the time to O(p² T) for T iterations and still require O(p²) space to store S (or O(np) if multiplying by X directly without forming S). For Wigner matrices of size p × p, generation costs O(p²) and dense eigenvalue computations cost O(p³). In practice, to scale beyond p ≈ 10³, one uses sparse or iterative solvers and avoids materializing full covariances by working with matrix–vector products Xᵀ(Xv)/n in O(np) time per iteration. Memory considerations are crucial: storing X uses O(np) doubles, which may be prohibitive for large n, p. For demonstrations, keep p ≤ 200 and n ≤ 2000, or stream rows to compute S incrementally in O(p²) space.

Code Examples

Simulate Marchenko–Pastur bulk: sample covariance eigenvalue histogram
#include <bits/stdc++.h>
using namespace std;

// Generate N(0,1) random numbers
struct RNG {
    mt19937_64 gen;
    normal_distribution<double> nd;
    RNG(uint64_t seed=42): gen(seed), nd(0.0,1.0) {}
    double normal(){ return nd(gen); }
};

// Jacobi eigenvalue algorithm for symmetric matrices (small p, demo only)
void jacobiEigen(vector<vector<double>> A, vector<double>& eigenvalues, int maxSweeps=100, double eps=1e-10){
    int n = (int)A.size();
    for(int sweep=0; sweep<maxSweeps; ++sweep){
        // Find the largest off-diagonal element
        double maxOff = 0.0; int p=0, q=1;
        for(int i=0;i<n;++i){
            for(int j=i+1;j<n;++j){
                if (fabs(A[i][j]) > maxOff){ maxOff = fabs(A[i][j]); p=i; q=j; }
            }
        }
        if (maxOff < eps) break;
        double app = A[p][p], aqq = A[q][q], apq = A[p][q];
        double phi = 0.5 * atan2(2.0*apq, aqq - app);
        double c = cos(phi), s = sin(phi);
        // Rotate A = J^T A J in the p-q plane
        for(int k=0;k<n;++k){
            double aik = A[p][k], aqk = A[q][k];
            A[p][k] = c*aik - s*aqk;
            A[q][k] = s*aik + c*aqk;
        }
        for(int k=0;k<n;++k){
            double kip = A[k][p], kiq = A[k][q];
            A[k][p] = c*kip - s*kiq;
            A[k][q] = s*kip + c*kiq;
        }
        // Zero out explicitly to control roundoff
        A[p][q] = A[q][p] = 0.0;
    }
    eigenvalues.resize(n);
    for(int i=0;i<n;++i) eigenvalues[i] = A[i][i];
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 1000;        // samples
    int p = 200;         // dimension (keep modest for Jacobi)
    int bins = 40;       // histogram bins
    uint64_t seed = 123;

    RNG rng(seed);
    // Generate X (n x p) with N(0,1) entries
    vector<vector<double>> X(n, vector<double>(p));
    for(int i=0;i<n;++i){
        for(int j=0;j<p;++j) X[i][j] = rng.normal();
    }
    // Compute S = (1/n) X^T X (p x p symmetric)
    vector<vector<double>> S(p, vector<double>(p, 0.0));
    for(int i=0;i<p;++i){
        for(int j=i;j<p;++j){
            long double sum = 0.0;
            for(int k=0;k<n;++k) sum += (long double)X[k][i] * (long double)X[k][j];
            double val = (double)(sum / n);
            S[i][j] = S[j][i] = val;
        }
    }
    // Compute eigenvalues of S
    vector<double> evals;
    jacobiEigen(S, evals, 120, 1e-9);

    // Build histogram
    double mn = *min_element(evals.begin(), evals.end());
    double mx = *max_element(evals.begin(), evals.end());
    vector<int> hist(bins, 0);
    for(double e : evals){
        int b = (int)((e - mn) / (mx - mn + 1e-12) * bins);
        if (b == bins) b = bins - 1;
        if (b >= 0 && b < bins) hist[b]++;
    }

    // Theoretical MP support
    double gamma = (double)p / (double)n;
    double lminus = pow(1.0 - sqrt(gamma), 2.0);
    double lplus  = pow(1.0 + sqrt(gamma), 2.0);

    cout << fixed << setprecision(4);
    cout << "p = " << p << ", n = " << n << ", gamma = " << gamma << "\n";
    cout << "Empirical eigenvalue min/max: [" << mn << ", " << mx << "]\n";
    cout << "MP support: [" << lminus << ", " << lplus << "]\n\n";

    cout << "Histogram (" << bins << " bins) of eigenvalues:\n";
    for(int i=0;i<bins;++i){
        double left  = mn + (mx - mn) * i / bins;
        double right = mn + (mx - mn) * (i+1) / bins;
        cout << "[" << left << ", " << right << "): " << string(hist[i], '*') << "\n";
    }
    return 0;
}

This program draws an n × p Gaussian data matrix, forms the sample covariance S = (1/n) X^T X, computes all eigenvalues with a simple Jacobi method, and prints a histogram. It also reports the theoretical Marchenko–Pastur support [(1−√γ)^2, (1+√γ)^2]. For moderate p, the empirical min/max and the bulk of the histogram should align with the MP bounds.

Time: O(n p² + p³) for covariance formation and Jacobi eigen-decomposition
Space: O(n p + p²)
Spiked covariance and BBP phase transition: detect an outlier eigenvalue
#include <bits/stdc++.h>
using namespace std;

struct RNG {
    mt19937_64 gen;
    normal_distribution<double> nd;
    RNG(uint64_t seed=7): gen(seed), nd(0.0,1.0) {}
    double normal(){ return nd(gen); }
};

// Power iteration for the largest eigenvalue of a symmetric matrix A (p x p)
double powerIteration(const vector<vector<double>>& A, int maxIter=1000, double tol=1e-8){
    int p = (int)A.size();
    vector<double> v(p), w(p);
    // Random initialization
    mt19937_64 gen(123);
    uniform_real_distribution<double> ur(-1.0, 1.0);
    for(int i=0;i<p;++i) v[i] = ur(gen);
    // Normalize
    auto norm = [&](const vector<double>& x){ double s=0; for(double xi: x) s+=xi*xi; return sqrt(s); };
    double nv = norm(v);
    for(double& x: v) x /= (nv + 1e-18);
    double lambda_old = 0.0;
    for(int it=0; it<maxIter; ++it){
        // w = A v
        fill(w.begin(), w.end(), 0.0);
        for(int i=0;i<p;++i){
            double sum = 0.0;
            for(int j=0;j<p;++j) sum += A[i][j] * v[j];
            w[i] = sum;
        }
        double nv2 = norm(w);
        for(int i=0;i<p;++i) v[i] = w[i] / (nv2 + 1e-18);
        // Rayleigh quotient lambda = v^T A v
        double lambda = 0.0;
        for(int i=0;i<p;++i){
            double s=0.0;
            for(int j=0;j<p;++j) s += v[j]*A[j][i];
            lambda += v[i]*s;
        }
        if (fabs(lambda - lambda_old) < tol * max(1.0, fabs(lambda_old))) return lambda;
        lambda_old = lambda;
    }
    return lambda_old;
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 1500;        // samples
    int p = 500;         // dimension (use power iteration only)
    double theta = 0.9;  // spike strength (try values below/above sqrt(gamma))
    uint64_t seed = 2024;

    RNG rng(seed);
    double gamma = (double)p / (double)n;

    // Random unit vector v (spike direction)
    vector<double> v(p);
    double s2 = 0.0;
    for(int i=0;i<p;++i){ v[i] = rng.normal(); s2 += v[i]*v[i]; }
    for(int i=0;i<p;++i) v[i] /= sqrt(s2 + 1e-18);

    // Build R = I + (sqrt(1+theta) - 1) v v^T so that R R^T = I + theta v v^T
    double alpha = sqrt(1.0 + theta) - 1.0;

    // Generate X = Z R where Z has i.i.d. N(0,1) entries
    vector<vector<double>> X(n, vector<double>(p, 0.0));
    for(int i=0;i<n;++i){
        // Draw z
        vector<double> z(p);
        for(int j=0;j<p;++j) z[j] = rng.normal();
        // y = z R = z + alpha * (z . v) * v
        double zdv = 0.0;
        for(int j=0;j<p;++j) zdv += z[j]*v[j];
        for(int j=0;j<p;++j) X[i][j] = z[j] + alpha * zdv * v[j];
    }

    // Compute sample covariance S = (1/n) X^T X (p x p)
    vector<vector<double>> S(p, vector<double>(p, 0.0));
    for(int i=0;i<p;++i){
        for(int j=i;j<p;++j){
            long double sum = 0.0;
            for(int k=0;k<n;++k) sum += (long double)X[k][i] * (long double)X[k][j];
            double val = (double)(sum / n);
            S[i][j] = S[j][i] = val;
        }
    }

    // Largest eigenvalue via power iteration
    double lambda_max = powerIteration(S, 500, 1e-7);

    // Theoretical prediction
    double lplus = pow(1.0 + sqrt(gamma), 2.0);
    double predicted = (theta <= sqrt(gamma)) ? lplus : (1.0 + theta) * (1.0 + gamma / theta);

    cout << fixed << setprecision(6);
    cout << "p = " << p << ", n = " << n << ", gamma = " << gamma << "\n";
    cout << "Spike theta = " << theta << ", sqrt(gamma) = " << sqrt(gamma) << "\n";
    cout << "Empirical top eigenvalue: " << lambda_max << "\n";
    cout << "Predicted (BBP/MP): " << predicted << "\n";
    if (theta > sqrt(gamma)) cout << "Outlier expected (separated from bulk).\n";
    else cout << "No outlier expected (sticks to MP edge).\n";
    return 0;
}

This program simulates the rank-1 spiked covariance model Σ = I + θ v v^T by generating X = Z Σ^{1/2} with Σ^{1/2} constructed in closed form. It computes S = (1/n) X^T X and estimates the largest eigenvalue via power iteration. The result is compared to the BBP prediction: either the MP edge (if θ ≤ √γ) or the outlier formula (1+θ)(1+γ/θ).

Time: O(n p²) to form S and O(p² T) for power iteration (T iterations)
Space: O(n p + p²)
Wigner semicircle simulation: eigenvalue histogram of a GOE-scaled matrix
#include <bits/stdc++.h>
using namespace std;

struct RNG {
    mt19937_64 gen;
    normal_distribution<double> nd;
    RNG(uint64_t seed=99): gen(seed), nd(0.0,1.0) {}
    double normal(){ return nd(gen); }
};

void jacobiEigen(vector<vector<double>> A, vector<double>& eigenvalues, int maxSweeps=120, double eps=1e-10){
    int n = (int)A.size();
    for(int sweep=0; sweep<maxSweeps; ++sweep){
        double maxOff = 0.0; int p=0, q=1;
        for(int i=0;i<n;++i){
            for(int j=i+1;j<n;++j){
                if (fabs(A[i][j]) > maxOff){ maxOff = fabs(A[i][j]); p=i; q=j; }
            }
        }
        if (maxOff < eps) break;
        double app = A[p][p], aqq = A[q][q], apq = A[p][q];
        double phi = 0.5 * atan2(2.0*apq, aqq - app);
        double c = cos(phi), s = sin(phi);
        for(int k=0;k<n;++k){
            double aik = A[p][k], aqk = A[q][k];
            A[p][k] = c*aik - s*aqk;
            A[q][k] = s*aik + c*aqk;
        }
        for(int k=0;k<n;++k){
            double kip = A[k][p], kiq = A[k][q];
            A[k][p] = c*kip - s*kiq;
            A[k][q] = s*kip + c*kiq;
        }
        A[p][q] = A[q][p] = 0.0;
    }
    eigenvalues.resize(n);
    for(int i=0;i<n;++i) eigenvalues[i] = A[i][i];
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int p = 120;   // matrix size (keep modest)
    int bins = 40;
    uint64_t seed = 2025;

    RNG rng(seed);
    // Build GOE-scaled symmetric matrix: off-diagonal N(0, 1/p), diagonal N(0, 2/p)
    vector<vector<double>> A(p, vector<double>(p, 0.0));
    double scale = 1.0 / sqrt((double)p);
    for(int i=0;i<p;++i){
        for(int j=i+1;j<p;++j){
            double x = rng.normal() * scale;   // variance 1/p
            A[i][j] = A[j][i] = x;
        }
        A[i][i] = rng.normal() * sqrt(2.0) * scale;   // variance 2/p
    }

    vector<double> evals;
    jacobiEigen(A, evals, 150, 1e-10);

    double mn = *min_element(evals.begin(), evals.end());
    double mx = *max_element(evals.begin(), evals.end());
    vector<int> hist(bins, 0);
    for(double e : evals){
        int b = (int)((e - mn) / (mx - mn + 1e-12) * bins);
        if (b == bins) b = bins - 1;
        if (b >= 0 && b < bins) hist[b]++;
    }

    cout << fixed << setprecision(4);
    cout << "Wigner (GOE-scaled) matrix size p = " << p << "\n";
    cout << "Empirical eigenvalue min/max: [" << mn << ", " << mx << "]\n";
    cout << "Semicircle support expected: [" << -2.0 << ", " << 2.0 << "]\n\n";
    cout << "Histogram (" << bins << " bins):\n";
    for(int i=0;i<bins;++i){
        double left  = mn + (mx - mn) * i / bins;
        double right = mn + (mx - mn) * (i+1) / bins;
        cout << "[" << left << ", " << right << "): " << string(hist[i], '*') << "\n";
    }
    return 0;
}

This demo generates a p × p symmetric random matrix with GOE scaling (variance 1/p off-diagonal, 2/p on the diagonal), computes its eigenvalues with Jacobi, and prints a histogram. The empirical spectrum should lie close to [−2, 2] and resemble a semicircle.

Time: O(p² + p³) for generation and Jacobi eigen-decomposition
Space: O(p²)
#random matrix theory#marchenko-pastur#wigner semicircle#spiked covariance#bbp transition#tracy-widom#high-dimensional statistics#sample covariance#eigenvalue spectrum#pca#wishart matrix#spectral norm#universality#shrinkage#asymptotics