
Free Probability Theory

Key Points

  • Free probability studies "random variables" that do not commute, where independence is replaced by freeness and noncrossing combinatorics replaces classical partitions.
  • The R-transform linearizes free additive convolution: adding free variables corresponds to adding their R-transforms.
  • Semicircle and free Poisson (Marchenko–Pastur) laws are the central examples; their free cumulants are simple and drive many computations.
  • Moments and free cumulants are related by noncrossing partitions; this enables efficient dynamic-programming conversions in code.
  • Large independent random matrices (under mild invariance and scaling) become asymptotically free, explaining why free convolution predicts spectra of matrix sums.
  • You can compute free additive convolution numerically by converting moments to free cumulants, adding them, and converting back.
  • Catalan numbers count noncrossing partitions and thus index many moments in free probability (e.g., semicircle moments).
  • C++ implementations can simulate random matrices to observe semicircle laws and implement series-based algorithms to compute free convolutions.

Prerequisites

  • Linear algebra (matrices, trace, eigenvalues) — Random matrices live in matrix algebras; traces approximate expectations and spectra relate to distributions.
  • Basic probability and moments — Understanding moments, distributions, and classical independence helps contrast with freeness.
  • Generating functions and power series — R-, S-, and moment series are manipulated formally to compute free convolutions.
  • Combinatorics of partitions — Noncrossing partitions govern the moment–cumulant relations in free probability.
  • Asymptotic analysis — Freeness for matrices is asymptotic; recognizing scaling limits and rates is important.
  • Big-O notation — To analyze algorithmic costs of series conversions and simulations.
  • Complex analysis (light) — Cauchy transforms and functional equations provide analytic characterizations of laws.
  • C++ programming (vectors, loops, numeric stability) — Required to implement series manipulations and random-matrix simulations.

Detailed Explanation


01 Overview

Free probability is a framework for studying randomness when variables do not commute, as in matrices or operators. Classical probability assumes that XY = YX and models independence using product measures and classical cumulants. In contrast, free probability replaces independence with freeness, a combinatorial condition on centered, alternating products coming from different subalgebras. This change leads to new convolution operations (free additive and multiplicative convolution) that describe the distributions of sums and products of free variables.

A central discovery is that many large random matrix ensembles behave as though their normalized limits are free: when matrix size grows, mixed trace moments factor as predicted by freeness. This explains empirical spectral laws like Wigner's semicircle (for sums of independent Wigner matrices) and Marchenko–Pastur (for sample covariance matrices).

Computationally, free probability is powered by transforms: the R-transform adds under free convolution, and the S-transform multiplies under free multiplicative convolution. Moments and free cumulants are bridged by noncrossing partitions, counted by Catalan numbers. With a few series manipulations, you can implement practical routines to compute the first several moments of a free sum/product, useful for approximating spectra. The theory elegantly marries operator algebras, combinatorics, and random matrices, offering both conceptual insight and concrete numerical tools.

02 Intuition & Analogies

Imagine two people telling a story by alternating sentences. In classical independence, whether sentence A comes before B or vice versa does not matter; their combined effect is symmetric because multiplication commutes. In the matrix world, order matters—AB is not BA—so the way we alternate voices becomes crucial. Freeness is the rule that if each speaker says something with zero average, then any story formed by alternating their sentences has zero overall average. It captures a strong "no-interference" pattern in alternating order.

Another intuition comes from drawing arcs over points on a line: pair up or group points without crossing the arcs. These are noncrossing partitions. In classical probability, all partitions matter when relating moments to cumulants. In free probability, only the crossings-free groupings count. This shift from all partitions to noncrossing partitions is the combinatorial fingerprint of freeness.

Random matrices provide a physical picture. Take two large, independent, nicely distributed matrices A and B (e.g., Wigner ensembles) and scale them properly. Although they are not strictly free at any finite size, as the size grows, their mixed trace moments behave as if A and B were free. So the spectrum of A+B is predicted by free additive convolution, which you can compute from simple series. The semicircle law emerges because, in the long run, only the noncrossing ways of pairing factors survive when you expand traces like Tr((A+B)^k). Crossings cancel out or become negligible, much like noisy interference patterns averaging to zero.

03 Formal Definition

A noncommutative probability space is a pair (A, φ) where A is a unital algebra over C and φ: A → C is a unital, positive, tracial linear functional (think of φ as expectation; for n Ɨ n matrices, φ(X) = (1/n) Tr(X)). Subalgebras (A_i)_{i∈I} are free if for any m ≄ 1 and a_j ∈ A_{i_j} with φ(a_j) = 0 and consecutive indices distinct (i_1 ≠ i_2, i_2 ≠ i_3, ..., i_{māˆ’1} ≠ i_m), we have φ(a_1 a_2 ⋯ a_m) = 0. The distribution of X ∈ A is the linear functional on polynomials p ↦ φ(p(X)); its moments are m_n = φ(X^n). Free cumulants (k_n) are defined via the moment–cumulant relation over noncrossing partitions: m_n = Ī£_{Ļ€ ∈ NC(n)} āˆ_{B ∈ Ļ€} k_{|B|}. The R-transform is R_X(z) = Ī£_{n≄1} k_n z^{nāˆ’1}. It linearizes free additive convolution: for free X, Y, R_{X+Y}(z) = R_X(z) + R_Y(z). For positive free variables, the S-transform satisfies S_{XY}(z) = S_X(z) S_Y(z). Asymptotic freeness: sequences of independent random matrices (A_n, B_n, ...) with appropriate invariance and scaling are asymptotically free with respect to the normalized trace, meaning mixed moments converge to those predicted by freeness. This justifies using free convolution to predict limiting spectra of large matrix sums and products.

04 When to Use

  • Predict spectra of large independent random matrices: If A_n and B_n are independent Wigner or unitarily invariant ensembles, the empirical eigenvalue distribution of A_n+B_n is approximated by the free additive convolution of their limits.
  • Signal processing and wireless: The Marchenko–Pastur (free Poisson) law models sample covariance matrices; free multiplicative convolution describes products with channel or precoding effects.
  • Deconvolution: If you measure a sum X+Y and know Y’s distribution, free deconvolution (subtracting R-transforms) can recover X’s law approximately from moment data.
  • Fast spectral prototyping: When exact eigen-computation is expensive, use first 10–20 moments via free transforms to sketch densities (e.g., via moment-matching or orthogonal polynomials).
  • Theoretical proofs: Use noncrossing combinatorics and free cumulants to prove limit laws (semicircle, Marchenko–Pastur) and universality under weak moment assumptions.
  • Avoid: small matrices where asymptotic freeness fails; strongly dependent matrices (e.g., A and a polynomial in A); or non-invariant ensembles where the assumptions behind asymptotic freeness are violated without careful justification.

āš ļøCommon Mistakes

  • Confusing classical independence with freeness: For free variables, centered alternating products vanish; classical cumulant additivity does not hold. Always use R-transform for free sums.
  • Ignoring normalization in random matrices: Wigner matrices require entry variance scaling like 1/n to get a nontrivial limit; otherwise, spectra blow up or collapse.
  • Misusing the S-transform: It is defined for variables with φ(X) ≠ 0; applying it when φ(X) = 0 causes singularities. Check assumptions before multiplying S-transforms.
  • Expecting exact finite-n behavior: Freeness is an asymptotic phenomenon for matrices; at small n, deviations are normal. Compare moments (Tr(X^k)/n) rather than raw eigenvalues for more stable evidence.
  • Overlooking noncrossing nature: Using all set partitions (classical cumulants) instead of noncrossing partitions leads to wrong moment reconstructions. Use algorithms that respect noncrossing combinatorics (Catalan growth).
  • Numerical instability in series: High-order moments may explode; use long double and truncate sensibly. Validate with known laws (semicircle, free Poisson) to catch implementation errors.

Key Formulas

Free moment–cumulant formula

m_n = Ī£_{Ļ€ ∈ NC(n)} āˆ_{B ∈ Ļ€} k_{|B|}

Explanation: The n-th moment equals a sum over noncrossing partitions Ļ€ of {1,...,n}, multiplying free cumulants indexed by the block sizes of Ļ€. It replaces the classical sum over all partitions and is the backbone of many computations.

R-transform additivity

R_{X+Y}(z) = R_X(z) + R_Y(z)

Explanation: For free X and Y, the R-transform of the sum is the sum of R-transforms. This linearization makes free additive convolution computationally simple.

S-transform multiplicativity

S_{XY}(z) = S_X(z) S_Y(z)

Explanation: For positive free variables X and Y, the S-transform of the product equals the product of S-transforms. It is the multiplicative counterpart to the R-transform.

Catalan numbers

C_n = (1/(n+1)) Ā· binom(2n, n)

Explanation: Catalan numbers count noncrossing partitions and many Dyck-path structures. In free probability they index even moments of the semicircle law.

Semicircle moments

m_{2n}^{semi} = C_n σ^{2n},  m_{2n+1}^{semi} = 0

Explanation: Even moments of a centered semicircle distribution with variance σ² equal Catalan numbers times σ^{2n}. Odd moments vanish by symmetry.

R-transform of semicircle

R_{semi}(z) = σ² z

Explanation: Only the second free cumulant is nonzero for the semicircle law. Hence its R-transform is linear with slope equal to the variance.

Free Poisson cumulants

k_n^{free Poisson} = Ī»

Explanation: For Marchenko–Pastur with rate Ī» (and jump size 1), every free cumulant equals Ī». This makes it easy to convolve with other laws.

Cauchy–R relation

G_X(z) = φ((z āˆ’ X)^{āˆ’1}),  R(G_X(z)) + 1/G_X(z) = z

Explanation: The Cauchy transform encodes the distribution analytically. Plugging it into the R-transform gives a functional equation that characterizes spectral distributions.

M–R series identity

M(z)=1+zM(z)R(zM(z))

Explanation: The ordinary moment series M relates to the R-transform by composition. It is a convenient starting point for series-based algorithms that convert between moments and cumulants.

Counting noncrossing partitions

|NC(n)| = C_n

Explanation: The number of noncrossing partitions of n elements equals the n-th Catalan number. This is why Catalan numbers appear throughout free probability.

Semicircle density

μ_semi(dx) = (1/(2πσ²)) √(4σ² āˆ’ x²) 1_{|x| ≤ 2σ} dx

Explanation: The limiting eigenvalue distribution of Wigner matrices has this compactly supported density. It is the free analog of the Gaussian in the classical central limit theorem.

Series version of moment–cumulant

m_n = Ī£_{s=1}^{n} k_s Ā· [z^{nāˆ’s}] M(z)^s,  M(z) = Ī£_{i≄0} m_i z^i

Explanation: The n-th moment is a weighted sum of coefficients of powers of the moment series M, with the free cumulants k_s as weights. This identity enables dynamic-programming conversions without enumerating partitions.

Complexity Analysis

There are two computational themes here: series-based transforms and random-matrix simulation.

1) Series (moments ↔ free cumulants): Using the identity m_n = Ī£_{k=1}^{n} r_k Ā· [z^{nāˆ’k}] M(z)^k with M(0) = 1, we can convert moments to cumulants and back by dynamic programming. To compute [z^d] M(z)^k up to order N, naive truncated polynomial multiplication takes O(N²) per multiply. Raising to power k costs O(kĀ·N²) if done iteratively. Summed over all k ≤ n and n ≤ N, this yields about O(N³) time. Memory usage is O(N) to store series plus O(N) workspace; if you cache intermediate powers, you may transiently use O(N²). For N ā‰ˆ 20–50 (typical in practice), this is very fast.

2) Random-matrix moment simulation: For an nƗn dense matrix A, computing Tr(A^p) via repeated multiplications requires pāˆ’1 matrix multiplies. A naive multiply is O(n³), so the total is O(pĀ·n³). Memory is O(n²) to store A and O(n²) for a work matrix. For n ā‰ˆ 100–300 and p ≤ 8, this comfortably runs in milliseconds to seconds in C++. This approach sidesteps eigenvalue decomposition (typically O(n³) as well) and directly delivers empirical moments that can be compared to free-probability predictions.

Accuracy considerations: Series methods are exact up to floating-point precision; use long double to mitigate growth in high-order moments. Random-matrix estimates have variance decreasing like 1/n (for normalized traces) and benefit from averaging multiple trials if needed. Always ensure Wigner scaling (entry variance āˆ 1/n) so that moments stabilize as n grows.

Code Examples

Moments ↔ Free cumulants via truncated power-series DP
```cpp
#include <bits/stdc++.h>
using namespace std;

using ld = long double;

// Truncated polynomial multiplication: (a * b) up to degree D
vector<ld> poly_mul_trunc(const vector<ld>& a, const vector<ld>& b, int D) {
    int da = (int)a.size() - 1;
    int db = (int)b.size() - 1;
    int dd = min(D, da + db);
    vector<ld> c(dd + 1, 0.0L);
    for (int i = 0; i <= da; ++i) {
        int jmax = min(db, dd - i);
        for (int j = 0; j <= jmax; ++j) {
            c[i + j] += a[i] * b[j];
        }
    }
    return c;
}

// Coefficient of z^d in M(z)^k, using iterative multiplication, truncated at degree d
ld coeff_of_M_pow(const vector<ld>& M, int k, int d) {
    if (k == 0) return (d == 0 ? 1.0L : 0.0L);
    vector<ld> P(1, 1.0L); // start with the constant polynomial 1
    for (int t = 0; t < k; ++t) {
        P = poly_mul_trunc(P, M, d);
    }
    return (d < (int)P.size() ? P[d] : 0.0L);
}

// Convert free cumulants r[1..N] to moments m[0..N] with m[0]=1
vector<ld> moments_from_cumulants(const vector<ld>& r) {
    int N = (int)r.size() - 1; // r[0] unused
    vector<ld> m(N + 1, 0.0L);
    m[0] = 1.0L;
    // Build moments progressively; when computing m[n], M uses m[0..n-1]
    for (int n = 1; n <= N; ++n) {
        vector<ld> M(n + 1, 0.0L);
        for (int i = 0; i <= n - 1; ++i) M[i] = m[i];
        // m[n] = sum_{k=1..n} r[k] * [z^{n-k}] M(z)^k
        ld sum = 0.0L;
        for (int k = 1; k <= n; ++k) {
            sum += r[k] * coeff_of_M_pow(M, k, n - k);
        }
        m[n] = sum;
    }
    return m;
}

// Convert moments m[0..N] (with m[0]=1) to free cumulants r[1..N]
vector<ld> cumulants_from_moments(const vector<ld>& m) {
    int N = (int)m.size() - 1;
    vector<ld> r(N + 1, 0.0L);
    for (int n = 1; n <= N; ++n) {
        vector<ld> M(n, 0.0L);
        for (int i = 0; i <= n - 1; ++i) M[i] = m[i];
        // m[n] = r[n]*[z^0]M^n + sum_{k=1..n-1} r[k]*[z^{n-k}]M^k,
        // and [z^0]M^n = 1, so r[n] = m[n] - sum_{k=1..n-1} r[k]*coeff
        ld sum = 0.0L;
        for (int k = 1; k <= n - 1; ++k) {
            sum += r[k] * coeff_of_M_pow(M, k, n - k);
        }
        r[n] = m[n] - sum; // because coeff_0(M^n) = 1
    }
    return r;
}

int main() {
    cout.setf(std::ios::fixed); cout << setprecision(8);

    int N = 10;        // compute up to moment/cumulant order 10
    ld sigma2 = 2.0L;  // variance of semicircle

    // Semicircle cumulants: k2 = sigma2, others zero
    vector<ld> r(N + 1, 0.0L);
    r[2] = sigma2;

    // Moments from cumulants
    auto m = moments_from_cumulants(r);

    cout << "Semicircle (variance=" << sigma2 << ") moments m_n:\n";
    for (int n = 0; n <= N; ++n) {
        cout << "m[" << n << "] = " << m[n] << "\n";
    }

    // Recover cumulants from moments
    auto r_back = cumulants_from_moments(m);
    cout << "\nRecovered free cumulants k_n (should be 0 except k2=sigma2):\n";
    for (int n = 1; n <= N; ++n) {
        cout << "k[" << n << "] = " << r_back[n] << "\n";
    }
    return 0;
}
```

This program implements the moment–free-cumulant relationship using truncated formal power series. It avoids explicit enumeration of noncrossing partitions by using the series identity m_n = āˆ‘ r_k Ā· [z^{nāˆ’k}] M(z)^k. It first constructs moments for a semicircle law (only k_2=σ^2) and then recovers cumulants from those moments to validate correctness. Even moments match Catalan numbers times σ^{2n}, and all cumulants except k_2 are numerically zero.

Time: O(N^3) for N moments/cumulants due to repeated truncated polynomial multiplications. Space: O(N) for series storage; up to O(N) extra for workspace.
Free additive convolution: semicircle āŠž free-Poisson via R-transform
```cpp
#include <bits/stdc++.h>
using namespace std;
using ld = long double;

// Truncated series tools (same as the previous example, inlined for completeness)
vector<ld> poly_mul_trunc(const vector<ld>& a, const vector<ld>& b, int D) {
    int da = (int)a.size() - 1;
    int db = (int)b.size() - 1;
    int dd = min(D, da + db);
    vector<ld> c(dd + 1, 0.0L);
    for (int i = 0; i <= da; ++i) {
        int jmax = min(db, dd - i);
        for (int j = 0; j <= jmax; ++j) c[i + j] += a[i] * b[j];
    }
    return c;
}

ld coeff_of_M_pow(const vector<ld>& M, int k, int d) {
    if (k == 0) return (d == 0 ? 1.0L : 0.0L);
    vector<ld> P(1, 1.0L);
    for (int t = 0; t < k; ++t) P = poly_mul_trunc(P, M, d);
    return (d < (int)P.size() ? P[d] : 0.0L);
}

vector<ld> moments_from_cumulants(const vector<ld>& r) {
    int N = (int)r.size() - 1;
    vector<ld> m(N + 1, 0.0L);
    m[0] = 1.0L;
    for (int n = 1; n <= N; ++n) {
        vector<ld> M(n + 1, 0.0L);
        for (int i = 0; i <= n - 1; ++i) M[i] = m[i];
        ld sum = 0.0L;
        for (int k = 1; k <= n; ++k) sum += r[k] * coeff_of_M_pow(M, k, n - k);
        m[n] = sum;
    }
    return m;
}

int main() {
    cout.setf(std::ios::fixed); cout << setprecision(10);

    int N = 12; // compute up to 12th moment

    // Distribution A: semicircle with variance s2
    ld s2 = 1.0L;
    vector<ld> rA(N + 1, 0.0L);
    rA[2] = s2;

    // Distribution B: free Poisson (Marchenko-Pastur) with rate lambda (jump size 1)
    ld lambda = 0.5L;
    vector<ld> rB(N + 1, 0.0L);
    for (int n = 1; n <= N; ++n) rB[n] = lambda;

    // Free sum Z = A āŠž B has cumulants rZ = rA + rB
    vector<ld> rZ(N + 1, 0.0L);
    for (int n = 1; n <= N; ++n) rZ[n] = rA[n] + rB[n];

    // Convert to moments
    auto mZ = moments_from_cumulants(rZ);

    cout << "Moments of Z = semicircle(s2=" << s2 << ") āŠž free-Poisson(lambda=" << lambda << "):\n";
    for (int n = 0; n <= N; ++n) {
        cout << "m[" << n << "] = " << mZ[n] << "\n";
    }
    return 0;
}
```

We define two distributions by their free cumulants: a semicircle (only k2 nonzero) and a free Poisson law (all cumulants equal to Ī»). Since R-transforms add under free convolution, the cumulants of the sum are just rA+rB. The code then converts these cumulants back into moments, giving the first several moments of the free sum Z. This is a template for free deconvolution and spectral prototyping.

Time: O(N^3) for converting cumulants to moments via truncated series. Space: O(N).
Random-matrix experiment: asymptotic freeness and semicircle moments
```cpp
#include <bits/stdc++.h>
using namespace std;
using ld = long double;

// Simple dense matrix type
struct Mat {
    int n; vector<ld> a; // row-major
    Mat(int n_ = 0, ld val = 0) : n(n_), a(n_ * n_, val) {}
    ld& operator()(int i, int j) { return a[i * n + j]; }
    const ld& operator()(int i, int j) const { return a[i * n + j]; }
};

Mat mat_add(const Mat& A, const Mat& B) {
    Mat C(A.n);
    for (int i = 0; i < A.n; i++)
        for (int j = 0; j < A.n; j++) C(i, j) = A(i, j) + B(i, j);
    return C;
}

Mat mat_mul(const Mat& A, const Mat& B) {
    int n = A.n; Mat C(n, 0);
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++) {
            ld aik = A(i, k);
            for (int j = 0; j < n; j++) C(i, j) += aik * B(k, j);
        }
    return C;
}

ld trace(const Mat& A) { ld t = 0; for (int i = 0; i < A.n; i++) t += A(i, i); return t; }

// Generate a symmetric Wigner matrix with entry variance sigma^2/n
Mat random_wigner(int n, ld sigma, mt19937_64& rng) {
    normal_distribution<long double> N01(0.0L, 1.0L);
    Mat A(n, 0);
    ld scale = sigma / sqrt((ld)n);
    for (int i = 0; i < n; i++) {
        for (int j = i; j < n; j++) {
            ld x = N01(rng) * scale;
            A(i, j) = x; A(j, i) = x; // symmetric
        }
    }
    return A;
}

// Estimate m_p = (1/n) Tr(A^p) by repeated multiplication
ld moment_power(const Mat& A, int p) {
    int n = A.n;
    if (p == 0) return 1.0L;
    if (p == 1) return trace(A) / n;
    Mat P = A;
    for (int t = 2; t <= p; t++) P = mat_mul(P, A);
    return trace(P) / n;
}

// Catalan number C_k = (2k)! / ((k+1)! k!), computed multiplicatively to
// reduce overflow
long double catalan(int k) {
    long double res = 1.0L;
    for (int i = 1; i <= k; i++) { res *= (k + i); res /= i; }
    return res / (k + 1);
}

int main() {
    cout.setf(std::ios::fixed); cout << setprecision(6);
    int n = 120;              // matrix size
    int pmax = 6;             // compute even moments up to 6
    ld s1 = 1.0L, s2 = 0.7L;  // target standard deviations for two Wigner matrices
    mt19937_64 rng(42);

    Mat A = random_wigner(n, s1, rng);
    Mat B = random_wigner(n, s2, rng);
    Mat S = mat_add(A, B);

    // The sum of two independent Wigner matrices behaves like a semicircle
    // with variance s1^2 + s2^2
    ld var_sum = s1 * s1 + s2 * s2;

    for (int p = 2; p <= pmax; p += 2) {
        ld m_emp = moment_power(S, p);
        int k = p / 2;
        ld m_pred = catalan(k) * powl(var_sum, (ld)k);
        cout << "p=" << p << " empirical=" << m_emp << " predicted=" << m_pred
             << " |err|=" << fabsl(m_emp - m_pred) << "\n";
    }
    return 0;
}
```

This simulation draws two independent symmetric Wigner matrices with entry variance scaled by 1/n, adds them, and computes low-order even moments via traces of powers. Free probability predicts that the sum behaves like a semicircle with variance σ1^2+σ2^2, whose even moments are Catalan(k)·(σ1^2+σ2^2)^k. The printed comparison shows small errors that shrink with larger n or by averaging more trials, illustrating asymptotic freeness and the semicircle law.

Time: O(p Ā· n^3) due to pāˆ’1 dense matrix multiplications for each moment. Space: O(n^2) to store matrices.
Tags: free probability, freeness, R-transform, S-transform, noncrossing partitions, Catalan numbers, semicircle law, Marchenko–Pastur, random matrices, free convolution, asymptotic freeness, Cauchy transform, moments, free cumulants