Random Walks on Graphs
Key Points
- A random walk on a graph moves from a node to one of its neighbors chosen uniformly at random at each step.
- The transition matrix P = D^{-1}A encodes these move probabilities, where A is the adjacency matrix and D is the diagonal degree matrix.
- For connected, non-bipartite, undirected graphs, the walk converges to a stationary distribution proportional to node degrees.
- Many properties of random walks (mixing speed, hitting times) are governed by eigenvalues of P or the normalized Laplacian.
- Power iteration and Monte Carlo simulation let us compute distributions and hitting times efficiently on large graphs.
- Adding laziness (P' = (I + P)/2) or teleportation avoids periodicity and guarantees convergence in tricky graphs.
- In C++, it's best to avoid forming dense matrices; use adjacency lists and distribute probability mass along edges.
- Common pitfalls include mishandling dangling nodes (degree 0), ignoring periodicity in bipartite graphs, and numerical drift in probabilities.
Prerequisites
- Graph representations (adjacency list/matrix) — You must know how to store and traverse neighbors efficiently to implement P without building dense matrices.
- Basic probability and Markov chains — Random walks are Markov chains; understanding distributions, transitions, and stationarity is essential.
- Linear algebra fundamentals — Defining P = D^{-1}A, eigenvalues/eigenvectors, and similarity transforms underpin convergence analysis.
- Spectral graph theory (intro) — Eigenvalues of the normalized Laplacian or P control mixing, hitting times, and clustering behavior.
- C++ vectors and numeric stability — Accurate probability computations require careful accumulation, normalization, and random number use.
- Random number generation — Simulations rely on high-quality RNGs to model random steps and teleportation.
Detailed Explanation
01 Overview
Imagine a tourist who is lost in a city. At each intersection (node), they choose one of the connecting streets (edges) uniformly at random and walk to the next intersection. This process is a random walk on a graph. The heart of the model is the transition matrix P = D^{-1}A: it turns the adjacency structure of the graph (A) into movement probabilities by normalizing each row by the node's degree (D). The (i, j)-th entry P_{ij} is the probability of moving from i to j in one step. The concept becomes powerful because it connects discrete graph structure with probability and linear algebra. Over many steps, the distribution of the tourist’s location evolves by multiplying by P repeatedly (p^{(t+1)} = p^{(t)} P). Under mild conditions (connected and aperiodic for undirected graphs), this distribution converges to a stationary distribution that does not change when multiplied by P. This reveals central nodes (they are visited more often), how fast randomness spreads (mixing time), and how long it takes to reach a target (hitting time). Random walks underpin algorithms in ranking (PageRank), clustering (spectral methods), sampling (Markov Chain Monte Carlo), and network analysis (resistance, diffusion). They scale effectively because transitions depend only on local neighborhoods, making them ideal for sparse graphs commonly found in real data sets.
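The update p^{(t+1)} = p^{(t)} P can be carried out directly on an adjacency list, without ever materializing P. A minimal sketch (the function name step_distribution and the self-loop rule for dangling nodes are illustrative choices, not fixed conventions):

```cpp
#include <bits/stdc++.h>
using namespace std;

// One step of p^{(t+1)} = p^{(t)} P computed directly on an adjacency list:
// each node splits its current mass evenly among its neighbors, which is
// exactly what multiplying by a row of P = D^{-1}A does.
// Dangling nodes (degree 0) keep their mass via an implicit self-loop.
vector<double> step_distribution(const vector<vector<int>>& adj,
                                 const vector<double>& p) {
    vector<double> next(p.size(), 0.0);
    for (size_t i = 0; i < adj.size(); ++i) {
        if (adj[i].empty()) { next[i] += p[i]; continue; } // dangling: stay put
        double share = p[i] / adj[i].size();               // split mass by degree
        for (int j : adj[i]) next[j] += share;             // pour one share per edge
    }
    return next;
}
```

On the path 0–1–2, starting with all mass at node 0, one step moves everything to node 1 and a second step splits it evenly between nodes 0 and 2.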
02 Intuition & Analogies
Think of P = D^{-1}A as a “fair splitter” for each node’s outgoing probability. If node i has degree d_i, you divide its probability mass equally into d_i buckets and pour one bucket onto each neighbor. That is exactly what a row of P does: normalize the adjacency row by d_i so that probabilities sum to 1. A good everyday analogy is shuffling a deck versus walking hallways. In shuffling, you mix cards globally in one go; in random walks, you mix locally, passing probability through doors from one room to the next. The speed of mixing depends on how many short alternative routes there are (expansion) and whether you can get stuck bouncing back and forth (periodicity in bipartite graphs). Adding “laziness” is like sometimes pausing in place; it breaks rigid back-and-forth cycles so the process settles. Stationary distribution is like the long-run crowd count in a mall: busy intersections (high degree) collect more people on average because there are more ways to arrive. For undirected graphs, the long-run share at node i is proportional to how many doors lead into it (its degree). If degrees are equal (regular graph), everyone is equally likely in the long run. Finally, hitting and commute times reflect navigability: how many steps until the walker reaches a store (target)? Dense, well-connected areas shorten these times; narrow bridges and bottlenecks lengthen them. Spectral quantities (eigenvalues) distill these connectivity features into numbers that predict mixing and search efficiency.
03 Formal Definition
Let G = (V, E) be a graph with adjacency matrix A and diagonal degree matrix D, where d_i = deg(i) > 0. The simple random walk on G is the Markov chain on V with transition matrix P = D^{-1}A, so P_{ij} = A_{ij}/d_i. A distribution p^{(t)} over nodes evolves as p^{(t+1)} = p^{(t)} P, and a stationary distribution satisfies \pi = \pi P. For connected, non-bipartite undirected graphs, \pi is unique and \pi_i = d_i/(2m), where m = |E|.
04 When to Use
- Ranking and centrality: Use random walks to score nodes by long-run visit frequency (stationary distribution). This underlies PageRank (with teleportation) for directed graphs and degree centrality for undirected graphs.
- Community detection and clustering: Short random walks tend to stay within communities before leaking out, which powers methods like spectral clustering and diffusion distances.
- Sampling and estimation: When direct uniform sampling of nodes or substructures is hard, random walks provide approximate samples, especially in massive, implicitly defined graphs.
- Diffusion modeling: Spread of information, heat, or epidemics can be approximated as random walks, where P models local spread.
- Hitting/commute times: Estimate how quickly a search reaches a target, or compare pairwise accessibility via commute times; useful in navigation, recommendation, and network reliability.
- Preprocessing for machine learning on graphs: Random-walk features (node2vec/DeepWalk) and k-step distributions provide informative node embeddings. Choose pure P = D^{-1}A when the graph is undirected, connected, and you can tolerate degree bias. Use lazy walks to avoid periodicity on bipartite graphs. Use teleportation (Google matrix) for directed graphs or when dangling nodes and dead ends are common.
⚠️Common Mistakes
- Ignoring dangling nodes (degree 0): Without a fix, rows in P are invalid. Add self-loops (P_{ii} = 1) for pure walks or use teleportation to distribute probability globally.
- Overlooking periodicity: Non-lazy walks on bipartite graphs oscillate and do not converge to a stationary distribution from arbitrary starts. Use P' = (I + P)/2 to ensure aperiodicity.
- Building dense matrices: Forming dense P costs O(n^2) memory/time and is unnecessary on sparse graphs. Operate with adjacency lists and distribute mass along neighbors.
- Confusing left vs. right eigenvectors: With row-stochastic P, the stationary distribution satisfies \pi = \pi P (a left eigenvector). Iteratively update row distributions p^{(t+1)} = p^{(t)} P, or equivalently multiply P^{\top} by a column vector.
- Numerical drift: Floating-point sums may slowly lose the property that probabilities sum to 1. Renormalize vectors each iteration and use double precision.
- Misinterpreting degree bias: In undirected graphs, high-degree nodes have higher stationary mass by design. If this is undesirable, consider normalized objectives or teleportation.
- Too few iterations: Power iteration may require many steps when the spectral gap is small. Monitor convergence with ||p^{(t+1)} - p^{(t)}|| and cap iterations wisely.
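Several of these pitfalls (periodicity, dangling nodes, numerical drift) can be handled in a single update. A hedged sketch of one lazy, renormalized iteration, with lazy_step_normalized as an illustrative name:

```cpp
#include <bits/stdc++.h>
using namespace std;

// One iteration of the lazy walk p <- p (I + P)/2 with renormalization,
// addressing the periodicity, dangling-node, and drift pitfalls in one place.
void lazy_step_normalized(const vector<vector<int>>& adj, vector<double>& p) {
    vector<double> next(p.size(), 0.0);
    for (size_t i = 0; i < p.size(); ++i) {
        next[i] += 0.5 * p[i];                                   // stay with prob 1/2
        if (adj[i].empty()) { next[i] += 0.5 * p[i]; continue; } // dangling self-loop
        double share = 0.5 * p[i] / adj[i].size();               // move with prob 1/2
        for (int j : adj[i]) next[j] += share;
    }
    double s = 0.0;
    for (double x : next) s += x;
    for (double& x : next) x /= s;                               // guard against drift
    p.swap(next);
}
```

On the single edge 0–1 (a bipartite graph), the plain walk starting at node 0 oscillates forever, while one lazy step already yields (0.5, 0.5), the stationary distribution.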
Key Formulas
Transition Matrix
P = D^{-1}A, so P_{ij} = A_{ij}/d_i
Explanation: Each row of A is normalized by the node’s degree d_i to create a row-stochastic matrix. The entry P_{ij} is the probability of moving from i to j in one step.
Distribution Evolution
p^{(t)} = p^{(0)} P^t, i.e., p^{(t+1)} = p^{(t)} P
Explanation: The distribution after t steps is obtained by multiplying the initial distribution by P t times. This describes how probability mass flows along edges.
Stationary Distribution
\pi = \pi P, with \sum_i \pi_i = 1
Explanation: The stationary distribution is unchanged by the transition. It is a left eigenvector of P with eigenvalue 1 (or a right eigenvector of P^{\top}).
Undirected Stationary Law
\pi_i = d_i / (2m)
Explanation: For a connected, non-bipartite undirected graph, the stationary probability of node i is proportional to its degree. Here m is the number of edges.
Detailed Balance
\pi_i P_{ij} = \pi_j P_{ji}
Explanation: This equality makes the chain reversible, enabling spectral analysis via symmetric matrices and simplifying many proofs.
Random-Walk and Normalized Laplacian
L_{rw} = I - D^{-1}A = I - P, \quad L = I - D^{-1/2} A D^{-1/2}
Explanation: The random-walk Laplacian relates directly to P. The normalized Laplacian L is symmetric and similar to L_{rw} (via L = D^{1/2} L_{rw} D^{-1/2}), making eigen-analysis convenient.
Symmetric Similarity
S = D^{-1/2} A D^{-1/2} = D^{1/2} P D^{-1/2}
Explanation: S is symmetric and similar to P, so it shares eigenvalues with P. This connection allows using tools for symmetric matrices to study P.
Lazy Random Walk
P' = (I + P)/2
Explanation: Staying put with probability 1/2 removes periodicity (e.g., bipartite oscillations) while preserving the stationary distribution.
Mixing Bound (Undirected)
|P^t(i, j) - \pi_j| \le \sqrt{d_j / d_i} \, \lambda^t, where \lambda = \max(|\lambda_2|, |\lambda_n|)
Explanation: For an undirected walk started at node i, the convergence speed to the stationary distribution is controlled by the second-largest eigenvalue modulus \lambda. A larger spectral gap 1 - \lambda implies faster mixing.
Expected Return Time
E[T_i^+] = 1 / \pi_i
Explanation: The expected number of steps to return to node i (after leaving) equals the reciprocal of its stationary probability.
Hitting Time Equations
h(t, t) = 0; \quad h(u, t) = 1 + \sum_j P_{uj} \, h(j, t) \text{ for } u \ne t
Explanation: Hitting times satisfy a linear system with a boundary condition at the target. Solutions give the expected number of steps to reach t from each node.
Commute Time–Resistance
C(i, j) = h(i, j) + h(j, i) = 2m \, R_{\mathrm{eff}}(i, j)
Explanation: In undirected graphs, the expected time to go from i to j and back equals 2m times the effective electrical resistance between i and j.
Google Matrix (Teleportation)
G = \alpha P + (1 - \alpha) \mathbf{1} v^{\top}
Explanation: With probability α follow P, and with probability 1−α jump according to the teleportation distribution v. This fixes dangling nodes and ensures a unique stationary distribution even in directed graphs.
Complexity Analysis
- Simulation: with adjacency lists, sampling a uniform neighbor is O(1) per step, so a T-step walk costs O(T) time plus O(n + m) memory for the graph.
- Power iteration: each iteration pushes mass across every edge once, costing O(n + m) per iteration on sparse graphs; the number of iterations grows as the spectral gap shrinks.
- Dense alternatives: forming P explicitly costs O(n^2) memory and O(n^2) per multiplication, which is why the examples below stay on adjacency lists.
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Simulates a random walk on an unweighted graph using adjacency lists.
// Also computes empirical visit frequencies over the trajectory.

struct Graph {
    int n;                   // number of nodes (0..n-1)
    vector<vector<int>> adj; // adjacency list
    vector<int> deg;         // degrees
    explicit Graph(int n=0): n(n), adj(n), deg(n,0) {}
    void add_edge(int u, int v, bool undirected=true) {
        adj[u].push_back(v); deg[u]++;
        if (undirected) { adj[v].push_back(u); deg[v]++; }
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Build a sample undirected graph (a square with a diagonal)
    // 0-1-2-3-0 and 1-3 diagonal
    int n = 4;
    Graph G(n);
    G.add_edge(0,1); G.add_edge(1,2); G.add_edge(2,3); G.add_edge(3,0); G.add_edge(1,3);

    int start = 0;
    int steps = 100000; // large enough to approximate stationary distribution

    // Random engine
    mt19937_64 rng(123456);

    vector<long long> visits(n, 0);
    int cur = start;
    visits[cur]++;

    for (int t = 0; t < steps; ++t) {
        if (G.deg[cur] == 0) {
            // Dangling node: stay in place (self-loop)
            // Alternative: teleport uniformly or to a preference distribution.
        } else {
            uniform_int_distribution<int> dist(0, G.deg[cur]-1);
            int idx = dist(rng);
            cur = G.adj[cur][idx];
        }
        visits[cur]++;
    }

    // Compute empirical frequencies
    double total = accumulate(visits.begin(), visits.end(), 0.0);
    cout.setf(ios::fixed); cout << setprecision(6);
    cout << "Empirical visit frequencies (approx. stationary for undirected, non-bipartite):\n";
    for (int i = 0; i < n; ++i) {
        cout << "node " << i << ": " << (visits[i] / total) << "\n";
    }

    // For undirected graphs, theoretical stationary pi_i = deg(i)/(2m)
    long long m = 0;
    for (int i = 0; i < n; ++i) m += G.deg[i];
    m /= 2; // undirected edge count
    cout << "\nTheory (pi_i = deg(i)/(2m)):\n";
    for (int i = 0; i < n; ++i) {
        cout << "node " << i << ": " << (double)G.deg[i] / (2.0 * m) << "\n";
    }

    return 0;
}
We build a small undirected graph and run a long random walk, counting how often each node is visited. The empirical frequencies approach the stationary distribution. For undirected, connected, non-bipartite graphs, the stationary probability equals degree divided by 2m. We include a simple dangling-node rule (self-loop) to keep P valid even if deg=0.
#include <bits/stdc++.h>
using namespace std;

// Computes stationary distribution by power iteration without forming dense P.
// Supports pure random walk (alpha=0) and teleportation (alpha in (0,1)).
// For alpha=0 and undirected connected non-bipartite graphs, the result converges to pi_i ∝ deg(i).
// For directed graphs or with dangling nodes, use alpha in (0,1) (e.g., 0.15) to ensure uniqueness.

struct Graph {
    int n;
    vector<vector<int>> out; // outgoing neighbors
    vector<int> deg;         // out-degrees
    explicit Graph(int n=0): n(n), out(n), deg(n,0) {}
    void add_edge(int u, int v) {
        out[u].push_back(v); deg[u]++;
    }
};

vector<double> power_iteration_stationary(const Graph& G,
        double alpha,                   // teleport prob (0 => pure walk)
        const vector<double>& teleport, // teleportation distribution (sum=1), size n
        int max_iters = 10000,
        double tol = 1e-12,
        bool lazy = false) {            // optionally use lazy walk
    int n = G.n;
    vector<double> p(n, 1.0 / n), nextp(n, 0.0);

    auto l1_norm = [&](const vector<double>& v){ double s=0; for (double x : v) s += fabs(x); return s; };

    // Laziness: P' = 0.5 I + 0.5 P if lazy
    double stay_prob  = lazy ? 0.5 : 0.0;
    double move_scale = lazy ? 0.5 : 1.0;

    for (int it = 0; it < max_iters; ++it) {
        fill(nextp.begin(), nextp.end(), 0.0);

        // Teleportation contribution: with probability alpha, jump to 'teleport'
        if (alpha > 0) {
            for (int j = 0; j < n; ++j) nextp[j] += alpha * teleport[j];
        }

        // Distribute the remaining mass along edges according to P (and laziness if enabled)
        double share = move_scale * (1.0 - alpha); // non-teleport fraction of each node's mass
        for (int i = 0; i < n; ++i) {
            if (stay_prob > 0) nextp[i] += stay_prob * p[i]; // lazy self-loop component
            if (G.deg[i] == 0) {
                // Dangling node: in a pure walk, keep a self-loop;
                // with teleportation, redistribute the non-teleport mass via 'teleport'
                if (alpha > 0) {
                    for (int j = 0; j < n; ++j) nextp[j] += share * p[i] * teleport[j];
                } else {
                    nextp[i] += move_scale * p[i];
                }
            } else {
                double mass_per_edge = share * p[i] / (double)G.deg[i];
                for (int j : G.out[i]) nextp[j] += mass_per_edge;
            }
        }

        // Renormalize to guard against numerical drift
        double s = 0; for (double x : nextp) s += x;
        for (double& x : nextp) x /= s;

        // Check convergence in L1
        vector<double> diff(n);
        for (int i = 0; i < n; ++i) diff[i] = nextp[i] - p[i];
        if (l1_norm(diff) < tol) return nextp;
        p.swap(nextp);
    }
    return p; // return last iterate if not converged
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Example: a small directed graph with a dangling node
    // 0 -> 1, 0 -> 2, 1 -> 2, 2 -> (none, dangling), 3 -> 0
    int n = 4;
    Graph G(n);
    G.add_edge(0,1); G.add_edge(0,2); G.add_edge(1,2); G.add_edge(3,0);

    // Teleportation distribution: uniform
    vector<double> teleport(n, 1.0 / n);

    // Compute PageRank-like stationary distribution
    double alpha = 0.15; // teleport probability
    bool lazy = false;   // could set true to avoid periodicity further
    auto pr = power_iteration_stationary(G, alpha, teleport, 10000, 1e-12, lazy);

    cout.setf(ios::fixed); cout << setprecision(8);
    cout << "Stationary distribution (with teleportation alpha=0.15):\n";
    for (int i = 0; i < n; ++i) cout << "node " << i << ": " << pr[i] << "\n";

    // For an undirected connected non-bipartite graph and alpha=0, this becomes degree-proportional.
    return 0;
}
This implements power iteration without forming P explicitly. At each step, it adds teleportation mass (if alpha > 0) and distributes non-teleport mass along outgoing edges. Dangling nodes either keep mass (pure walk) or redistribute via teleportation. Optional laziness helps with periodicity. The algorithm returns a probability vector that approximates the stationary distribution.
#include <bits/stdc++.h>
using namespace std;

struct Graph {
    int n; vector<vector<int>> adj; vector<int> deg;
    explicit Graph(int n=0): n(n), adj(n), deg(n,0) {}
    void add_edge(int u, int v, bool undirected=true) {
        adj[u].push_back(v); deg[u]++;
        if (undirected) { adj[v].push_back(u); deg[v]++; }
    }
};

// Estimate expected hitting time from s to t by averaging over 'trials' walks.
// Uses optional laziness to avoid periodicity in bipartite graphs.
double estimate_hitting_time(const Graph& G, int s, int t, int trials, int max_steps,
                             bool lazy, uint64_t seed=12345) {
    mt19937_64 rng(seed);
    uniform_real_distribution<double> U(0.0, 1.0);

    long long total_steps = 0;
    for (int r = 0; r < trials; ++r) {
        int cur = s;
        int steps = 0;
        while (cur != t && steps < max_steps) {
            if (lazy && U(rng) < 0.5) {
                // stay in place
            } else if (G.deg[cur] > 0) {
                uniform_int_distribution<int> dist(0, G.deg[cur]-1);
                cur = G.adj[cur][dist(rng)];
            } // else dangling node: self-loop
            ++steps;
        }
        total_steps += steps;
    }
    return (double) total_steps / (double) trials;
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // A simple graph with a bottleneck: two triangles connected by a bridge edge (2-3)
    // Triangle1: 0-1-2-0, Triangle2: 3-4-5-3, Bridge: 2-3
    Graph G(6);
    G.add_edge(0,1); G.add_edge(1,2); G.add_edge(2,0);
    G.add_edge(3,4); G.add_edge(4,5); G.add_edge(5,3);
    G.add_edge(2,3);

    int s = 0, t = 5;
    double ht_lazy  = estimate_hitting_time(G, s, t, /*trials=*/5000, /*max_steps=*/10000, /*lazy=*/true);
    double ht_plain = estimate_hitting_time(G, s, t, /*trials=*/5000, /*max_steps=*/10000, /*lazy=*/false);

    cout.setf(ios::fixed); cout << setprecision(3);
    cout << "Estimated hitting time s=0 -> t=5 (lazy walk): " << ht_lazy << " steps\n";
    cout << "Estimated hitting time s=0 -> t=5 (plain walk): " << ht_plain << " steps\n";

    return 0;
}
We estimate the expected number of steps to hit a target node by averaging over many simulated walks. Laziness can reduce oscillations in bipartite or near-bipartite structures and often stabilizes estimates. A max_steps cap prevents infinite loops in disconnected graphs; if s cannot reach t, the estimate will saturate at max_steps.
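The Monte Carlo estimates above can be cross-checked by solving the hitting-time equations from the Key Formulas section directly. A sketch via fixed-point (value) iteration, assuming a connected undirected graph so that all hitting times are finite (the function name hitting_times_to is illustrative):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Solve the hitting-time system h(t) = 0, h(u) = 1 + (1/deg(u)) * sum_{j~u} h(j)
// by fixed-point (value) iteration. Starting from h == 0, the iterates increase
// monotonically toward the expected hitting times.
vector<double> hitting_times_to(const vector<vector<int>>& adj, int target,
                                int max_iters = 100000, double tol = 1e-10) {
    int n = (int)adj.size();
    vector<double> h(n, 0.0), next(n, 0.0);
    for (int it = 0; it < max_iters; ++it) {
        double delta = 0.0;
        for (int u = 0; u < n; ++u) {
            if (u == target) { next[u] = 0.0; continue; } // boundary condition
            double s = 0.0;
            for (int j : adj[u]) s += h[j];
            next[u] = 1.0 + s / (double)adj[u].size();    // one step plus neighbor average
            delta = max(delta, fabs(next[u] - h[u]));
        }
        h.swap(next);
        if (delta < tol) break; // converged
    }
    return h;
}
```

On the path 0–1–2 with target 2, this yields h(0) = 4 and h(1) = 3, matching the known (n−1)^2 end-to-end hitting time of a path; it also gives exact values where the simulation above only gives noisy averages.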