Linearity of Expectation Applications
Key Points
- Linearity of expectation says the expected value of a sum equals the sum of expected values, even if the variables are dependent.
- It lets you compute expectations by breaking problems into simple indicator variables and summing their probabilities.
- The expected number of inversions in a random permutation of n elements is C(n, 2)/2 = n(n − 1)/4, derived by summing over all pairs.
- The expected number of cycles in a random permutation is the harmonic number H_n = 1 + 1/2 + ... + 1/n.
- You do not need independence to use linearity of expectation; dependence only matters for variance and higher moments.
- In random graphs G(n, p), the expected number of edges is p·C(n, 2) and the expected degree of a vertex is p(n − 1).
- For sampling with replacement from n labels m times, the expected number of distinct labels seen is n(1 − (1 − 1/n)^m).
Prerequisites
- Basic probability and expectation — You need to know what an expected value is and how probabilities of simple events are computed.
- Indicator random variables — Linearity is most easily applied by decomposing counts into sums of 0-1 indicators.
- Combinatorics (factorials and binomial coefficients) — Counting arguments, such as the expected number of k-cycles being 1/k, rely on combinations and permutations.
- Permutations and cycle decompositions — Understanding inversions, cycles, and fixed points requires basic permutation structure.
- Harmonic numbers and logarithms — Expected cycle counts and coupon collector results use H_n and its approximations.
- Floating-point arithmetic in C++ — Expectations often yield non-integers; avoiding integer division and ensuring numerical stability is key.
- Asymptotic analysis — Understanding O(1) and O(n) complexity and approximations like H_n ≈ ln n + γ helps with performance reasoning.
Detailed Explanation
Overview
Many probability problems look tangled because events depend on each other. Yet, strangely, you can often get the right answer by just adding up simple pieces. This is the power of linearity of expectation: the expectation of a sum is the sum of expectations, regardless of dependence. Instead of wrestling with complicated joint distributions, we design small random variables that each capture whether a tiny thing happens (like whether a particular pair is an inversion) and add their expectations. In a random permutation of n numbers, whether pair (i, j) is inverted can depend on many other positions, but its probability is exactly 1/2 by symmetry. Summing 1/2 over all C(n, 2) pairs gives C(n, 2)/2 expected inversions. This principle repeats: count cycles in permutations, edges in random graphs, or unique coupons collected; break the quantity into a sum of indicators and sum their probabilities. This direct route bypasses heavy probability machinery (independence assumptions, inclusion-exclusion, or conditioning trees) and yields clean, exact expectations that are easy to compute and code up efficiently.
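To see the inversion argument in action, here is a minimal Monte Carlo sketch (an illustration, not part of the closed-form derivation; the constants n and trials are arbitrary choices): it shuffles a small array repeatedly, counts inversions by brute force, and compares the empirical mean to n(n − 1)/4.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Monte Carlo sanity check: estimate E[#inversions] over random permutations
// and compare against the closed form C(n,2) * 1/2 = n(n-1)/4.
int main() {
    const int n = 8;            // small n keeps the O(n^2) inversion count cheap
    const int trials = 200000;  // arbitrary trial count for this sketch
    mt19937 rng(12345);

    vector<int> a(n);
    iota(a.begin(), a.end(), 0);

    long long total = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(a.begin(), a.end(), rng);
        for (int i = 0; i < n; ++i)
            for (int j = i + 1; j < n; ++j)
                total += (a[i] > a[j]);  // indicator I_{i,j} for pair (i, j)
    }
    cout << "empirical = " << (double)total / trials
         << ", exact = " << n * (n - 1) / 4.0 << "\n";
    return 0;
}
```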
Intuition & Analogies
Imagine you’re tallying points from many mini-games. Each mini-game either gives you 1 point or 0 points (win or lose), and the total score is the sum. Even if the mini-games influence each other (maybe winning one changes your mood), your average total score is still the sum of your average points from each mini-game. That’s linearity of expectation. In probability terms, we often turn a complicated count into a sum of simple on/off indicators: does this pair form an inversion? does this edge incident to a vertex appear? does this label show up at least once? Each small question has a straightforward probability answer. Adding them gives the overall expectation. Think of counting cycles in a permutation: it seems global and complex. But you can tag a special representative for each cycle (e.g., its smallest element). The event “element i is a cycle representative” has a simple probability, and there is exactly one representative per cycle. So summing those probabilities over all elements gives the expected number of cycles, as the sketch below checks. Another analogy: checking attendance for many students across days. Your expected daily attendance is just the sum of each student’s probability of showing up. You don’t need to know whether students coordinate; you only need each student’s individual attendance chance. Linearity is that bookkeeping rule for expectation: add first, and never worry about dependence when you only need the mean.
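The cycle-representative idea can be verified directly. This sketch (illustrative; the permutation size and trial count are arbitrary) counts cycles by testing whether each element is the smallest on its own cycle, then compares the empirical mean to H_n.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Count cycles of a random permutation by tagging each cycle's smallest
// element as its representative; compare the empirical mean to H_n.
int main() {
    const int n = 10;
    const int trials = 100000;
    mt19937 rng(7);

    vector<int> p(n);
    iota(p.begin(), p.end(), 0);

    long long cycleTotal = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(p.begin(), p.end(), rng);
        for (int i = 0; i < n; ++i) {
            // i represents a cycle iff it is the minimum on its own cycle.
            bool isMin = true;
            for (int j = p[i]; j != i; j = p[j])
                if (j < i) { isMin = false; break; }
            cycleTotal += isMin;  // exactly one representative per cycle
        }
    }
    long double H = 0;
    for (int k = 1; k <= n; ++k) H += 1.0L / k;
    cout << "empirical = " << (double)cycleTotal / trials
         << ", H_n = " << (double)H << "\n";
    return 0;
}
```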
Formal Definition
For any random variables X_1, ..., X_n (dependent or not) and constants a_1, ..., a_n, E[a_1·X_1 + ... + a_n·X_n] = a_1·E[X_1] + ... + a_n·E[X_n]. In particular, if X = X_1 + ... + X_n, then E[X] = E[X_1] + ... + E[X_n]. No independence assumption is needed; the identity follows directly from the definition of expectation. The workhorse special case uses indicators: if I_A = 1 when event A occurs and 0 otherwise, then E[I_A] = P(A), so the expected count of events equals the sum of their probabilities.
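A tiny demonstration that dependence does not break linearity (a made-up dice example, not from the formulas in this article): let X be a fair die and Y = 7 − X, so Y is a deterministic function of X, yet E[X + Y] = E[X] + E[Y] holds exactly.

```cpp
#include <bits/stdc++.h>
using namespace std;

// X is a fair die, Y = 7 - X is completely dependent on X, yet
// E[X + Y] = E[X] + E[Y] still holds (X + Y is identically 7).
int main() {
    double EX = 0, EY = 0, EXY = 0;
    for (int x = 1; x <= 6; ++x) {
        double prob = 1.0 / 6;
        int y = 7 - x;  // deterministic function of X: maximal dependence
        EX  += prob * x;
        EY  += prob * y;
        EXY += prob * (x + y);
    }
    cout << "E[X] + E[Y] = " << EX + EY << "\n";  // 3.5 + 3.5 = 7
    cout << "E[X + Y]    = " << EXY << "\n";      // exactly 7
    return 0;
}
```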
When to Use
Use linearity of expectation whenever the quantity you want is a sum over simpler parts. It shines when the parts are easy to analyze individually but messily dependent jointly. Typical use cases: (1) Random permutations: counting inversions, fixed points, cycles, records, or descents by summing over positions or pairs with symmetry-based probabilities. (2) Random graphs G(n, p): expected number of edges, triangles, degrees, or other small subgraphs by summing indicators over all candidate substructures. (3) Sampling problems: expected number of distinct elements after m draws from n categories; coupon collector’s waiting time via summing stages; expected collisions in hashing. (4) String/array randomness: expected number of matches between two random strings, or expected number of times a pattern appears in a random text. (5) Algorithm analysis: expected comparisons or swaps, expected bucket sizes in hashing, or expected load in balanced allocations. If a target X can be decomposed into X = X_1 + ... + X_k and each E[X_i] is tractable (often via simple probability or symmetry), linearity will likely deliver an exact and clean expression for E[X].
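As a sketch of use case (2) beyond edges and degrees, the expected number of triangles in G(n, p) follows the same recipe: sum an indicator over all C(n, 3) vertex triples. Linearity needs no independence between triples; edge independence is only used to compute the per-triple probability p³.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Expected triangles in G(n, p): each of the C(n,3) vertex triples forms a
// triangle iff its three edges all appear, which happens with probability p^3.
// Linearity sums these C(n,3) identical indicator expectations.
int main() {
    long long n; long double p;
    if (!(cin >> n >> p)) return 0;

    long double choose3 = (long double)n * (n - 1) * (n - 2) / 6.0L;  // C(n,3)
    long double expected_triangles = choose3 * p * p * p;

    cout.setf(ios::fixed); cout << setprecision(10);
    cout << expected_triangles << "\n";
    return 0;
}
```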
⚠️ Common Mistakes
- Assuming independence is required: Linearity holds without independence. Don't waste time proving independence among indicators; it's unnecessary for expectations. Independence matters for variance or concentration, not for the mean.
- Wrong or overlapping decomposition: Ensure your count X really equals the sum of the defined indicators. Avoid double-counting (e.g., counting an edge twice) or missing cases. Define I_i carefully so that X = Σ I_i holds identically.
- Miscomputing single-event probabilities: The bottleneck is usually computing P(A_i). For inversions, the probability is 1/2 by symmetry, not 1/(j − i + 1). For cycles, use standard results (the expected number of k-cycles is 1/k) or a careful counting argument.
- Mixing expectation with probability of being nonzero: E[I_A] = P(A), but E[X] is not the probability that X > 0.
- Numerical pitfalls in code: Using integer division where floating-point is needed (e.g., n*(n-1)/4 truncates whenever n(n − 1) is not divisible by 4 if all operands are integers). Prefer long double and set output precision. For expressions like (1 − 1/n)^m, avoid precision loss for large n; use log1p/expm1 for stability (a short demonstration follows this list).
- Overcomplicating with inclusion–exclusion: For expectations, inclusion–exclusion is often unnecessary; summing probabilities of indicators is simpler and exact.
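The numerical pitfalls are easy to demonstrate. A minimal sketch (values chosen only for illustration): integer division silently truncates n(n − 1)/4, and in plain double the naive (1 − 1/n)^m shifts noticeably when n is huge because 1 − 1/n rounds, while log1p evaluates log(1 − 1/n) accurately.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Demonstrates the two pitfalls: integer truncation and precision loss
// in (1 - 1/n)^m for very large n.
int main() {
    long long n = 7;
    cout << "integer  n*(n-1)/4 = " << n * (n - 1) / 4 << "\n";     // prints 10 (truncated)
    cout << "floating n*(n-1)/4 = " << n * (n - 1) / 4.0L << "\n";  // prints 10.5

    double N = 1e15, m = 1e15;  // expected answer: (1 - 1/N)^m ~ e^{-1} ~ 0.3678794
    double naive  = pow(1.0 - 1.0 / N, m);    // rounding of 1 - 1/N perturbs the result
    double stable = exp(m * log1p(-1.0 / N)); // log1p keeps the tiny argument accurate
    cout.setf(ios::fixed); cout << setprecision(7);
    cout << "naive  = " << naive << "\nstable = " << stable << "\n";
    return 0;
}
```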
Key Formulas
Linearity of Expectation
E[a·X + b·Y] = a·E[X] + b·E[Y], and more generally E[X_1 + ... + X_n] = E[X_1] + ... + E[X_n].
Explanation: The expectation of a linear combination equals the linear combination of expectations. This holds without any independence assumptions.
Indicator Expectation
E[I_A] = P(A).
Explanation: The expected value of a 0-1 indicator of event A equals the probability that A occurs. This is the core bridge from probabilities to expectations.
Expected Inversions in a Random Permutation
E[#inversions] = C(n, 2) · 1/2 = n(n − 1)/4.
Explanation: Each pair is inverted with probability 1/2 by symmetry, and there are C(n, 2) pairs. Summing gives n(n − 1)/4 expected inversions.
Expected k-cycles and Total Cycles
E[#k-cycles] = 1/k, so E[#cycles] = Σ_{k=1}^{n} 1/k = H_n.
Explanation: The expected number of k-cycles in a random permutation is 1/k. Summing over k yields the harmonic number as the expected total number of cycles.
Harmonic Number Approximation
H_n ≈ ln n + γ + 1/(2n) − 1/(12n²).
Explanation: H_n grows like the natural log of n plus the Euler–Mascheroni constant γ ≈ 0.5772. The correction terms improve accuracy for finite n.
Expected Distinct Elements in m Draws
E[#distinct] = n(1 − (1 − 1/n)^m).
Explanation: For each label, the probability it appears at least once is 1 − (1 − 1/n)^m. Summing these n identical probabilities gives the expectation.
Random Graph Expectations
E[#edges] = p·C(n, 2) and E[deg(v)] = p(n − 1) in G(n, p).
Explanation: Each possible edge appears independently with probability p, so the expected number of edges is p times the number of pairs. Each vertex has n − 1 possible incident edges.
Expected Fixed Points in a Random Permutation
E[#fixed points] = n · (1/n) = 1.
Explanation: Each position matches its index with probability 1/n. Summing over all positions yields an expected value of 1, independent of n.
Coupon Collector Expectation
E[T] = Σ_{t=0}^{n−1} n/(n − t) = n·H_n ≈ n ln n.
Explanation: The waiting time to get a new coupon when t are collected has expectation n/(n − t). Summing these stage expectations gives n·H_n.
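A direct computation of the coupon collector expectation, matching the formula above (a sketch; the input handling mirrors the code examples below):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Coupon collector: sum the expected waits of the n stages.
// With t distinct coupons held, the next new coupon is a geometric
// variable with success probability (n - t)/n, so its mean is n/(n - t).
int main() {
    long long n;
    if (!(cin >> n)) return 0;

    long double total = 0.0L;
    for (long long t = 0; t < n; ++t)
        total += (long double)n / (n - t);  // stage t contributes n/(n-t)

    cout.setf(ios::fixed); cout << setprecision(10);
    cout << total << "\n";  // equals n * H_n ~ n ln n
    return 0;
}
```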
Complexity Analysis
All closed-form expectations in this section (inversions, distinct labels, random-graph edges and degrees) evaluate in O(1) time and space once n, m, and p are known. The harmonic number H_n and the coupon collector sum are computed by direct summation in O(n) time and O(1) extra space; using the approximation H_n ≈ ln n + γ reduces this to O(1) at the cost of a small error.
Code Examples
```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n;
    if (!(cin >> n)) return 0;

    // Expected inversions = C(n,2) * 1/2 = n(n-1)/4
    // Use long double to avoid integer division and preserve precision for large n
    long double ans = (static_cast<long double>(n) * (n - 1)) / 4.0L;

    cout.setf(ios::fixed); cout << setprecision(10);
    cout << ans << "\n";

    // Explanation:
    // Define indicator I_{i,j} = 1 if pair (i,j) is an inversion (i<j, a[i]>a[j]).
    // P(I_{i,j}=1) = 1/2 by symmetry. E[sum I_{i,j}] = sum E[I_{i,j}] = C(n,2) * 1/2.

    return 0;
}
```
Uses the closed-form expression n(n−1)/4 obtained by summing the 1/2 probability over all C(n,2) pairs. No independence assumptions are needed.
```cpp
#include <bits/stdc++.h>
using namespace std;

// Computes H_n = 1 + 1/2 + ... + 1/n
int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n;
    if (!(cin >> n)) return 0;

    long double H = 0.0L;
    for (long long k = 1; k <= n; ++k) H += 1.0L / (long double)k; // O(n) summation

    cout.setf(ios::fixed); cout << setprecision(12);
    cout << H << "\n";

    // Facts used:
    // - Expected number of k-cycles in a random permutation is 1/k.
    // - Summing over k=1..n gives E[#cycles] = H_n.

    return 0;
}
```
Sums the harmonic series to compute the expected number of cycles. This comes from linearity by counting expected k-cycles as 1/k and summing over k.
```cpp
#include <bits/stdc++.h>
using namespace std;

// E[distinct] = n * (1 - (1 - 1/n)^m)
// Uses a numerically stable computation with log1p/expl.
int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long double n; long double m; // allow large values via long double
    if (!(cin >> n >> m)) return 0;

    if (n <= 0 || m < 0) { cout << 0 << "\n"; return 0; }

    // Compute (1 - 1/n)^m as exp(m * log(1 - 1/n)) using log1p for stability.
    long double p = -1.0L / n;                  // so that 1 + p = 1 - 1/n
    long double log_term = log1pl(p);           // log(1 - 1/n)
    long double pow_term = expl(m * log_term);  // (1 - 1/n)^m
    long double expected_distinct = n * (1.0L - pow_term);

    cout.setf(ios::fixed); cout << setprecision(12);
    cout << expected_distinct << "\n";

    // Indicator proof:
    // For each label i, let I_i = 1 if label i appears at least once.
    // P(I_i=1) = 1 - (1 - 1/n)^m. E[sum I_i] = sum E[I_i] = n * P(I_i=1).

    return 0;
}
```
Applies linearity with indicators for each label appearing. Uses log1p/expl to compute (1 − 1/n)^m stably for large n and m.
```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n; long double p;
    if (!(cin >> n >> p)) return 0;
    // Clamp p into [0,1] defensively
    p = max((long double)0, min((long double)1, p));

    long double choose2 = (long double)n * (n - 1) / 2.0L; // C(n,2)
    long double expected_edges = p * choose2;
    long double expected_degree = p * (n - 1);

    cout.setf(ios::fixed); cout << setprecision(10);
    cout << expected_edges << "\n" << expected_degree << "\n";

    // By linearity:
    // - Each potential edge contributes an indicator with E = p.
    // - Sum over all C(n,2) edges for total expected edges.
    // - For a fixed vertex, sum over n-1 incident edges to get expected degree p(n-1).

    return 0;
}
```
Uses linearity over all potential edges (and incident edges for a vertex). No independence beyond per-edge probability is necessary to compute the mean.
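One more example in the same spirit (a simulation sketch with arbitrary constants, added for illustration): verifying that the expected number of fixed points of a random permutation is 1, independent of n.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Verify E[#fixed points] = 1 by simulation. Each position i is a fixed
// point with probability 1/n; by linearity the n indicators sum to 1.
int main() {
    const int n = 20;
    const int trials = 200000;
    mt19937 rng(42);

    vector<int> p(n);
    iota(p.begin(), p.end(), 0);

    long long fixedTotal = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(p.begin(), p.end(), rng);
        for (int i = 0; i < n; ++i) fixedTotal += (p[i] == i);
    }
    cout << "empirical E[#fixed points] = "
         << (double)fixedTotal / trials << "\n";  // ~1 regardless of n
    return 0;
}
```

Each position contributes an indicator with expectation 1/n, so the n contributions sum to exactly 1; the empirical average should approach this for any n.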