
Linearity of Expectation Applications

Key Points

  • Linearity of expectation says the expected value of a sum equals the sum of expected values, even if the variables are dependent.
  • It lets you compute expectations by breaking problems into simple indicator variables and summing their probabilities.
  • Expected inversions in a random permutation of n elements is C(n, 2)/2 = n(n−1)/4, derived by summing the probability 1/2 over all pairs.
  • The expected number of cycles in a random permutation is the harmonic number H_n = 1 + 1/2 + ... + 1/n.
  • You do not need independence to use linearity of expectation; dependence only matters for variance and higher moments.
  • In random graphs G(n, p), the expected number of edges is p·C(n, 2) and the expected degree of a vertex is p(n − 1).
  • For sampling with replacement from n labels m times, the expected number of distinct labels seen is n(1 − (1 − 1/n)^m).

Prerequisites

  • Basic probability and expectation: You need to know what an expected value is and how probabilities of simple events are computed.
  • Indicator random variables: Linearity is most easily applied by decomposing counts into sums of 0-1 indicators.
  • Combinatorics (factorials and binomial coefficients): Counting arguments, such as the expected number of k-cycles being 1/k, rely on combinations and permutations.
  • Permutations and cycle decompositions: Understanding inversions, cycles, and fixed points requires basic permutation structure.
  • Harmonic numbers and logarithms: Expected cycles and coupon collector results use H_n and its approximations.
  • Floating-point arithmetic in C++: Expectations often yield non-integers; avoiding integer division and ensuring numerical stability is key.
  • Asymptotic analysis: Understanding O(1) and O(n) complexity and approximations like H_n ≈ ln n + γ helps with performance reasoning.

Detailed Explanation


01 Overview

Many probability problems look tangled because events depend on each other. Yet, strangely, you can often get the right answer by just adding up simple pieces. This is the power of linearity of expectation: the expectation of a sum is the sum of expectations, regardless of dependence. Instead of wrestling with complicated joint distributions, we design small random variables that each capture whether a tiny thing happens (like whether a particular pair is an inversion) and add their expectations. For example, in a random permutation of n numbers, whether pair (i, j) is inverted can depend on many other positions, but its probability is exactly 1/2 by symmetry. Summing 1/2 over all C(n, 2) pairs gives C(n, 2)/2 = n(n−1)/4 expected inversions. This principle repeats: count cycles in permutations, edges in random graphs, or unique coupons collected; break the quantity into a sum of indicators and sum their probabilities. This direct route bypasses heavy probability machinery (independence, inclusion-exclusion, or conditioning trees) and yields clean, exact expectations that are easy to compute and code efficiently.
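
To see the symmetry claim in action, here is a minimal Monte Carlo sketch (the pair indices, size, and trial count are arbitrary illustrations, not from the original): shuffle repeatedly and record how often one fixed pair is inverted; the frequency should hover near 1/2.

#include <bits/stdc++.h>
using namespace std;

// Empirically estimates P(pair (i, j) is inverted) in a uniform random
// permutation; the estimate should be close to 0.5 for any i < j.
int main() {
    const int n = 10, i = 2, j = 7;   // arbitrary size and pair for illustration
    const int trials = 200000;        // arbitrary trial count
    mt19937 rng(12345);
    vector<int> a(n);
    iota(a.begin(), a.end(), 0);
    int inverted = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(a.begin(), a.end(), rng);
        if (a[i] > a[j]) ++inverted;  // indicator I_{i,j} for this trial
    }
    cout << "estimated P(inversion) = "
         << (double)inverted / trials << " (exact: 0.5)\n";
    return 0;
}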

02 Intuition & Analogies

Imagine you’re tallying points from many mini-games. Each mini-game either gives you 1 point or 0 points (win or lose), and the total score is the sum. Even if the mini-games influence each other (maybe winning one changes your mood), your average total score is still the sum of your average points from each mini-game. That’s linearity of expectation. In probability terms, we often turn a complicated count into a sum of simple on/off indicators: does this pair form an inversion? does this potential edge at a vertex appear? does this label show up at least once? Each small question has a straightforward probability answer, and adding them gives the overall expectation. Think of counting cycles in a permutation: it seems global and complex. But you can tag a special representative for each cycle (e.g., its smallest element). The event “element i is a cycle representative” has a simple probability, and there is exactly one representative per cycle, so summing those probabilities over all elements gives the expected number of cycles. Another analogy: checking attendance for many students across days. Your expected daily attendance is just the sum of each student’s probability of showing up; you don’t need to know whether students coordinate. Linearity is that bookkeeping rule for expectation: add first, and never worry about dependence (for the mean).
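
As a concrete sketch of the representative trick (all parameters are illustrative assumptions): scanning positions in increasing order, the first unseen element of each cycle is its minimum, i.e. the cycle's unique representative, so counting such starts counts cycles, and the empirical average should approach H_n.

#include <bits/stdc++.h>
using namespace std;

// Empirical check of E[#cycles] = H_n. Scanning i = 0..n-1 in increasing
// order, the first unseen element of each cycle is its minimum, so counting
// unseen starts counts exactly one representative per cycle.
int main() {
    const int n = 12, trials = 100000;  // illustrative parameters
    mt19937 rng(42);
    vector<int> p(n);
    iota(p.begin(), p.end(), 0);
    long long totalCycles = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(p.begin(), p.end(), rng);
        vector<bool> seen(n, false);
        for (int i = 0; i < n; ++i) {
            if (seen[i]) continue;
            ++totalCycles;                              // i represents a new cycle
            for (int j = i; !seen[j]; j = p[j]) seen[j] = true;
        }
    }
    long double H = 0.0L;
    for (int k = 1; k <= n; ++k) H += 1.0L / k;
    cout << "average cycles = " << (long double)totalCycles / trials
         << ", H_n = " << H << "\n";
    return 0;
}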

03 Formal Definition

Let \(X_1, X_2, \dots, X_n\) be random variables defined on the same probability space, and let \(a_1, a_2, \dots, a_n\) be real constants. Linearity of expectation states that the expected value satisfies \(\mathbb{E}\big[\sum_{i=1}^{n} a_i X_i\big] = \sum_{i=1}^{n} a_i \mathbb{E}[X_i]\). No assumptions about independence are required. A particularly powerful special case uses indicator random variables. For an event A, define \(I_A = 1\) if A occurs and 0 otherwise. Then \(\mathbb{E}[I_A] = \mathbb{P}(A)\). If a random count X equals the number of events \(A_1, \dots, A_n\) that occur (possibly dependent), then \(X = \sum_{i=1}^{n} I_{A_i}\) and \(\mathbb{E}[X] = \sum_{i=1}^{n} \mathbb{E}[I_{A_i}] = \sum_{i=1}^{n} \mathbb{P}(A_i)\). Applications include: inversions in permutations by summing over pairs (each inverted with probability 1/2); cycles in permutations by summing over cycle indicators (yielding the harmonic number \(H_n\)); edges in random graphs G(n, p) by summing over potential edges (each appears with probability p); and distinct labels in sampling by summing indicators that each label appears at least once.

04 When to Use

Use linearity of expectation whenever the quantity you want is a sum over simpler parts. It shines when the parts are easy to analyze individually but messily dependent jointly. Typical use cases: (1) Random permutations: counting inversions, fixed points, cycles, records, or descents by summing over positions or pairs with symmetry-based probabilities. (2) Random graphs G(n, p): expected number of edges, triangles, degrees, or other small subgraphs by summing indicators over all candidate substructures. (3) Sampling problems: expected number of distinct elements after m draws from n categories; coupon collector’s waiting time via summing stages; expected collisions in hashing. (4) String/array randomness: expected number of matches between two random strings, or expected number of times a pattern appears in a random text. (5) Algorithm analysis: expected comparisons or swaps, expected bucket sizes in hashing, or expected load in balanced allocations. If a target X can be decomposed into \(X = \sum_i X_i\) and each \(\mathbb{E}[X_i]\) is tractable (often via simple probability or symmetry), linearity will likely deliver an exact and clean expression for \(\mathbb{E}[X]\).
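
As one worked instance of case (2), the following sketch (the values of n and p are arbitrary illustrations) computes the expected number of triangles in G(n, p): each of the C(n, 3) vertex triples forms a triangle with probability p^3, so linearity gives C(n, 3)·p^3.

#include <bits/stdc++.h>
using namespace std;

// Expected triangles in G(n, p): sum an indicator over all C(n,3) triples;
// each triple is a triangle with probability p^3 (its three edges present).
int main() {
    long long n = 100;        // illustrative values
    long double p = 0.05L;
    long double choose3 = (long double)n * (n - 1) * (n - 2) / 6.0L; // C(n,3)
    long double expected_triangles = choose3 * p * p * p;
    cout << fixed << setprecision(10) << expected_triangles << "\n";
    return 0;
}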

⚠️ Common Mistakes

  • Assuming independence is required: Linearity holds without independence. Don’t waste time proving independence among indicators; it’s unnecessary for expectations. Independence matters for variance or concentration, not for the mean.
  • Wrong or overlapping decomposition: Ensure your count X really equals the sum of the defined indicators. Avoid double-counting (e.g., counting an edge twice) or missing cases. Define \(I_i\) carefully so that \(X = \sum_i I_i\) holds identically.
  • Miscomputing single-event probabilities: The bottleneck is usually computing \(\mathbb{P}(A_i)\). For inversions, the probability is 1/2 by symmetry, not 1/(j − i + 1). For cycles, use standard results (the expected number of k-cycles is 1/k) or a careful counting argument.
  • Mixing expectation with probability of nonzero: \(\mathbb{E}[I_A] = \mathbb{P}(A)\), but \(\mathbb{E}[X]\) is not the probability that X > 0.
  • Numerical pitfalls in code: Using integer division where floating-point is needed (e.g., n*(n-1)/4 truncates if all operands are integers; see the sketch after this list). Prefer long double and set the output precision. For expressions like (1 − 1/n)^m, avoid catastrophic cancellation for large n; use log1p/expm1 for stability.
  • Overcomplicating with inclusion–exclusion: For expectations, inclusion–exclusion is often unnecessary; summing probabilities of indicators is simpler and exact.
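
A minimal illustration of the integer-division pitfall above (the value of n is an arbitrary choice): for n = 6, the true expectation n(n−1)/4 is 7.5, but all-integer arithmetic silently truncates it to 7.

#include <bits/stdc++.h>
using namespace std;

// Demonstrates integer truncation vs. correct floating-point evaluation
// of the expected-inversion formula n(n-1)/4.
int main() {
    long long n = 6;                                      // n(n-1) = 30, not divisible by 4
    long long truncated = n * (n - 1) / 4;                // integer division: 7
    long double exact = (long double)n * (n - 1) / 4.0L;  // correct: 7.5
    cout << "truncated = " << truncated << "\n";
    cout << fixed << setprecision(1) << "exact = " << exact << "\n";
    return 0;
}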

Key Formulas

Linearity of Expectation

\mathbb{E}\Big[\sum_{i=1}^{n} a_i X_i\Big] = \sum_{i=1}^{n} a_i\,\mathbb{E}[X_i]

Explanation: The expectation of a linear combination equals the linear combination of expectations. This holds without any independence assumptions.

Indicator Expectation

\mathbb{E}[I_A] = \mathbb{P}(A)

Explanation: The expected value of a 0-1 indicator of event A equals the probability that A occurs. This is the core bridge from probabilities to expectations.

Expected Inversions in a Random Permutation

\mathbb{E}[\text{inversions}] = \binom{n}{2}\cdot\frac{1}{2} = \frac{n(n-1)}{4}

Explanation: Each pair is inverted with probability 1/2 by symmetry, and there are C(n,2) pairs. Summing gives n(n−1)/4 expected inversions.

Expected k-cycles and Total Cycles

\mathbb{E}[X_k] = \frac{1}{k},\quad \mathbb{E}[\text{\#cycles}] = \sum_{k=1}^{n} \mathbb{E}[X_k] = H_n

Explanation: The expected number of k-cycles in a random permutation is 1/k. Summing over k yields the harmonic number as the expected total cycles.

Harmonic Number Approximation

H_n = \ln n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + O\!\left(\frac{1}{n^4}\right)

Explanation: H_n grows like the natural log of n plus the Euler–Mascheroni constant γ ≈ 0.5772. The correction terms improve accuracy for finite n.
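
A small sketch (the value of n is an arbitrary choice) comparing the direct O(n) sum with this approximation:

#include <bits/stdc++.h>
using namespace std;

// Compares the exact harmonic sum H_n with its asymptotic approximation.
int main() {
    const long long n = 1000000;     // illustrative n
    long double H = 0.0L;
    for (long long k = 1; k <= n; ++k) H += 1.0L / k;     // exact O(n) sum

    const long double gamma_c = 0.57721566490153286060L;  // Euler–Mascheroni constant
    long double approx = logl((long double)n) + gamma_c
                       + 1.0L / (2.0L * n) - 1.0L / (12.0L * n * n);

    cout << fixed << setprecision(15)
         << "sum    = " << H << "\n"
         << "approx = " << approx << "\n";
    return 0;
}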

Expected Distinct Elements in m Draws

\mathbb{E}[\text{distinct}] = n\left(1 - \left(1 - \tfrac{1}{n}\right)^{m}\right)

Explanation: For each label, the probability it appears at least once is 1 − (1 − 1/n)^m. Summing these n identical probabilities gives the expectation.

Random Graph Expectations

\mathbb{E}[\text{edges}] = p\binom{n}{2},\qquad \mathbb{E}[\deg v] = p(n-1)

Explanation: Each possible edge appears independently with probability p, so the expected number of edges is p times the number of pairs. Each vertex has n−1 possible incident edges.

Expected Fixed Points in a Random Permutation

\mathbb{E}[\text{fixed points}] = \sum_{i=1}^{n}\mathbb{P}(\pi(i)=i) = n\cdot\frac{1}{n} = 1

Explanation: Each position matches its index with probability 1/n. Summing over all positions yields an expected value of 1.
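
A quick empirical check of this formula, as a hedged sketch with arbitrary parameters:

#include <bits/stdc++.h>
using namespace std;

// Estimates the expected number of fixed points of a random permutation;
// the average should approach 1 regardless of n.
int main() {
    const int n = 20, trials = 200000;  // illustrative parameters
    mt19937 rng(7);
    vector<int> p(n);
    iota(p.begin(), p.end(), 0);
    long long total = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(p.begin(), p.end(), rng);
        for (int i = 0; i < n; ++i)
            if (p[i] == i) ++total;     // indicator: position i is fixed
    }
    cout << "average fixed points = " << (double)total / trials
         << " (exact: 1)\n";
    return 0;
}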

Coupon Collector Expectation

\mathbb{E}[T] = \sum_{t=0}^{n-1}\frac{n}{n-t} = n H_n

Explanation: The waiting time to get a new coupon when t are already collected has expectation n/(n − t). Summing these stage expectations gives n·H_n ≈ n ln n.
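
A minimal sketch that computes this expectation directly from the stage decomposition (the value of n is illustrative):

#include <bits/stdc++.h>
using namespace std;

// Expected draws to collect all n coupons: by linearity over the stages,
// the answer is the sum of stage expectations n/(n-t), which equals n * H_n.
int main() {
    long long n = 50;                          // illustrative n
    long double expected = 0.0L;
    for (long long t = 0; t < n; ++t)
        expected += (long double)n / (n - t);  // stage: t -> t+1 distinct coupons
    cout << fixed << setprecision(6) << expected << "\n"; // equals n * H_n
    return 0;
}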

Complexity Analysis

Computing expectations via linearity typically reduces to summing simple terms or evaluating closed-form expressions, yielding very efficient algorithms. For expected inversions in a random permutation, the formula n(n−1)/4 evaluates in O(1) time and O(1) space. For the expected number of cycles, directly summing the harmonic series H_n = 1/1 + 1/2 + ... + 1/n takes O(n) time and O(1) space; for very large n you can use the asymptotic approximation H_n ≈ ln n + γ (constant time) with small relative error. The expected number of distinct elements after m draws uses a constant-time expression n(1 − (1 − 1/n)^m); in code, the pow/log1p/expm1 functions are O(1) and numerically stable. In random graphs G(n, p), expected edges p·C(n,2) and expected degree p(n−1) are both O(1) computations. Memory usage in all these routines is O(1), since we only keep a few floating-point accumulators. If you verify results empirically by Monte Carlo simulation, the time becomes O(T·C), where T is the number of trials and C the per-trial cost (e.g., O(n log n) to count inversions after shuffling); a sketch of this route follows below. However, with linearity-derived formulas you avoid simulation entirely and get exact expectations in constant or linear time. For competitive programming constraints (n up to 10^7 for harmonic summation, or up to 10^12 for closed forms), long double is recommended, and careful use of numerically stable functions prevents catastrophic cancellation when n is large and probabilities are small.
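
For completeness, here is a hedged sketch of that simulation route (n and the trial count are arbitrary): count inversions with an O(n log n) merge sort per trial and compare the empirical mean with n(n−1)/4.

#include <bits/stdc++.h>
using namespace std;

// Counts inversions of a[lo..hi) in O(n log n) via merge sort.
long long countInversions(vector<int>& a, vector<int>& buf, int lo, int hi) {
    if (hi - lo <= 1) return 0;
    int mid = (lo + hi) / 2;
    long long inv = countInversions(a, buf, lo, mid)
                  + countInversions(a, buf, mid, hi);
    int i = lo, j = mid, k = lo;
    while (i < mid || j < hi) {
        if (j >= hi || (i < mid && a[i] <= a[j])) buf[k++] = a[i++];
        else { inv += mid - i; buf[k++] = a[j++]; } // a[i..mid) all exceed a[j]
    }
    copy(buf.begin() + lo, buf.begin() + hi, a.begin() + lo);
    return inv;
}

int main() {
    const int n = 1000, trials = 2000;   // illustrative parameters
    mt19937 rng(2024);
    vector<int> a(n), buf(n);
    long double total = 0.0L;
    for (int t = 0; t < trials; ++t) {
        iota(a.begin(), a.end(), 0);
        shuffle(a.begin(), a.end(), rng);
        total += countInversions(a, buf, 0, n);
    }
    long double expected = (long double)n * (n - 1) / 4.0L;
    cout << fixed << setprecision(3)
         << "empirical mean = " << total / trials << "\n"
         << "exact          = " << expected << "\n";
    return 0;
}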

Code Examples

Expected inversions in a random permutation
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n;
    if (!(cin >> n)) return 0;

    // Expected inversions = C(n,2) * 1/2 = n(n-1)/4.
    // Use long double to avoid integer division and preserve precision for large n.
    long double ans = (static_cast<long double>(n) * (n - 1)) / 4.0L;

    cout.setf(ios::fixed); cout << setprecision(10);
    cout << ans << "\n";

    // Explanation:
    // Define indicator I_{i,j} = 1 if pair (i,j) is an inversion (i<j, a[i]>a[j]).
    // P(I_{i,j}=1) = 1/2 by symmetry. E[sum I_{i,j}] = sum E[I_{i,j}] = C(n,2) * 1/2.

    return 0;
}

Uses the closed-form expression n(n−1)/4 obtained by summing the 1/2 probability over all C(n,2) pairs. No independence assumptions are needed.

Time: O(1) · Space: O(1)
Expected number of cycles in a random permutation (Harmonic number)
#include <bits/stdc++.h>
using namespace std;

// Computes H_n = 1 + 1/2 + ... + 1/n
int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n;
    if (!(cin >> n)) return 0;

    long double H = 0.0L;
    for (long long k = 1; k <= n; ++k) H += 1.0L / (long double)k; // O(n) summation

    cout.setf(ios::fixed); cout << setprecision(12);
    cout << H << "\n";

    // Facts used:
    // - Expected number of k-cycles in a random permutation is 1/k.
    // - Summing over k=1..n gives E[#cycles] = H_n.

    return 0;
}

Sums the harmonic series to compute the expected number of cycles. This comes from linearity by counting expected k-cycles as 1/k and summing over k.

Time: O(n) · Space: O(1)
Expected number of distinct elements after m draws with replacement from n labels
#include <bits/stdc++.h>
using namespace std;

// E[distinct] = n * (1 - (1 - 1/n)^m)
// Uses a numerically stable computation with log1p/expl.
int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long double n, m; // allow large values via long double
    if (!(cin >> n >> m)) return 0;

    // m <= 0 means no draws (answer 0); it also guards the n = 1 case,
    // where log1pl(-1) = -inf and 0 * -inf would produce NaN below.
    if (n <= 0 || m <= 0) { cout << 0 << "\n"; return 0; }

    // Compute (1 - 1/n)^m as exp(m * log(1 - 1/n)) using log1p for stability.
    long double p = -1.0L / n;                 // so that 1 + p = 1 - 1/n
    long double log_term = log1pl(p);          // log(1 - 1/n)
    long double pow_term = expl(m * log_term); // (1 - 1/n)^m
    long double expected_distinct = n * (1.0L - pow_term);

    cout.setf(ios::fixed); cout << setprecision(12);
    cout << expected_distinct << "\n";

    // Indicator proof:
    // For each label i, let I_i = 1 if label i appears at least once.
    // P(I_i=1) = 1 - (1 - 1/n)^m. E[sum I_i] = sum E[I_i] = n * P(I_i=1).

    return 0;
}

Applies linearity with indicators for each label appearing. Uses log1p/expl to compute (1 − 1/n)^m stably for large n and m.

Time: O(1) · Space: O(1)
Random graph G(n, p): expected edges and expected degree
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n; long double p;
    if (!(cin >> n >> p)) return 0;
    // Clamp p into [0,1] defensively.
    p = max((long double)0, min((long double)1, p));

    long double choose2 = (long double)n * (n - 1) / 2.0L; // C(n,2)
    long double expected_edges = p * choose2;
    long double expected_degree = p * (n - 1);

    cout.setf(ios::fixed); cout << setprecision(10);
    cout << expected_edges << "\n" << expected_degree << "\n";

    // By linearity:
    // - Each potential edge contributes an indicator with E = p.
    // - Sum over all C(n,2) edges for total expected edges.
    // - For a fixed vertex, sum over n-1 incident edges to get expected degree p(n-1).

    return 0;
}

Uses linearity over all potential edges (and incident edges for a vertex). No independence beyond per-edge probability is necessary to compute the mean.

Time: O(1) · Space: O(1)