
Linearity of Expectation Applications

Key Points

  • Linearity of expectation says the expected value of a sum equals the sum of expected values, even if the variables are dependent.
  • It lets you compute expectations by breaking problems into simple indicator variables and summing their probabilities.
  • Expected inversions in a random permutation of n elements is C(n, 2)/2 = n(n−1)/4, derived by summing the probability 1/2 over all pairs.
  • The expected number of cycles in a random permutation is the harmonic number H_n = 1 + 1/2 + ... + 1/n.
  • You do not need independence to use linearity of expectation; dependence only matters for variance and higher moments.
  • In random graphs G(n, p), the expected number of edges is p·C(n, 2) and the expected degree of a vertex is p(n − 1).
  • For sampling with replacement from n labels m times, the expected number of distinct labels seen is n(1 − (1 − 1/n)^m).

Prerequisites

  • Basic probability and expectation: You need to know what an expected value is and how probabilities of simple events are computed.
  • Indicator random variables: Linearity is most easily applied by decomposing counts into sums of 0-1 indicators.
  • Combinatorics (factorials and binomial coefficients): Counting arguments, such as the expected number of k-cycles being 1/k, rely on combinations and permutations.
  • Permutations and cycle decompositions: Understanding inversions, cycles, and fixed points requires basic permutation structure.
  • Harmonic numbers and logarithms: Expected cycles and coupon collector results use H_n and its approximations.
  • Floating-point arithmetic in C++: Expectations often yield non-integers; avoiding integer division and ensuring numerical stability is key.
  • Asymptotic analysis: Understanding O(1) and O(n) complexity and approximations like H_n ≈ ln n + γ helps with performance reasoning.

Detailed Explanation


01 Overview

Many probability problems look tangled because events depend on each other. Yet, strangely, you can often get the right answer by just adding up simple pieces. This is the power of linearity of expectation: the expectation of a sum is the sum of expectations, regardless of dependence. Instead of wrestling with complicated joint distributions, we design small random variables that each capture whether a tiny thing happens (like whether a particular pair is an inversion) and add their expectations. For example, in a random permutation of n numbers, whether pair (i, j) is inverted can depend on many other positions, but its probability is exactly 1/2 by symmetry. Summing 1/2 over all C(n, 2) pairs gives C(n, 2)/2 = n(n−1)/4 expected inversions. This principle repeats: count cycles in permutations, edges in random graphs, or unique coupons collected; break the quantity into a sum of indicators and sum their probabilities. This direct route bypasses heavy probability machinery (independence, inclusion-exclusion, or conditioning trees) and yields clean, exact expectations that are easy to compute and code efficiently.
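
To see the symmetry claim in action, here is a minimal Monte Carlo sketch (the pair indices, size, and trial count are arbitrary illustrations, not from the original): shuffle repeatedly and record how often one fixed pair is inverted; the frequency should hover near 1/2.

#include <bits/stdc++.h>
using namespace std;

// Empirically estimates P(pair (i, j) is inverted) in a uniform random
// permutation; the estimate should be close to 0.5 for any i < j.
int main() {
    const int n = 10, i = 2, j = 7;   // arbitrary size and pair for illustration
    const int trials = 200000;        // arbitrary trial count
    mt19937 rng(12345);
    vector<int> a(n);
    iota(a.begin(), a.end(), 0);
    int inverted = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(a.begin(), a.end(), rng);
        if (a[i] > a[j]) ++inverted;  // indicator I_{i,j} for this trial
    }
    cout << "estimated P(inversion) = "
         << (double)inverted / trials << " (exact: 0.5)\n";
    return 0;
}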

02 Intuition & Analogies

Imagine you’re tallying points from many mini-games. Each mini-game either gives you 1 point or 0 points (win or lose), and the total score is the sum. Even if the mini-games influence each other (maybe winning one changes your mood), your average total score is still the sum of your average points from each mini-game. That’s linearity of expectation. In probability terms, we often turn a complicated count into a sum of simple on/off indicators: does this pair form an inversion? does this potential edge at a vertex appear? does this label show up at least once? Each small question has a straightforward probability answer, and adding them gives the overall expectation. Think of counting cycles in a permutation: it seems global and complex. But you can tag a special representative for each cycle (e.g., its smallest element). The event “element i is a cycle representative” has a simple probability, and there is exactly one representative per cycle, so summing those probabilities over all elements gives the expected number of cycles. Another analogy: checking attendance for many students across days. Your expected daily attendance is just the sum of each student’s probability of showing up; you don’t need to know whether students coordinate. Linearity is that bookkeeping rule for expectation: add first, and never worry about dependence (for the mean).
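
As a concrete sketch of the representative trick (all parameters are illustrative assumptions): scanning positions in increasing order, the first unseen element of each cycle is its minimum, i.e. the cycle's unique representative, so counting such starts counts cycles, and the empirical average should approach H_n.

#include <bits/stdc++.h>
using namespace std;

// Empirical check of E[#cycles] = H_n. Scanning i = 0..n-1 in increasing
// order, the first unseen element of each cycle is its minimum, so counting
// unseen starts counts exactly one representative per cycle.
int main() {
    const int n = 12, trials = 100000;  // illustrative parameters
    mt19937 rng(42);
    vector<int> p(n);
    iota(p.begin(), p.end(), 0);
    long long totalCycles = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(p.begin(), p.end(), rng);
        vector<bool> seen(n, false);
        for (int i = 0; i < n; ++i) {
            if (seen[i]) continue;
            ++totalCycles;                              // i represents a new cycle
            for (int j = i; !seen[j]; j = p[j]) seen[j] = true;
        }
    }
    long double H = 0.0L;
    for (int k = 1; k <= n; ++k) H += 1.0L / k;
    cout << "average cycles = " << (long double)totalCycles / trials
         << ", H_n = " << H << "\n";
    return 0;
}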

03 Formal Definition

Let \(X_1, X_2, \dots, X_n\) be random variables defined on the same probability space, and let \(a_1, a_2, \dots, a_n\) be real constants. Linearity of expectation states that the expected value satisfies \(\mathbb{E}\big[\sum_{i=1}^{n} a_i X_i\big] = \sum_{i=1}^{n} a_i \mathbb{E}[X_i]\). No assumptions about independence are required. A particularly powerful special case uses indicator random variables. For an event A, define \(I_A = 1\) if A occurs and 0 otherwise. Then \(\mathbb{E}[I_A] = \mathbb{P}(A)\). If a random count X equals the number of events \(A_1, \dots, A_n\) that occur (possibly dependent), then \(X = \sum_{i=1}^{n} I_{A_i}\) and \(\mathbb{E}[X] = \sum_{i=1}^{n} \mathbb{E}[I_{A_i}] = \sum_{i=1}^{n} \mathbb{P}(A_i)\). Applications include: inversions in permutations by summing over pairs (each inverted with probability 1/2); cycles in permutations by summing over cycle indicators (yielding the harmonic number \(H_n\)); edges in random graphs G(n, p) by summing over potential edges (each appears with probability p); and distinct labels in sampling by summing indicators that each label appears at least once.

04 When to Use

Use linearity of expectation whenever the quantity you want is a sum over simpler parts. It shines when the parts are easy to analyze individually but messily dependent jointly. Typical use cases: (1) Random permutations: counting inversions, fixed points, cycles, records, or descents by summing over positions or pairs with symmetry-based probabilities. (2) Random graphs G(n, p): expected number of edges, triangles, degrees, or other small subgraphs by summing indicators over all candidate substructures. (3) Sampling problems: expected number of distinct elements after m draws from n categories; coupon collector’s waiting time via summing stages; expected collisions in hashing. (4) String/array randomness: expected number of matches between two random strings, or expected number of times a pattern appears in a random text. (5) Algorithm analysis: expected comparisons or swaps, expected bucket sizes in hashing, or expected load in balanced allocations. If a target X can be decomposed into \(X = \sum_i X_i\) and each \(\mathbb{E}[X_i]\) is tractable (often via simple probability or symmetry), linearity will likely deliver an exact and clean expression for \(\mathbb{E}[X]\).
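
As one worked instance of case (2), the following sketch (the values of n and p are arbitrary illustrations) computes the expected number of triangles in G(n, p): each of the C(n, 3) vertex triples forms a triangle with probability p^3, so linearity gives C(n, 3)·p^3.

#include <bits/stdc++.h>
using namespace std;

// Expected triangles in G(n, p): sum an indicator over all C(n,3) triples;
// each triple is a triangle with probability p^3 (its three edges present).
int main() {
    long long n = 100;        // illustrative values
    long double p = 0.05L;
    long double choose3 = (long double)n * (n - 1) * (n - 2) / 6.0L; // C(n,3)
    long double expected_triangles = choose3 * p * p * p;
    cout << fixed << setprecision(10) << expected_triangles << "\n";
    return 0;
}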

⚠️ Common Mistakes

  • Assuming independence is required: Linearity holds without independence. Don’t waste time proving independence among indicators; it’s unnecessary for expectations. Independence matters for variance or concentration, not for the mean.
  • Wrong or overlapping decomposition: Ensure your count X really equals the sum of the defined indicators. Avoid double-counting (e.g., counting an edge twice) or missing cases. Define \(I_i\) carefully so that \(X = \sum_i I_i\) holds identically.
  • Miscomputing single-event probabilities: The bottleneck is usually computing \(\mathbb{P}(A_i)\). For inversions, the probability is 1/2 by symmetry, not 1/(j − i + 1). For cycles, use standard results (the expected number of k-cycles is 1/k) or a careful counting argument.
  • Mixing expectation with probability of nonzero: \(\mathbb{E}[I_A] = \mathbb{P}(A)\), but \(\mathbb{E}[X]\) is not the probability that X > 0.
  • Numerical pitfalls in code: Using integer division where floating-point is needed (e.g., n*(n-1)/4 truncates if all operands are integers; see the sketch after this list). Prefer long double and set the output precision. For expressions like (1 − 1/n)^m, avoid catastrophic cancellation for large n; use log1p/expm1 for stability.
  • Overcomplicating with inclusion–exclusion: For expectations, inclusion–exclusion is often unnecessary; summing probabilities of indicators is simpler and exact.
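
A minimal illustration of the integer-division pitfall above (the value of n is an arbitrary choice): for n = 6, the true expectation n(n−1)/4 is 7.5, but all-integer arithmetic silently truncates it to 7.

#include <bits/stdc++.h>
using namespace std;

// Demonstrates integer truncation vs. correct floating-point evaluation
// of the expected-inversion formula n(n-1)/4.
int main() {
    long long n = 6;                                      // n(n-1) = 30, not divisible by 4
    long long truncated = n * (n - 1) / 4;                // integer division: 7
    long double exact = (long double)n * (n - 1) / 4.0L;  // correct: 7.5
    cout << "truncated = " << truncated << "\n";
    cout << fixed << setprecision(1) << "exact = " << exact << "\n";
    return 0;
}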

Key Formulas

Linearity of Expectation

\mathbb{E}\Big[\sum_{i=1}^{n} a_i X_i\Big] = \sum_{i=1}^{n} a_i\,\mathbb{E}[X_i]

Explanation: The expectation of a linear combination equals the linear combination of expectations. This holds without any independence assumptions.

Indicator Expectation

\mathbb{E}[I_A] = \mathbb{P}(A)

Explanation: The expected value of a 0-1 indicator of event A equals the probability that A occurs. This is the core bridge from probabilities to expectations.

Expected Inversions in a Random Permutation

\mathbb{E}[\text{inversions}] = \binom{n}{2}\cdot\frac{1}{2} = \frac{n(n-1)}{4}

Explanation: Each pair is inverted with probability 1/2 by symmetry, and there are C(n,2) pairs. Summing gives n(n−1)/4 expected inversions.

Expected k-cycles and Total Cycles

\mathbb{E}[X_k] = \frac{1}{k},\quad \mathbb{E}[\text{\#cycles}] = \sum_{k=1}^{n} \mathbb{E}[X_k] = H_n

Explanation: The expected number of k-cycles in a random permutation is 1/k. Summing over k yields the harmonic number as the expected total cycles.

Harmonic Number Approximation

H_n = \ln n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + O\!\left(\frac{1}{n^4}\right)

Explanation: H_n grows like the natural log of n plus the Euler–Mascheroni constant γ ≈ 0.5772. The correction terms improve accuracy for finite n.
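
A small sketch (the value of n is an arbitrary choice) comparing the direct O(n) sum with this approximation:

#include <bits/stdc++.h>
using namespace std;

// Compares the exact harmonic sum H_n with its asymptotic approximation.
int main() {
    const long long n = 1000000;     // illustrative n
    long double H = 0.0L;
    for (long long k = 1; k <= n; ++k) H += 1.0L / k;     // exact O(n) sum

    const long double gamma_c = 0.57721566490153286060L;  // Euler–Mascheroni constant
    long double approx = logl((long double)n) + gamma_c
                       + 1.0L / (2.0L * n) - 1.0L / (12.0L * n * n);

    cout << fixed << setprecision(15)
         << "sum    = " << H << "\n"
         << "approx = " << approx << "\n";
    return 0;
}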

Expected Distinct Elements in m Draws

\mathbb{E}[\text{distinct}] = n\left(1 - \left(1 - \tfrac{1}{n}\right)^{m}\right)

Explanation: For each label, the probability it appears at least once is 1 − (1 − 1/n)^m. Summing these n identical probabilities gives the expectation.

Random Graph Expectations

\mathbb{E}[\text{edges}] = p\binom{n}{2},\qquad \mathbb{E}[\deg v] = p(n-1)

Explanation: Each possible edge appears independently with probability p, so the expected number of edges is p times the number of pairs. Each vertex has n−1 possible incident edges.

Expected Fixed Points in a Random Permutation

\mathbb{E}[\text{fixed points}] = \sum_{i=1}^{n}\mathbb{P}(\pi(i)=i) = n\cdot\frac{1}{n} = 1

Explanation: Each position matches its index with probability 1/n. Summing over all positions yields an expected value of 1.
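
A quick empirical check of this formula, as a hedged sketch with arbitrary parameters:

#include <bits/stdc++.h>
using namespace std;

// Estimates the expected number of fixed points of a random permutation;
// the average should approach 1 regardless of n.
int main() {
    const int n = 20, trials = 200000;  // illustrative parameters
    mt19937 rng(7);
    vector<int> p(n);
    iota(p.begin(), p.end(), 0);
    long long total = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(p.begin(), p.end(), rng);
        for (int i = 0; i < n; ++i)
            if (p[i] == i) ++total;     // indicator: position i is fixed
    }
    cout << "average fixed points = " << (double)total / trials
         << " (exact: 1)\n";
    return 0;
}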

Coupon Collector Expectation

\mathbb{E}[T] = \sum_{t=0}^{n-1}\frac{n}{n-t} = n H_n

Explanation: The waiting time to get a new coupon when t are already collected has expectation n/(n − t). Summing these stage expectations gives n·H_n ≈ n ln n.
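
A minimal sketch that computes this expectation directly from the stage decomposition (the value of n is illustrative):

#include <bits/stdc++.h>
using namespace std;

// Expected draws to collect all n coupons: by linearity over the stages,
// the answer is the sum of stage expectations n/(n-t), which equals n * H_n.
int main() {
    long long n = 50;                          // illustrative n
    long double expected = 0.0L;
    for (long long t = 0; t < n; ++t)
        expected += (long double)n / (n - t);  // stage: t -> t+1 distinct coupons
    cout << fixed << setprecision(6) << expected << "\n"; // equals n * H_n
    return 0;
}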

Complexity Analysis

Computing expectations via linearity typically reduces to summing simple terms or evaluating closed-form expressions, yielding very efficient algorithms. For expected inversions in a random permutation, the formula n(n−1)/4 evaluates in O(1) time and O(1) space. For the expected number of cycles, directly summing the harmonic series H_n = 1/1 + 1/2 + ... + 1/n takes O(n) time and O(1) space; for very large n you can use the asymptotic approximation H_n ≈ ln n + γ (constant time) with small relative error. The expected number of distinct elements after m draws uses a constant-time expression n(1 − (1 − 1/n)^m); in code, the pow/log1p/expm1 functions are O(1) and numerically stable. In random graphs G(n, p), expected edges p·C(n,2) and expected degree p(n−1) are both O(1) computations. Memory usage in all these routines is O(1), since we only keep a few floating-point accumulators. If you verify results empirically by Monte Carlo simulation, the time becomes O(T·C), where T is the number of trials and C the per-trial cost (e.g., O(n log n) to count inversions after shuffling); a sketch of this route follows below. However, with linearity-derived formulas you avoid simulation entirely and get exact expectations in constant or linear time. For competitive programming constraints (n up to 10^7 for harmonic summation, or up to 10^12 for closed forms), long double is recommended, and careful use of numerically stable functions prevents catastrophic cancellation when n is large and probabilities are small.
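
For completeness, here is a hedged sketch of that simulation route (n and the trial count are arbitrary): count inversions with an O(n log n) merge sort per trial and compare the empirical mean with n(n−1)/4.

#include <bits/stdc++.h>
using namespace std;

// Counts inversions of a[lo..hi) in O(n log n) via merge sort.
long long countInversions(vector<int>& a, vector<int>& buf, int lo, int hi) {
    if (hi - lo <= 1) return 0;
    int mid = (lo + hi) / 2;
    long long inv = countInversions(a, buf, lo, mid)
                  + countInversions(a, buf, mid, hi);
    int i = lo, j = mid, k = lo;
    while (i < mid || j < hi) {
        if (j >= hi || (i < mid && a[i] <= a[j])) buf[k++] = a[i++];
        else { inv += mid - i; buf[k++] = a[j++]; } // a[i..mid) all exceed a[j]
    }
    copy(buf.begin() + lo, buf.begin() + hi, a.begin() + lo);
    return inv;
}

int main() {
    const int n = 1000, trials = 2000;   // illustrative parameters
    mt19937 rng(2024);
    vector<int> a(n), buf(n);
    long double total = 0.0L;
    for (int t = 0; t < trials; ++t) {
        iota(a.begin(), a.end(), 0);
        shuffle(a.begin(), a.end(), rng);
        total += countInversions(a, buf, 0, n);
    }
    long double expected = (long double)n * (n - 1) / 4.0L;
    cout << fixed << setprecision(3)
         << "empirical mean = " << total / trials << "\n"
         << "exact          = " << expected << "\n";
    return 0;
}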

Code Examples

Expected inversions in a random permutation
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n;
    if (!(cin >> n)) return 0;

    // Expected inversions = C(n,2) * 1/2 = n(n-1)/4.
    // Use long double to avoid integer division and preserve precision for large n.
    long double ans = (static_cast<long double>(n) * (n - 1)) / 4.0L;

    cout.setf(ios::fixed); cout << setprecision(10);
    cout << ans << "\n";

    // Explanation:
    // Define indicator I_{i,j} = 1 if pair (i,j) is an inversion (i<j, a[i]>a[j]).
    // P(I_{i,j}=1) = 1/2 by symmetry. E[sum I_{i,j}] = sum E[I_{i,j}] = C(n,2) * 1/2.

    return 0;
}

Uses the closed-form expression n(n−1)/4 obtained by summing the 1/2 probability over all C(n,2) pairs. No independence assumptions are needed.

Time: O(1) · Space: O(1)
Expected number of cycles in a random permutation (Harmonic number)
#include <bits/stdc++.h>
using namespace std;

// Computes H_n = 1 + 1/2 + ... + 1/n
int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n;
    if (!(cin >> n)) return 0;

    long double H = 0.0L;
    for (long long k = 1; k <= n; ++k) H += 1.0L / (long double)k; // O(n) summation

    cout.setf(ios::fixed); cout << setprecision(12);
    cout << H << "\n";

    // Facts used:
    // - Expected number of k-cycles in a random permutation is 1/k.
    // - Summing over k=1..n gives E[#cycles] = H_n.

    return 0;
}

Sums the harmonic series to compute the expected number of cycles. This comes from linearity by counting expected k-cycles as 1/k and summing over k.

Time: O(n) · Space: O(1)
Expected number of distinct elements after m draws with replacement from n labels
#include <bits/stdc++.h>
using namespace std;

// E[distinct] = n * (1 - (1 - 1/n)^m)
// Uses a numerically stable computation with log1p/expl.
int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long double n, m; // allow large values via long double
    if (!(cin >> n >> m)) return 0;

    // m <= 0 means no draws (answer 0); it also guards the n = 1 case,
    // where log1pl(-1) = -inf and 0 * -inf would produce NaN below.
    if (n <= 0 || m <= 0) { cout << 0 << "\n"; return 0; }

    // Compute (1 - 1/n)^m as exp(m * log(1 - 1/n)) using log1p for stability.
    long double p = -1.0L / n;                 // so that 1 + p = 1 - 1/n
    long double log_term = log1pl(p);          // log(1 - 1/n)
    long double pow_term = expl(m * log_term); // (1 - 1/n)^m
    long double expected_distinct = n * (1.0L - pow_term);

    cout.setf(ios::fixed); cout << setprecision(12);
    cout << expected_distinct << "\n";

    // Indicator proof:
    // For each label i, let I_i = 1 if label i appears at least once.
    // P(I_i=1) = 1 - (1 - 1/n)^m. E[sum I_i] = sum E[I_i] = n * P(I_i=1).

    return 0;
}

Applies linearity with indicators for each label appearing. Uses log1p/expl to compute (1 − 1/n)^m stably for large n and m.

Time: O(1) · Space: O(1)
Random graph G(n, p): expected edges and expected degree
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long n; long double p;
    if (!(cin >> n >> p)) return 0;
    // Clamp p into [0,1] defensively.
    p = max((long double)0, min((long double)1, p));

    long double choose2 = (long double)n * (n - 1) / 2.0L; // C(n,2)
    long double expected_edges = p * choose2;
    long double expected_degree = p * (n - 1);

    cout.setf(ios::fixed); cout << setprecision(10);
    cout << expected_edges << "\n" << expected_degree << "\n";

    // By linearity:
    // - Each potential edge contributes an indicator with E = p.
    // - Sum over all C(n,2) edges for total expected edges.
    // - For a fixed vertex, sum over n-1 incident edges to get expected degree p(n-1).

    return 0;
}

Uses linearity over all potential edges (and incident edges for a vertex). No independence beyond per-edge probability is necessary to compute the mean.

Time: O(1) · Space: O(1)