⚙️ Algorithm · Intermediate

Double Counting

Key Points

  • Double counting is the strategy of counting the same quantity in two different ways to derive an equality or an efficient algorithm.
  • It turns nested sums into reordered sums, often reducing O(n^2) loops to O(n) or O(n log n) by switching the counting perspective.
  • Classic examples include the Handshake Lemma and counting edges in bipartite graphs via the sum of degrees on each side.
  • In competitive programming, double counting appears as the “contribution technique,” where the answer is obtained by summing each element’s contribution across all structures (pairs, subarrays) it belongs to.
  • Indicator variables plus linearity of expectation are a probabilistic form of double counting used to compute expected values cleanly.
  • When summing over pairs or triples, try reversing the order: count by items versus by values, or by left endpoints versus right endpoints.
  • Be careful about overcounting (e.g., i = j, ordered vs unordered pairs) and tie-breaking (strict vs non-strict inequalities).
  • Switching perspectives preserves correctness but can drastically change runtime and memory, enabling near-linear or harmonic-series time bounds.

Prerequisites

  • Big-O notation: to understand how switching perspectives changes runtime, e.g., from O(n^2) to O(n log n).
  • Basic graph theory: degrees, edges, and bipartite graphs are classic double-counting grounds.
  • Prefix sums and counting arrays: implement per-value counting and contribution sweeps efficiently.
  • Stacks (monotonic stacks): needed for contribution-based subarray computations in O(n).
  • Number theory basics: divisibility, multiples, and harmonic bounds appear in value-based counting.
  • Discrete probability: indicator variables and linearity of expectation form a probabilistic double-counting tool.

Detailed Explanation


01 Overview

Double counting is a problem-solving technique where we compute the same total quantity in two different ways and equate the results. In proofs, this gives identities like the Handshake Lemma: the sum of degrees in an undirected graph equals twice the number of edges. In algorithms, the same idea reduces complexity by switching viewpoints—for example, instead of “for each i, count j,” we do “for each j, count i,” or we count per-value rather than per-index. This often transforms explicit nested loops into structured loops over values, buckets, or boundaries. The method is especially powerful when a quantity is symmetric or can be indexed multiple ways (left/right partitions, endpoints vs elements, items vs categories). In competitive programming, the “contribution technique” is a double-counting mindset: sum the contribution of each element to the final answer rather than enumerating all composite structures. This yields efficient solutions for counting pairs, computing sums over all subarrays, and analyzing expected values.

02 Intuition & Analogies

Imagine counting handshakes at a party. You could ask every person how many hands they shook and add it all up. Or you could simply count each handshake event directly. Both approaches must give the same total because they are different perspectives of the same thing. This is double counting: two viewpoints, one truth. Another analogy: calculating how many stickers are stuck in a sticker book. You can count per page (sum stickers on each page), or per sticker type (sum how many pages contain each type). The total stickers are the same either way, but one method may be faster depending on what information you have. In code, we mimic this by switching loops. If your original algorithm says “for each index i, inspect all j,” you might flip it to “for each value v, process all indices i that carry v.” If there are far fewer distinct values than indices, this re-indexing slashes runtime. For subarrays, rather than enumerating all O(n^2) subarrays and finding their minima, count for each element how many subarrays it is the minimum of, then multiply by its value. Each element is considered once, but its influence across many subarrays is captured through clever boundaries found with a stack. The heartbeat of double counting is: find a conserved total, choose two natural indexings, then pick the one that’s algorithmically cheaper.
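
To make the loop-flip concrete, here is a minimal C++ sketch (the small array is hypothetical) that counts pairs of equal elements two ways: once over index pairs and once per value. Both totals agree, but the per-value view does the counting in linear time.

#include <bits/stdc++.h>
using namespace std;

// Sketch of the "flip the loop" idea: count pairs (i, j) with i < j and a[i] == a[j].
// Perspective 1 loops over index pairs (O(n^2)); perspective 2 groups indices by value
// and counts pairs inside each group (near O(n) with a hash map).
int main() {
    vector<int> a = {3, 1, 3, 2, 1, 3}; // hypothetical example data

    // Perspective 1: per index pair.
    long long slow = 0;
    for (size_t i = 0; i < a.size(); ++i)
        for (size_t j = i + 1; j < a.size(); ++j)
            if (a[i] == a[j]) ++slow;

    // Perspective 2: per value. A value with frequency f owns f * (f - 1) / 2 pairs.
    unordered_map<int, long long> freq;
    for (int x : a) ++freq[x];
    long long fast = 0;
    for (const auto& kv : freq) fast += kv.second * (kv.second - 1) / 2;

    cout << slow << " " << fast << "\n"; // both print 4
    return 0;
}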

03 Formal Definition

Let S be a finite set of “events” we want to count (e.g., pairs (i,j) satisfying a property, all handshakes, all subarrays where an element is a minimum). Suppose there are two index families A and B such that each event s ∈ S is assigned to exactly one index a ∈ A and exactly one index b ∈ B. Then |S| = Σ_{a∈A} c(a) = Σ_{b∈B} d(b), where c(a) and d(b) count how many events relate to a and b, respectively. In algorithmic double counting, we generalize this to weighted sums: a function f over events satisfies Σ_{a∈A} Σ_{s∈S(a)} f(s) = Σ_{b∈B} Σ_{s∈S(b)} f(s), where the families {S(a)} and {S(b)} partition S by the two different indexings. Switching the order of summation, Σ_i Σ_j f(i,j) = Σ_j Σ_i f(i,j) (for finite sums), formalizes the idea. The contribution technique is the special case where the answer equals Σ_x contribution(x), with contributions defined so each event is accounted for exactly once.
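
As a concrete instance of the definition above, here is a short worked derivation (a sketch in LaTeX notation) that takes S to be the vertex–edge incidences of an undirected graph; counting S once by edges and once by vertices gives the Handshake Lemma listed under Key Formulas.

% S collects the incidences (v, e): vertex v is an endpoint of edge e.
% Counting by edges: every edge has exactly two endpoints.
\sum_{e \in E} \#\{v : (v, e) \in S\} \;=\; \sum_{e \in E} 2 \;=\; 2|E|
% Counting by vertices: vertex v lies on deg(v) edges.
\sum_{v \in V} \#\{e : (v, e) \in S\} \;=\; \sum_{v \in V} \deg(v)
% The two counts describe the same set S, hence
\sum_{v \in V} \deg(v) \;=\; 2|E|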

04 When to Use

Use double counting when: (1) You see nested loops counting pairs/triples with a symmetric property—try reversing the order or counting by categories (values, degrees, buckets). (2) There is a conserved quantity that can be indexed in two natural ways, e.g., edges counted by left vs right partition in a bipartite graph. (3) You need an O(n) or O(n log n) solution for a problem that naively seems O(n^2), especially when data values are constrained (e.g., array values up to 10^6). (4) You compute expected values—indicator variables with linearity of expectation transform probabilistic counting into simple sums. (5) You are aggregating over all substructures (subarrays, subsequences) and can express the total as a sum of per-element contributions, with boundaries found via monotonic stacks or two-pointers. Typical use cases: counting pairs satisfying divisibility or equality, counting edges/degrees, summing subarray minima/maxima, number of inversions or pattern occurrences via contributions, and inclusion–exclusion derivations that treat overlaps carefully.
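
The pattern-pair use case can be sketched in a few lines of C++ (the input string is hypothetical): each 'b' contributes the number of 'a' characters already seen, so one sweep replaces the O(n^2) pair check.

#include <bits/stdc++.h>
using namespace std;

// Contribution sweep: count pairs (i, j) with i < j, s[i] == 'a' and s[j] == 'b'.
// Each 'b' contributes the number of 'a' characters to its left.
int main() {
    string s = "abaabb"; // hypothetical example input
    long long a_so_far = 0, pairs = 0;
    for (char c : s) {
        if (c == 'a') ++a_so_far;
        else if (c == 'b') pairs += a_so_far; // contribution of this 'b'
    }
    cout << pairs << "\n"; // prints 7 for "abaabb"
    return 0;
}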

⚠️ Common Mistakes

  • Overcounting or undercounting: forgetting whether pairs are ordered or unordered, or whether i = j is allowed. Always fix the domain (i < j vs i ≠ j vs ordered) before designing counts (see the sketch after this list).
  • Tie-breaking errors: when using contribution methods (e.g., subarray minima), choose strict vs non-strict comparisons consistently on one side to avoid double counting equal values.
  • Ignoring constraints: switching to “count by value” only helps if the value domain is manageable; otherwise, you might exceed memory/time with a huge frequency array.
  • Boundary off-by-ones: in stack-based contributions, previous/next less elements need careful indexing; L_i and R_i must reflect distances correctly.
  • Integer overflow: contribution totals (e.g., the sum over all subarrays) can exceed the 32-bit range. Use 64-bit (long long) or big integers.
  • Misapplied expectation: indicator variables require clear event definitions. Independence is not required for linearity of expectation, but you must define each indicator correctly and sum expectations, not products.
  • Falsely assuming equality: ensure the two perspectives truly count the same set with the same weights. A mismatch in what is being counted leads to incorrect equations.
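
To illustrate the first bullet, a small sketch (with a hypothetical frequency table) shows how the same data yields three different pair counts depending on whether pairs are ordered, whether i = j is allowed, or whether they are unordered; fixing the convention up front avoids silent overcounting.

#include <bits/stdc++.h>
using namespace std;

// Ordered vs unordered pairs from a frequency table. For a value with frequency f:
//   ordered pairs allowing i == j : f * f
//   ordered pairs with i != j     : f * (f - 1)
//   unordered pairs with i < j    : f * (f - 1) / 2
int main() {
    vector<int> a = {5, 5, 5, 7, 7}; // hypothetical example array
    map<int, long long> freq;
    for (int x : a) ++freq[x];

    long long ordered_with_self = 0, ordered_distinct = 0, unordered = 0;
    for (const auto& kv : freq) {
        long long f = kv.second;
        ordered_with_self += f * f;
        ordered_distinct  += f * (f - 1);
        unordered         += f * (f - 1) / 2;
    }
    // For {5, 5, 5, 7, 7} this prints 13 8 4.
    cout << ordered_with_self << " " << ordered_distinct << " " << unordered << "\n";
    return 0;
}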

Key Formulas

Handshake Lemma

Σ_{v ∈ V} deg(v) = 2|E|

Explanation: Each edge contributes 1 to the degree of its two endpoints, so the degree sum counts each edge twice. This identity often proves parity facts and enables quick edge counts.

Bipartite degree sum

Σ_{u ∈ L} deg(u) = |E| = Σ_{w ∈ R} deg(w)

Explanation: In a bipartite graph (L,R,E), every edge touches exactly one vertex in L and one in R. Counting edges by left or by right degrees yields the same total.

Order of summation (finite)

Σ_i Σ_j f(i, j) = Σ_j Σ_i f(i, j)

Explanation: For finite sums, switching the order of summation does not change the result. This is the algebraic backbone for reversing nested loops in algorithms.

Linearity via indicators

E[X] = Σ_i E[X_i] = Σ_i P(A_i), where X = Σ_i X_i

Explanation: If X = Σ_i X_i with each X_i the indicator of an event A_i, then E[X] is the sum of their expectations. This form of double counting avoids dependence issues.

Contribution technique

answer = Σ_x contribution(x)

Explanation: Instead of enumerating all composite objects, sum how much each atomic element contributes. Correct tie-breaking ensures each object is counted once.

Harmonic multiple count

Σ_{v=1}^{V} ⌊V/v⌋ ≤ V · H_V = O(V log V)

Explanation: The number of multiples across all v up to V is bounded by V times the harmonic number. It explains why looping over multiples often yields O(V log V).

Sum of subarray minima

Σ_{all subarrays} min(subarray) = Σ_i a_i · L_i · R_i

Explanation: If L_i is the number of choices for the left boundary and R_i the number of choices for the right boundary for which a_i is the minimum, then a_i contributes a_i · L_i · R_i to the total. Monotonic stacks compute L_i and R_i in O(n).

Inclusion–exclusion (two sets)

|A ∪ B| = |A| + |B| − |A ∩ B|

Explanation: Counting by sets A and B overcounts the intersection once; subtract it to fix the total. This is a simple double counting of membership.
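
As an illustration of the indicator formula above (a simulation sketch; n and the trial count are hypothetical), the expected number of fixed points of a uniformly random permutation is n · (1/n) = 1 by linearity, even though the indicators are dependent; the simulation should land close to that value.

#include <bits/stdc++.h>
using namespace std;

// Linearity via indicators: X = number of fixed points of a random permutation.
// With X_i = [p(i) == i], E[X_i] = 1/n, so E[X] = sum_i E[X_i] = 1 for every n;
// independence of the X_i is not needed.
int main() {
    const int n = 8;           // hypothetical permutation size
    const int trials = 200000; // hypothetical number of simulations
    mt19937 rng(12345);

    vector<int> p(n);
    iota(p.begin(), p.end(), 0);

    long long total_fixed = 0;
    for (int t = 0; t < trials; ++t) {
        shuffle(p.begin(), p.end(), rng);
        for (int i = 0; i < n; ++i)
            if (p[i] == i) ++total_fixed; // indicator X_i for this trial
    }

    cout << "simulated E[X] ~ " << (double)total_fixed / trials << "\n"; // close to 1
    cout << "indicator-based E[X] = 1\n";
    return 0;
}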

Complexity Analysis

Double counting itself is not an algorithm but a lens that can drastically change asymptotic costs by re-indexing the computation. The most common improvement is from O(n^2) pair enumeration to near-linear time by switching the order of summation or counting by value. For example, counting pairs (i,j) where a[i] divides a[j] is O(n^2) naively, but counting by values v and iterating over their multiples yields O(V log V) where V is the maximum value, because Σ_{v=1}^{V} V/v = O(V log V). Similarly, counting pattern pairs like (# of 'a' before each 'b') becomes O(n) by sweeping once and accumulating contributions. The contribution technique for subarray minima or maxima achieves O(n) by attributing subarrays to their defining element, with distances L_i and R_i computed using monotonic stacks. Space often reduces as well: frequency arrays use O(V) memory, and stacks use O(n). However, the choice of perspective matters: counting by value only helps when V is not too large; otherwise, an O(V) approach may be worse than an O(n^2) or O(n log n) strategy. Boundary handling affects complexity constants (strict vs non-strict comparisons determine unique ownership without backtracking). In expectation-based double counting, runtime can remain deterministic O(n) while computing probabilistic answers. Overall, double counting converts explicit enumeration to structured aggregation, commonly yielding O(n), O(n log n), or O(V log V) time with O(n) or O(V) space, provided the indexing sets are chosen to fit constraints.
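
To back the harmonic-series claim empirically, this short sketch (the bound V is hypothetical) counts the total number of inner iterations of the "for each value, for each multiple" pattern and compares it with V · ln V.

#include <bits/stdc++.h>
using namespace std;

// Total work of the "loop over multiples" pattern: the inner loop runs floor(V / v)
// times for each v, and sum_{v=1..V} floor(V/v) is about V * ln(V), i.e. O(V log V).
int main() {
    const int V = 1000000; // hypothetical value bound
    long long iterations = 0;
    for (int v = 1; v <= V; ++v)
        for (int m = v; m <= V; m += v)
            ++iterations;

    cout << "total inner iterations: " << iterations << "\n";                       // about 1.4e7
    cout << "V * ln(V)             : " << (long long)(V * log((double)V)) << "\n";  // about 1.38e7
    return 0;
}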

Code Examples

Handshake Lemma: counting edges by summing degrees
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m; // n = number of vertices, m = number of edges (undirected)
    if (!(cin >> n >> m)) return 0;
    vector<int> deg(n, 0);

    // Read edges and build degree counts
    for (int e = 0; e < m; ++e) {
        int u, v; cin >> u >> v; // assume 0-indexed input
        deg[u]++;
        deg[v]++;
    }

    long long sum_deg = 0;
    for (int d : deg) sum_deg += d;

    // Handshake Lemma: sum of degrees equals 2 * number of edges
    long long edges_from_degrees = sum_deg / 2; // integer division, sum_deg must be even

    cout << "Sum of degrees: " << sum_deg << "\n";
    cout << "Edges (from input m): " << m << "\n";
    cout << "Edges (by double counting degrees/2): " << edges_from_degrees << "\n";

    // Note: This demonstrates counting every edge twice (once at each endpoint).
    return 0;
}

We read an undirected graph, compute the degree of each vertex, and apply the Handshake Lemma. Each edge contributes 1 to two vertices’ degrees, so the total degree sum equals 2m. This is a direct double counting: counting edges via endpoints instead of by edges themselves.

Time: O(n + m). Space: O(n).
Counting divisible pairs by switching to value multiples (O(V log V))
#include <bits/stdc++.h>
using namespace std;

// Count ordered pairs (i, j) such that a[i] divides a[j].
// Switch perspective: for each value v, pair it with all multiples k*v.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n; cin >> n;
    vector<int> a(n);
    int maxA = 0;
    for (int i = 0; i < n; ++i) { cin >> a[i]; maxA = max(maxA, a[i]); }

    vector<long long> freq(maxA + 1, 0);
    for (int x : a) freq[x]++;

    long long ordered_pairs = 0; // counts (i, j); i and j can be equal since a[i] divides itself

    for (int v = 1; v <= maxA; ++v) {
        if (freq[v] == 0) continue;
        // For all multiples m of v, each occurrence of v can pair with each occurrence of m
        for (int m = v; m <= maxA; m += v) {
            if (freq[m] == 0) continue;
            ordered_pairs += freq[v] * freq[m];
        }
    }

    // If you want unordered pairs with i < j, remove self-pairings and divide by 2 appropriately.
    // Example for unordered distinct indices:
    // long long unordered_pairs = 0;
    // for (int v = 1; v <= maxA; ++v) {
    //     if (!freq[v]) continue;
    //     // pairs (v, v): choose 2
    //     unordered_pairs += freq[v] * (freq[v] - 1) / 2; // since v divides v
    //     // pairs (v, m) with m > v
    //     for (int m = 2 * v; m <= maxA; m += v) {
    //         if (!freq[m]) continue;
    //         unordered_pairs += freq[v] * freq[m];
    //     }
    // }

    cout << ordered_pairs << "\n";
    return 0;
}

Naively checking all n^2 pairs is slow. By double counting over values, we fix a value v and count all its matches among multiples m = k·v. The total work is proportional to the number of multiples across all v, which is O(V log V) where V is the maximum value. Adjust the counting for ordered/unordered pairs as needed.

Time: O(V log V), where V = max(a[i]). Space: O(V).
Sum of subarray minimums via contribution (monotonic stack)
#include <bits/stdc++.h>
using namespace std;

// Compute the sum of minimums of all subarrays in O(n) using contribution counting.
// Each a[i] contributes a[i] * L[i] * R[i], where:
// L[i] = number of choices for the left boundary: extend left past elements strictly
//        greater than a[i], stopping at the previous element <= a[i]
// R[i] = number of choices for the right boundary: extend right past elements >= a[i],
//        stopping at the next element strictly smaller than a[i]
// This tie-breaking (non-strict stop on the left, strict stop on the right) gives each
// subarray a unique owning minimum, so nothing is double counted.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n; cin >> n;
    vector<long long> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i];

    vector<long long> L(n), R(n);

    // Distance to the previous element <= a[i] (left boundary choices)
    {
        vector<int> st; // stack of indices with non-decreasing values
        for (int i = 0; i < n; ++i) {
            while (!st.empty() && a[st.back()] > a[i]) st.pop_back();
            int prev_leq = st.empty() ? -1 : st.back();
            L[i] = i - prev_leq; // number of choices for left boundary
            st.push_back(i);
        }
    }

    // Distance to the next element strictly smaller than a[i] (right boundary choices)
    {
        vector<int> st; // stack of indices with increasing values
        for (int i = n - 1; i >= 0; --i) {
            while (!st.empty() && a[st.back()] >= a[i]) st.pop_back();
            int next_less = st.empty() ? n : st.back();
            R[i] = next_less - i; // number of choices for right boundary
            st.push_back(i);
        }
    }

    long long ans = 0;
    for (int i = 0; i < n; ++i) {
        ans += a[i] * L[i] * R[i];
    }

    cout << ans << "\n";
    return 0;
}

Instead of enumerating all O(n^2) subarrays and finding their minima, we double count by attributing each subarray to its unique minimum. For each index i, L[i] and R[i] count how many left and right boundaries keep a[i] as the minimum, computed by monotonic stacks. Summing a[i] * L[i] * R[i] yields the total.

Time: O(n). Space: O(n).