Math · Intermediate

Inclusion-Exclusion Principle

Key Points

  • The Inclusion-Exclusion Principle (IEP) corrects overcounting by alternately adding and subtracting sizes of intersections of sets.
  • It is used to count elements that satisfy at least one of several properties or to count objects avoiding all forbidden properties via the complement.
  • Classic applications include counting derangements, surjections, and integers coprime to a given number.
  • Although the naive formula has up to 2^n terms, structure (symmetry, few primes, or algebraic patterns) often reduces the computation to O(n) or O(n log n).
  • In probability, IEP computes the probability of a union of events by alternating sums of intersection probabilities.
  • Using modular arithmetic and precomputed factorials lets us implement IEP efficiently for large inputs in C++.
  • A common pitfall is forgetting signs or the empty-set term; careful iteration order and sign handling prevent mistakes.
  • For number-theoretic counts, IEP connects to the Möbius function and leads to sieve-like algorithms.

Prerequisites

  • Basic set theory: Understanding unions, intersections, complements, and cardinality is essential to state and apply IEP.
  • Combinatorics fundamentals: Binomial coefficients, permutations, and factorials appear directly in inclusion-exclusion sums.
  • Modular arithmetic: Large counts require computation modulo a prime, using modular inverses and fast exponentiation.
  • Number theory basics: Prime factorization, gcd, and the Möbius function connect IEP to sieve-like counting problems.
  • Time complexity analysis: Recognizing when naive 2^n inclusion-exclusion is feasible and when structure is needed to optimize.

Detailed Explanation


01 Overview

The Inclusion-Exclusion Principle (IEP) is a powerful counting technique for determining the size of a union of sets or the number of objects that avoid a set of forbidden conditions. The basic idea is that simply summing the sizes of sets overcounts elements that appear in multiple sets; to fix this, we subtract the sizes of all pairwise intersections, but then we have subtracted elements appearing in three sets too many times, so we add back triple intersections, and so on. This alternating add–subtract process continues through all possible intersections. While the full expression appears to have an exponential number of terms, many problems exhibit special structure that lets us evaluate it efficiently, such as symmetry that makes intersection sizes depend only on how many constraints are chosen, or a small number of distinct prime factors.

IEP underlies many classic combinatorial results: the number of derangements (permutations with no fixed points), the number of surjections between finite sets, and counts of integers coprime to a fixed modulus. It also appears in probability theory to compute the probability of at least one event occurring, and in number theory via the Möbius function and sieve techniques.

In algorithmic practice, we often re-express inclusion-exclusion sums in closed or nearly closed forms with O(n) or O(n log n) complexity, leveraging precomputation (factorials, powers) and bitmask enumeration when the number of constraints is small.

02 Intuition & Analogies

Imagine surveying students about three clubs: Art (A), Biology (B), and Chess (C). If you add |A| + |B| + |C|, students in both Art and Biology are counted twice, and those in all three are counted three times. To fix this, subtract the pairwise intersections |A ∩ B| + |A ∩ C| + |B ∩ C|, which removes the double counts; but now anyone in all three clubs has been removed three times, so you must add back |A ∩ B ∩ C| once. This alternating correction perfectly counts students in at least one club. This mindset generalizes: when counting objects that violate any of several rules (e.g., a password containing a forbidden character), you can start with the total number of objects and subtract those violating each rule, then add back those violating any two rules (since they were subtracted twice), subtract those violating three, and continue. The point is that overlaps cause overcounting; alternating inclusion and exclusion precisely fixes this. A helpful picture is a balance scale: adding single sets pushes the count up, subtracting intersections brings it down, and adding triple intersections nudges it back up, converging to the exact answer. In problems with symmetry—say, every rule looks the same—the cost of computing each intersection is the same function of how many rules you picked, so the entire sum collapses into a short loop over k = 0..n with binomial coefficients. In number theory, the same idea appears through prime factors: to count numbers divisible by at least one prime from a set, we add multiples of each prime, subtract multiples of products of two primes, and so on—exactly inclusion-exclusion.

03 Formal Definition

Let \(A_1, A_2, \ldots, A_n\) be finite sets. The Inclusion-Exclusion Principle states \[ \left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i} |A_i| - \sum_{i<j} |A_i \cap A_j| + \sum_{i<j<k} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n+1} \left|A_1 \cap A_2 \cap \cdots \cap A_n\right|. \] Equivalently, for the complement of the union, if U is the universe, then \[ \left|U \setminus \bigcup_{i=1}^{n} A_i\right| = |U| - \left|\bigcup_{i=1}^{n} A_i\right|, \] so we often count the desired objects as total minus the alternating sum of intersections of the "bad" events. In probability, for events \(A_1, \ldots, A_n\), the same alternating pattern holds for probabilities, replacing cardinalities with \(P(A_{i_1} \cap \cdots \cap A_{i_k})\). When the intersections depend only on the number k of selected sets (symmetry), we can write \[ \left|\bigcup_{i=1}^{n} A_i\right| = \sum_{k=1}^{n} (-1)^{k+1} \binom{n}{k} f(k), \] where f(k) is the common size of an intersection of any k sets. Many combinatorial counts, such as derangements and surjections, arise by choosing k constraints to violate and counting objects consistent with those k violations. In number theory, inclusion-exclusion over prime factors yields sums over subsets or, more abstractly, over divisors using the Möbius function \(\mu\).

04 When to Use

Use IEP when you need to count objects that satisfy at least one of several properties, or equivalently, objects that avoid all of a set of forbidden properties. It is ideal when direct counting leads to overcounting due to overlaps. Typical scenarios include:

  • Symmetric constraints: Each constraint contributes the same type of restriction; intersection sizes depend mainly on how many constraints are chosen, not which ones. Examples: derangements (forbidden fixed points), surjections (forbidden empty images), and occupancy problems.
  • Small number of constraints: When n is small (e.g., n ≤ 20), you can enumerate all subsets with a bitmask and compute intersection sizes per subset.
  • Number theory with few prime factors: Counting integers up to N divisible by at least one of k primes; with k small, 2^k enumeration is fast.
  • Probability of unions: Computing P(at least one event occurs) when independence is absent and intersections are available or estimable.
  • Transform techniques: When a closed form exists via Möbius inversion or generating functions, converting inclusion-exclusion into O(n) or O(n log n) sums with precomputations.

Avoid IEP if the number of constraints is large and intersections lack exploitable structure; naive evaluation is exponential and may time out. In such cases, seek alternative combinatorial models, DP over states, or problem-specific decompositions.

⚠️Common Mistakes

  • Sign errors: Forgetting that the pattern alternates starting with plus for single sets leads to large errors. A robust approach is to derive the sign from parity in code, e.g., (k % 2 ? -1 : +1) for (-1)^k, and test on small cases.
  • Omitting the empty subset term: When using the complement trick, the k=0 term (which is the total) must be included. In symmetric formulas, this often appears as f(0) with coefficient 1.
  • Double counting intersections incorrectly: For asymmetric constraints, assuming all k-wise intersections have the same size can be wrong. Verify symmetry or compute intersections carefully per subset.
  • Overflow and precision: Factorials and powers grow quickly. Use 64-bit integers, modular arithmetic (e.g., 1,000,000,007), or big integers when necessary. For floor expressions like \(\lfloor N/d \rfloor\), ensure safe multiplication/division order.
  • Exponential blow-up: Applying naive 2^n inclusion-exclusion when n is large will TLE. Look for structure to collapse the sum (e.g., depend only on k), or use Möbius inversion or sieves.
  • Ignoring zero intersections: Intersections can be empty for large k; exploiting this can prune computation. In code, early-continue when a product exceeds N or when constraints become impossible.
  • Mixing sets and probabilities: Don’t substitute probabilities for counts unless you’re truly in a probability setting; normalizing by |U| is required if converting from counts to probabilities.

Key Formulas

General Inclusion-Exclusion

\[ \left|\bigcup_{i=1}^{n} A_i\right| = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \cdots < i_k \le n} \left|A_{i_1} \cap \cdots \cap A_{i_k}\right| \]

Explanation: This is the core identity: add sizes of single sets, subtract pairwise intersections, add triple intersections, and so on. It exactly counts elements in at least one set.

Probability Version of IEP

\[ P\!\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \cdots < i_k \le n} P\!\left(A_{i_1} \cap \cdots \cap A_{i_k}\right) \]

Explanation: Replace set sizes with probabilities to compute the chance that at least one event occurs. Useful when events are not independent and intersection probabilities are known.

Derangements

\[ D(n) = n! \sum_{k=0}^{n} \frac{(-1)^k}{k!} \]

Explanation: To count permutations with no fixed points, forbid each i from being fixed and apply IEP. The result expresses derangements through an alternating series involving factorials.

Number of Surjections

\[ \mathrm{Surj}(m, n) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)^m \]

Explanation: Count all functions and subtract those missing at least one of the n target elements, using symmetry to collapse intersections. This yields the number of onto functions.

Coprime Count via Möbius

\[ \#\{\, x \in [1..N] : \gcd(x, m) = 1 \,\} = \sum_{d \mid m} \mu(d) \left\lfloor \frac{N}{d} \right\rfloor \]

Explanation: The number of integers in [1..N] coprime to m equals the alternating sum over divisors weighted by the Möbius function. It is an inclusion-exclusion over prime factors compressed onto divisors.

Alternating Binomial Sum

\[ \sum_{k=0}^{n} (-1)^k \binom{n}{k} = \begin{cases} 1 & n = 0 \\ 0 & n \ge 1 \end{cases} \]

Explanation: A simple corollary of IEP or (1-1)^n. It is often used to check signs or to simplify symmetric inclusion-exclusion sums.

At-least-one Prime Divisibility

\[ \#\{\, x \le N : \exists\, p \in P,\; p \mid x \,\} = \sum_{\emptyset \ne S \subseteq P} (-1)^{|S|+1} \left\lfloor \frac{N}{\prod_{p \in S} p} \right\rfloor \]

Explanation: Counts integers up to N divisible by at least one prime from a set P. Add multiples of one prime, subtract multiples of two-prime products, etc.

Möbius Inversion on Divisors

\[ g(n) = \sum_{d \mid n} f(d) \iff f(n) = \sum_{d \mid n} \mu(d)\, g\!\left(\frac{n}{d}\right) \]

Explanation: Expresses a function summed over divisors in terms of another and inverts it using the Möbius function. This is the algebraic backbone of many inclusion-exclusion transformations.

Complexity Analysis

Naively, inclusion-exclusion over n constraints requires evaluating all 2^n intersections, which is exponential in n and infeasible for n much larger than about 22–24 in typical competitive programming limits. However, many applications exhibit symmetry, letting us collapse the sum to a single loop over k = 0..n with binomial coefficients. In such symmetric cases, the complexity is dominated by computing intersection sizes f(k). For derangements, f(k) = (n-k)! and the total cost is O(n) with precomputed factorials; space is O(n) to store factorials and inverse factorials modulo a prime. For surjections, the collapsed formula involves a loop over k = 0..n and fast exponentiation of (n-k)^m, giving O(n log m) time and O(n) space for factorials.

In number-theoretic counts based on prime factors of m, inclusion-exclusion runs over 2^ω(m) subsets, where ω(m) is the number of distinct primes dividing m. Since ω(m) is small for most inputs (≤ 9 for 32-bit m), this is efficient; runtime is O(2^ω(m)) with O(1) extra space beyond storing primes. When overflow risks exist, modular arithmetic or 128-bit intermediates are needed for safety. In all cases, careful pruning (e.g., skipping subsets whose product exceeds N) reduces constant factors. Thus, the practical complexity of IEP is problem-dependent: exponential in the worst case, but often near-linear or O(n log m) with structure and precomputation.

Code Examples

Derangements D(n) using Inclusion-Exclusion (mod 1,000,000,007)
#include <bits/stdc++.h>
using namespace std;

// Compute derangements D(n) = n! * sum_{k=0}^{n} (-1)^k / k! (mod MOD)
// We precompute factorials and inverse factorials to get 1/k! quickly.

static const long long MOD = 1'000'000'007LL;

long long mod_pow(long long a, long long e) {
    long long r = 1 % MOD;
    a %= MOD;
    while (e > 0) {
        if (e & 1) r = (r * a) % MOD;
        a = (a * a) % MOD;
        e >>= 1;
    }
    return r;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    if (!(cin >> n)) return 0;

    // Precompute factorials and inverse factorials up to n
    vector<long long> fact(n + 1), invfact(n + 1);
    fact[0] = 1;
    for (int i = 1; i <= n; ++i) fact[i] = fact[i - 1] * i % MOD;
    invfact[n] = mod_pow(fact[n], MOD - 2); // Fermat's little theorem (MOD is prime)
    for (int i = n; i >= 1; --i) invfact[i - 1] = invfact[i] * i % MOD;

    // sum_{k=0}^{n} (-1)^k / k!
    long long sum = 0;
    for (int k = 0; k <= n; ++k) {
        long long term = invfact[k];               // 1/k!
        if (k % 2 == 1) term = (MOD - term) % MOD; // multiply by (-1)^k
        sum += term;
        if (sum >= MOD) sum -= MOD;
    }

    long long Dn = fact[n] * sum % MOD;
    cout << Dn << '\n';
    return 0;
}

We apply the symmetric inclusion-exclusion for derangements: forbid k specific fixed points (choose which k, then arrange the rest), leading to \(D(n) = n! \sum_{k=0}^{n} (-1)^k/k!\). Using precomputed factorials and modular inverses, we compute the alternating series and multiply by n! under modular arithmetic.

Time: O(n), Space: O(n)
Count Surjections: number of onto functions from [m] to [n] (mod 1,000,000,007)
#include <bits/stdc++.h>
using namespace std;

// onto(m -> n) = sum_{k=0}^{n} (-1)^k * C(n, k) * (n - k)^m (mod MOD)
// We precompute factorials for binomial coefficients and use fast exponentiation for powers.

static const long long MOD = 1'000'000'007LL;

long long mod_pow(long long a, long long e) {
    long long r = 1 % MOD;
    a %= MOD;
    while (e > 0) {
        if (e & 1) r = (r * a) % MOD;
        a = (a * a) % MOD;
        e >>= 1;
    }
    return r;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    long long m; int n; // m can be large, n typically smaller
    if (!(cin >> m >> n)) return 0;

    // Edge cases
    if (n == 0) { cout << (m == 0 ? 1 : 0) << '\n'; return 0; }
    if (m < n) { cout << 0 << '\n'; return 0; } // cannot be onto if the domain is smaller

    vector<long long> fact(n + 1), invfact(n + 1);
    fact[0] = 1;
    for (int i = 1; i <= n; ++i) fact[i] = fact[i - 1] * i % MOD;
    invfact[n] = mod_pow(fact[n], MOD - 2);
    for (int i = n; i >= 1; --i) invfact[i - 1] = invfact[i] * i % MOD;

    auto C = [&](int N, int K) -> long long {
        if (K < 0 || K > N) return 0;
        return fact[N] * invfact[K] % MOD * invfact[N - K] % MOD;
    };

    long long ans = 0;
    for (int k = 0; k <= n; ++k) {
        long long ways_miss_k = mod_pow(n - k, m);       // (n-k)^m
        long long term = C(n, k) * ways_miss_k % MOD;
        if (k % 2 == 1) term = (MOD - term) % MOD;       // multiply by (-1)^k
        ans += term;
        if (ans >= MOD) ans -= MOD;
    }

    cout << ans << '\n';
    return 0;
}

We count functions from an m-element set to an n-element set that hit every target at least once. Inclusion-exclusion over the set of targets missing reduces to an alternating sum over k missing targets with symmetry: choose which k targets to miss (C(n,k)), then all m elements map into the remaining n−k targets ((n−k)^m).

Time: O(n log m), Space: O(n)
Count integers in [1..N] coprime to m via inclusion-exclusion over prime factors
#include <bits/stdc++.h>
using namespace std;

// Count numbers x in [1..N] with gcd(x, m) = 1.
// Inclusion-exclusion over distinct prime factors p1,...,pk of m:
//   not_coprime = sum_{empty != S subset of primes} (-1)^{|S|+1} floor(N / prod(S))
//   answer = N - not_coprime.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    unsigned long long N, m;
    if (!(cin >> N >> m)) return 0;

    // Factor m into distinct primes
    vector<unsigned long long> primes;
    unsigned long long x = m;
    for (unsigned long long p = 2; p * p <= x; ++p) {
        if (x % p == 0) {
            primes.push_back(p);
            while (x % p == 0) x /= p; // remove all powers; only distinct primes matter
        }
    }
    if (x > 1) primes.push_back(x);

    int k = (int)primes.size();
    unsigned long long not_coprime = 0ULL;

    // Enumerate all non-empty subsets of primes
    for (int mask = 1; mask < (1 << k); ++mask) {
        // Compute the product of the selected primes with an overflow-safe check
        __uint128_t prod128 = 1; // 128-bit avoids overflow when multiplying
        int bits = __builtin_popcount((unsigned)mask);
        for (int i = 0; i < k; ++i) {
            if (mask & (1 << i)) {
                prod128 *= primes[i];
                if (prod128 > N) break; // no contribution if the product exceeds N
            }
        }
        if (prod128 > N) continue;
        unsigned long long prod = (unsigned long long)prod128;
        unsigned long long cnt = N / prod;     // multiples of prod in [1..N]
        if (bits % 2 == 1) not_coprime += cnt; // add for odd subset size
        else not_coprime -= cnt;               // subtract for even subset size
    }

    unsigned long long coprime = N - not_coprime;
    cout << coprime << '\n';
    return 0;
}

We factor m into its distinct prime divisors p1,...,pk and apply inclusion-exclusion over these primes. For each non-empty subset S, we add or subtract floor(N / product(S)) depending on |S|. Using 128-bit intermediates avoids overflow when multiplying primes. The final answer is N minus the count of integers having at least one common prime factor with m.

Time: O(sqrt(m) + 2^ω(m)), Space: O(ω(m))