MathIntermediate

Binomial Theorem and Identities

Key Points

  • The binomial theorem expands (x + y)^n into a sum of terms using binomial coefficients that count how many ways to choose k items from n.
  • Key identities like Vandermonde’s convolution and the hockey-stick identity help simplify sums of binomial coefficients.
  • You can compute C(n, k) efficiently modulo a prime using factorials and modular inverses with Fermat’s little theorem.
  • Pascal’s triangle builds all C(n, k) in O() time and gives an intuitive picture of how coefficients combine.
  • Sums such as C(n, k) = 2^n and (-1)^k C(n, k) = 0 come from evaluating (1 + 1)^n and (1 - 1)^n.
  • Vandermonde’s identity is a discrete convolution equality that matches polynomial multiplication of (1 + z)^m and (1 + z)^n.
  • The central binomial coefficient C(2n, n) counts lattice paths and equals C(n, k)^2, the middle coefficient of (1 + z)^{2n}.
  • In code, prefer precomputation for many nCk queries, use big integers for exact arithmetic, and beware of overflow.

Prerequisites

  • Factorials and permutationsUnderstanding n! and how ordering affects counts is essential for deriving C(n, k) = n!/(k!(n-k)!).
  • Modular arithmeticMany problems require computing binomial coefficients under a modulus, especially a prime.
  • Fast exponentiationNeeded for modular inverses with Fermat’s little theorem and computing large powers efficiently.
  • Dynamic programming basicsPascal’s triangle is a classic DP recurrence and teaches boundary conditions and transitions.
  • Polynomials and generating functions (intro)Vandermonde’s identity and coefficient extraction are naturally explained via generating functions.

Detailed Explanation

Tap terms for definitions

01Overview

The binomial theorem explains how to expand a power of a sum: (x + y)^n. Instead of multiplying (x + y) by itself n times, we can write the result as a sum of terms x^k y^{n-k} each weighted by a binomial coefficient C(n, k), which counts how many ways to pick which of the n factors contributed an x. This simple idea unlocks a family of powerful identities: the total sum of coefficients is 2^n, alternating sums vanish, and more subtle relationships like Vandermonde’s convolution link double sums to single binomial coefficients. In competitive programming and discrete math, these identities simplify counting problems, evaluate tricky sums, and connect to polynomial multiplication via generating functions. Practically, we implement binomial coefficients with dynamic programming (Pascal’s triangle), factorial formulas with modular inverses (for large n modulo a prime), or big integers for exact answers. Understanding when and why these identities hold—often through combinatorial proofs—builds intuition and a reliable toolkit for problems involving choices, paths, and partitions.

02Intuition & Analogies

Imagine making a smoothie with n scoops total, and for each scoop you can choose either flavor X or flavor Y. After n scoops, how many outcomes have exactly k scoops of X and n − k of Y? First you decide which k positions get X: there are C(n, k) ways. That’s why the coefficient of x^k y^{n-k} in (x + y)^n is C(n, k): each term represents a particular pattern of choices. Now consider adding up all possible outcomes regardless of how many X scoops you used. That’s like plugging x = y = 1 into (x + y)^n: every outcome counts as 1, and you get 2^n total combinations because every scoop had 2 options. If you instead subtract Y choices by plugging x = 1, y = −1, then every arrangement with an odd number of Y scoops cancels with a matching arrangement with an even number, yielding 0. For Vandermonde’s identity, picture selecting r people from two groups of sizes m and n. You can first pick k from the first group and r − k from the second, summing over k, or you can just pick r from m + n at once—both counts must match. The hockey-stick identity is like stacking bricks diagonally in Pascal’s triangle; the diagonal sums “collect” into a single larger brick one row below, forming a handle that looks like a hockey stick. These everyday selection stories mirror the algebra perfectly and make the formulas feel inevitable rather than mysterious.

03Formal Definition

For integers n 0 and indeterminates x, y, the binomial theorem states: (x + y)^n = . Here = counts the number of k-element subsets of an n-element set (with the convention = 0 when or ). Core identities follow from combinatorial arguments or generating functions: = 2^n and (-1)^k = 0 are evaluations of (1 + 1)^n and (1 - 1)^n. Vandermonde’s identity says = , reflecting coefficient extraction from (1+z)^m(1+z)^n = (1+z)^{m+n}. The “hockey-stick” identity = follows from summing Pascal’s rule = + . Another classic is = , derived from selecting n items from two disjoint n-sets or from the middle coefficient of (1+z)^{2n}. These results extend to generalized exponents via the generalized binomial series when < 1 and , though in competitive programming we typically work with integer n and modular arithmetic.

04When to Use

Use the binomial theorem when expanding (a + b)^n, counting selections, or evaluating sums involving \binom{n}{k}. It’s especially helpful when: 1) you need the coefficient of a particular power in a polynomial product; 2) you are summing over ways to split a choice among two groups (Vandermonde); 3) you want to simplify sums of binomial coefficients (e.g., compute \sum \binom{n}{k} quickly as 2^n); 4) you must prove a recurrence or identity using Pascal’s rule; 5) you are convolving two distributions or combinatorial counts, since binomials naturally arise as coefficients of (1 + z)^n; and 6) you perform many nCk queries modulo a prime—precompute factorials and inverse factorials for O(1) queries. In problems about lattice paths, partitions into two types, or probability with independent binary choices, binomial identities often convert a messy sum into a compact closed form. In code, reach for Pascal’s triangle for small n (exact integers), for big n under a modulus use factorials with modular inverses, and for a single query with moderate n choose multiplicative formulas to avoid overflow.

⚠️Common Mistakes

Common pitfalls include: 1) Overflow: C(n, k) grows fast; 64-bit integers overflow by around n ≈ 67 for central binomials. Use big integers (cpp_int) or compute modulo a prime with care. 2) Wrong ranges: In sums like \sum_{k} \binom{m}{k}\binom{n}{r-k}, remember terms with k < 0 or k > m or r - k < 0 or r - k > n are zero; clamp the loop bounds. 3) Off-by-one errors in the hockey-stick identity: \sum_{i=k}^{n} \binom{i}{k} equals \binom{n+1}{k+1}, not \binom{n}{k+1}. 4) Misusing factorial formulas modulo non-primes: Fermat’s little theorem for inverses only works when the modulus is prime and arguments are not multiples of the modulus. 5) Ignoring symmetry: \binom{n}{k} = \binom{n}{n-k}; always take k = min(k, n - k) in multiplicative computations to reduce overflow and time. 6) Numerical instability: Evaluating (x + y)^n with floating points may incur error; separate coefficient computation from numeric evaluation, and prefer integer arithmetic when possible. 7) Assuming alternating sum is always zero: \sum (-1)^k \binom{n}{k} = 0 only for n > 0; for n = 0 it equals 1. 8) Forgetting base cases: \binom{n}{0} = \binom{n}{n} = 1 and empty sums should be handled carefully in code to match identity conventions.

Key Formulas

Binomial Theorem

Explanation: Expands the nth power of a sum into a sum of monomials weighted by binomial coefficients. Each term corresponds to choosing which factors contribute x versus y.

Factorial Formula

Explanation: Counts the number of k-element subsets of an n-element set. The denominator removes overcounting of reorderings.

Symmetry

Explanation: Choosing k items to include is equivalent to choosing n − k items to exclude. This halves work in computations.

Pascal's Rule

Explanation: Each selection either includes a fixed element or not, splitting counts into two smaller binomial coefficients. This underlies Pascal’s triangle.

Sum of Coefficients

Explanation: Set x = y = 1 in the binomial theorem to count all subsets of an n-element set. There are 2^n subsets in total.

Alternating Sum

Explanation: Evaluate (1 - 1)^n using the binomial theorem. Positive and negative contributions cancel for n > 0.

Sum of Squares

Explanation: Counts selections of n items from two disjoint n-sets or equals the middle coefficient of (1 + z)^{2n}. It is the central binomial coefficient.

Vandermonde's Identity

Explanation: Coefficient extraction from (1+z)^m(1+z)^n = (1+z)^{m+n}. Also a direct counting argument over two groups.

Hockey-stick Identity

Explanation: Summing a diagonal in Pascal’s triangle collapses to a single binomial coefficient one row down and one step right.

Ordinary Generating Function

Explanation: Encodes binomial coefficients as coefficients of a power series. Useful for coefficient extraction and convolution reasoning.

Generalized Binomial Coefficient

Explanation: Extends binomial coefficients to real or complex . Appears in the generalized binomial series.

Generalized Binomial Series

Explanation: Expands powers with non-integer exponents as an infinite series. Used in analysis and generating function manipulations.

Complexity Analysis

There are several standard ways to compute binomial coefficients and apply identities, each with distinct complexity trade-offs. - Pascal’s triangle DP: Building all C(n, k) for 0 ≤ costs O() time and O() space if stored fully. This is ideal for small to moderate N (e.g., ≤ 2000) or when you need many coefficients across rows. Memory can be reduced to O(N) if you keep only the current row but then you cannot answer arbitrary queries later without recomputation. - Multiplicative formula: Computing a single C(n, k) by the product ∏_{i=1}^{k} (n + 1 − i)/i takes O(k) time and O(1) space. With 64-bit integers, overflow occurs quickly; using big integers (cp) removes overflow concerns but increases constant factors. - Factorials with modular inverses (prime modulus p): Precompute fact[i] and invfact[i] for i up to N in O(N) time and O(N) space. Each query C(n, k) becomes O(1) via fact[n] · invfact[k] · invfact[n − k] mod p. Modular inverses of factorials are obtained using Fermat’s little theorem with fast exponentiation (O(log p)) for invfact[N] and a linear sweep for the rest. - Convolution for Vandermonde: Computing the coefficient of in (1 + z)^m(1 + z)^n by naive convolution requires O(r) time after generating both rows in O(m + n). Full polynomial multiplication would be O(mn) (or O(n log n) with FFT), but identities let you jump directly to O(1) using C(m + n, r) when allowed. Space usage is typically dominated by arrays holding factorials or Pascal rows (O(N) or O()). For identity verification over many test cases modulo p, the factorial precomputation approach is optimal. For exact arithmetic and single queries, the multiplicative formula with big integers balances simplicity and safety.

Code Examples

Modular nCk with factorials and identity checks (sum, alternating sum, sum of squares)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Modulus must be prime for Fermat's little theorem
5const int MOD = 1'000'000'007; // 1e9+7
6
7long long modpow(long long a, long long e) {
8 long long r = 1 % MOD;
9 a %= MOD;
10 while (e > 0) {
11 if (e & 1) r = (r * a) % MOD;
12 a = (a * a) % MOD;
13 e >>= 1;
14 }
15 return r;
16}
17
18struct Comb {
19 int N;
20 vector<long long> fact, invfact;
21 Comb(int n) { init(n); }
22 void init(int n) {
23 N = n;
24 fact.assign(N + 1, 1);
25 invfact.assign(N + 1, 1);
26 for (int i = 1; i <= N; ++i) fact[i] = fact[i - 1] * i % MOD;
27 // invfact[N] = fact[N]^{-1} mod MOD by Fermat's little theorem
28 invfact[N] = modpow(fact[N], MOD - 2);
29 for (int i = N; i >= 1; --i) invfact[i - 1] = invfact[i] * i % MOD;
30 }
31 long long C(int n, int k) const {
32 if (k < 0 || k > n || n < 0 || n > N) return 0;
33 return fact[n] * invfact[k] % MOD * invfact[n - k] % MOD;
34 }
35};
36
37int main() {
38 ios::sync_with_stdio(false);
39 cin.tie(nullptr);
40
41 int n; cin >> n; // e.g., 1000000 for stress test with modulo arithmetic
42 Comb comb(n * 2); // need up to 2n for sum of squares identity
43
44 // 1) Sum of binomial coefficients: sum C(n,k) = 2^n (mod MOD)
45 long long sum = 0;
46 for (int k = 0; k <= n; ++k) sum = (sum + comb.C(n, k)) % MOD;
47 long long two_pow_n = modpow(2, n);
48
49 // 2) Alternating sum: sum (-1)^k C(n,k) = 0 for n>0 (mod MOD)
50 long long alt = 0;
51 for (int k = 0; k <= n; ++k) {
52 long long term = comb.C(n, k);
53 if (k & 1) term = (MOD - term) % MOD; // multiply by -1
54 alt += term;
55 if (alt >= MOD) alt -= MOD;
56 }
57
58 // 3) Sum of squares: sum C(n,k)^2 = C(2n, n) (mod MOD)
59 long long sumsq = 0;
60 for (int k = 0; k <= n; ++k) {
61 long long t = comb.C(n, k);
62 sumsq = (sumsq + t * t) % MOD;
63 }
64 long long central = comb.C(2 * n, n);
65
66 cout << "sum C(n,k) mod M: " << sum << " expected: " << two_pow_n << "\n";
67 cout << "alt sum mod M: " << alt << " expected: " << (n == 0 ? 1 : 0) << "\n";
68 cout << "sum C(n,k)^2: " << sumsq << " expected: " << central << "\n";
69 return 0;
70}
71

We precompute factorials and inverse factorials modulo a prime to answer C(n, k) in O(1). Then we verify three identities: (1) total sum equals 2^n, (2) alternating sum is 0 for n > 0, and (3) the sum of squares equals the central binomial coefficient C(2n, n). Modular exponentiation computes 2^n and inverses.

Time: Precomputation O(n), each identity sum O(n), total O(n).Space: O(n) for factorial and inverse factorial arrays.
Exact Pascal's triangle and the Hockey-stick identity with big integers
1#include <bits/stdc++.h>
2#include <boost/multiprecision/cpp_int.hpp>
3using namespace std;
4using boost::multiprecision::cpp_int;
5
6int main() {
7 ios::sync_with_stdio(false);
8 cin.tie(nullptr);
9
10 int N; cin >> N; // build rows up to N (e.g., N=200)
11 vector<vector<cpp_int>> C(N + 1, vector<cpp_int>(N + 1));
12 for (int n = 0; n <= N; ++n) {
13 C[n][0] = C[n][n] = 1;
14 for (int k = 1; k < n; ++k) C[n][k] = C[n - 1][k] + C[n - 1][k - 1];
15 }
16
17 // Demonstrate Hockey-stick: sum_{i=k}^{n} C(i, k) = C(n+1, k+1)
18 int k, n; cin >> k >> n; // require 0 <= k <= n < N
19 cpp_int lhs = 0;
20 for (int i = k; i <= n; ++i) lhs += C[i][k];
21 cpp_int rhs = C[n + 1][k + 1];
22
23 cout << "Sum_{i=" << k << ".." << n << "} C(i," << k << ") = " << lhs << "\n";
24 cout << "C(" << (n + 1) << "," << (k + 1) << ") = " << rhs << "\n";
25 cout << (lhs == rhs ? "Identity holds" : "Identity fails") << "\n";
26
27 // Optionally print row N to see Pascal's triangle structure
28 // for (int j = 0; j <= N; ++j) cout << C[N][j] << ' ';
29 // cout << '\n';
30 return 0;
31}
32

We construct Pascal’s triangle exactly using big integers to avoid overflow. Then we compute the diagonal sum on the left-hand side of the hockey-stick identity and compare it to the corresponding binomial coefficient on the right-hand side, proving the relationship numerically.

Time: Building triangle O(N^2); answering one identity query O(n − k + 1).Space: O(N^2) to store all C(n, k).
Vandermonde’s identity via convolution and direct binomial
1#include <bits/stdc++.h>
2using namespace std;
3
4// Compute a single row of Pascal's triangle using 64-bit if safe.
5// For larger inputs, consider big integers or modular arithmetic.
6vector<unsigned long long> pascal_row(int n) {
7 vector<unsigned long long> row(n + 1, 1);
8 for (int k = 1; k <= n; ++k) {
9 // Use multiplicative formula with symmetry to reduce overflow risk
10 int kk = min(k, n - k);
11 __int128 num = 1, den = 1;
12 for (int i = 1; i <= kk; ++i) {
13 num *= (n + 1 - i);
14 den *= i;
15 // Reduce by GCD occasionally to keep numbers small
16 long long g = std::gcd((long long)(num % (1LL<<60)), (long long)(den % (1LL<<60)));
17 if (g > 1) { num /= g; den /= g; }
18 }
19 unsigned long long val = (unsigned long long)(num / den);
20 row[k] = val;
21 }
22 return row;
23}
24
25int main() {
26 ios::sync_with_stdio(false);
27 cin.tie(nullptr);
28
29 int m, n, r; cin >> m >> n >> r; // e.g., 20 25 15
30 // Clamp r to valid range when summing
31 int lo = max(0, r - n);
32 int hi = min(r, m);
33
34 auto A = pascal_row(m);
35 auto B = pascal_row(n);
36
37 // Compute coefficient of z^r in (1+z)^m (1+z)^n by convolution
38 __int128 lhs = 0;
39 for (int k = lo; k <= hi; ++k) {
40 lhs += (__int128)A[k] * B[r - k];
41 }
42
43 auto C = pascal_row(m + n);
44 unsigned long long rhs = (r >= 0 && r <= m + n) ? C[r] : 0ULL;
45
46 cout << "Vandermonde LHS: ";
47 // Print 128-bit value safely by converting to string
48 __int128 tmp = lhs;
49 if (tmp == 0) cout << 0;
50 else {
51 string s;
52 bool neg = false; if (tmp < 0) { neg = true; tmp = -tmp; }
53 while (tmp > 0) { int d = (int)(tmp % 10); s.push_back('0' + d); tmp /= 10; }
54 if (neg) cout << '-';
55 reverse(s.begin(), s.end());
56 cout << s;
57 }
58 cout << "\nVandermonde RHS: " << rhs << "\n";
59 cout << ((unsigned long long)lhs == rhs ? "Identity holds (within 64-bit range)" : "May overflow or mismatch") << "\n";
60 return 0;
61}
62

We compute the mth and nth rows of Pascal’s triangle, then convolve them to obtain the coefficient of z^r, which is the Vandermonde left-hand side. We compare it with C(m + n, r), the right-hand side. For safety, we use 128-bit temporaries for products, but for very large inputs you should switch to big integers or modular arithmetic.

Time: Building rows O(m + n), convolution O(hi − lo + 1) = O(min(r, m, n)).Space: O(m + n) for the three Pascal rows.
Expanding (x + y)^n and evaluating using the binomial theorem
1#include <bits/stdc++.h>
2#include <boost/multiprecision/cpp_int.hpp>
3using namespace std;
4using boost::multiprecision::cpp_int;
5
6// Build nth row of Pascal's triangle exactly
7vector<cpp_int> pascal_row_big(int n) {
8 vector<cpp_int> C(n + 1);
9 C[0] = 1;
10 for (int k = 1; k <= n; ++k) {
11 // C(n,k) = C(n,k-1) * (n-k+1) / k
12 C[k] = C[k - 1] * (n - k + 1) / k;
13 }
14 return C;
15}
16
17int main() {
18 ios::sync_with_stdio(false);
19 cin.tie(nullptr);
20
21 int n; long double x, y;
22 cin >> n >> x >> y; // e.g., 20 1.5 2.0
23
24 auto C = pascal_row_big(n);
25
26 // Print symbolic expansion coefficients
27 cout << "(x + y)^" << n << " = ";
28 for (int k = 0; k <= n; ++k) {
29 cout << C[k] << "*x^" << k << "*y^" << (n - k);
30 if (k != n) cout << " + ";
31 }
32 cout << "\n";
33
34 // Evaluate numerically at given x, y
35 long double value = 0.0L;
36 // Horner-like accumulation using powers to avoid repeated pow calls
37 vector<long double> xp(n + 1, 1.0L), yp(n + 1, 1.0L);
38 for (int i = 1; i <= n; ++i) { xp[i] = xp[i - 1] * x; yp[i] = yp[i - 1] * y; }
39 for (int k = 0; k <= n; ++k) {
40 // Convert big integer coefficient to long double for evaluation
41 long double ck = (long double)C[k];
42 value += ck * xp[k] * yp[n - k];
43 }
44
45 long double direct = pow(x + y, n);
46 cout << fixed << setprecision(8);
47 cout << "Binomial evaluation: " << value << "\n";
48 cout << "Direct pow(x+y,n): " << direct << "\n";
49 cout << "Absolute error: " << fabsl(value - direct) << "\n";
50 return 0;
51}
52

We construct the exact coefficients C(n, k) using a stable recurrence and print the symbolic expansion of (x + y)^n. Then we evaluate the polynomial numerically at chosen x and y and compare it to pow(x + y, n), demonstrating the binomial theorem computationally.

Time: O(n) to build the row, O(n) to evaluate with precomputed powers.Space: O(n) for coefficients and power arrays.