∑MathIntermediate

Lucas' Theorem

Key Points

•
Lucas' Theorem lets you compute C(n, k) modulo a prime p by working digit-by-digit in base p.
•
Write n and k in base p as n = $\sum$ $n_{i}$ $p^{i}$ and k = $\sum$ $k_{i}$ $p^{i}$ , then C(n, k) $mod$ p = $\prod$ C( $n_{i}$ , $k_{i}$ ) $mod$ p.
•
If any digit $k_{i} > n_{i}$ , the answer is 0 because that small binomial is 0.
•
For a prime power $p^{e}$ , you can extend Lucas using p-adic valuation and factorials with multiples of p removed.
•
The exponent of p in C(n, k) is $v = v_{p} (n!$ ) - $v_{p}$ (k!) - $v_{p}$ ((n-k)!), computed by Legendre's formula.
•
If v $\geq$ e, then C(n, k) $\equiv$ 0 $(mod p^{e})$ ; otherwise multiply by $p^{v}$ and use modular inverses modulo $p^{e}$ .
•
For a composite modulus m, factor m into prime powers and combine answers using the Chinese Remainder Theorem.
•
Precompute factorials up to p-1 (for prime p) or prefix products up to $p^{e}$ (for prime power) to make queries fast.

Prerequisites

→Modular arithmetic and congruences — Understanding how operations behave modulo m is essential for interpreting and manipulating residues.
→Binary exponentiation (fast power) — Used to compute modular powers and inverses efficiently, especially a^{p-2} mod p and powmod for p^e.
→Prime numbers and factorization — Lucas requires a prime modulus; extensions need factorization of m into prime powers for CRT.
→Chinese Remainder Theorem — Needed to combine residues from different prime-power moduli back into a single answer modulo m.
→Combinatorics: factorials and binomial coefficients — Core definitions and properties of C(n, k) underpin Lucas and its extensions.
→p-adic valuation and Legendre's formula — The extension to prime powers depends on counting powers of p in factorials.
→Base representation of integers — Lucas operates on base-p digits of n and k, so comfort with base conversions is helpful.
→Extended Euclidean algorithm — Computes modular inverses modulo p^e for numbers coprime to p and supports CRT merging.

Detailed Explanation

Tap terms for definitions

01Overview

Lucas' Theorem is a powerful tool in modular combinatorics that allows the calculation of binomial coefficients C(n, k) modulo a prime p even when n and k are extremely large. The key idea is to express n and k in base p and reduce the large problem into a product of many small binomial coefficients, each involving digits less than p. This makes precomputation feasible: we only need factorials up to p-1 to evaluate each small C(n_i, k_i) modulo p. For prime powers p^e, there is a more involved extension: we use p-adic valuation (how many times p divides a number) and compute factorials with all multiples of p removed, modulo p^e. The answer is either zero (if too many factors of p appear) or a product involving these reduced factorials and a power of p determined by Legendre's formula. Finally, for a general composite modulus m, we factor m into prime powers, compute each residue modulo p_i^{e_i}, and stitch the results back together via the Chinese Remainder Theorem. This approach is common in programming contests because it scales to huge n (e.g., 10^18) as long as p (or p^e) is reasonably small, and it elegantly explains when a binomial coefficient is divisible by p through base-p digit comparisons.

02Intuition & Analogies

Think of numbers written in base p like stacks of coins in denominations of p^0, p^1, p^2, and so on. Choosing k items from n items (C(n, k)) is a global operation, but Lucas' Theorem says we can do it locally, per coin stack (per digit), and then multiply the local answers. If at some digit you try to choose more coins than you have (k_i > n_i), the whole selection becomes impossible in that position, making the overall count 0 modulo p. This is like needing to borrow from a higher denomination; if a borrow is needed in any digit, the combination count becomes divisible by p. For prime powers, imagine that some items are “sticky” because they carry factors of p. We first count how many sticky factors p appear in C(n, k) using a simple digit-sum/division rule (Legendre’s formula). If there are too many sticky factors (at least e), the result vanishes modulo p^e. Otherwise, strip away all p-factors from the factorials (keeping only the parts coprime to p), multiply what remains modulo p^e, and then reattach exactly the right number of sticky p’s. The periodicity trick says: the product of numbers not divisible by p repeats in blocks of size p^e, so we can reduce massive products to a few precomputed pieces. Finally, for a general modulus m, think of locks and keys: each prime power p_i^{e_i} is a separate lock. You compute a key (the residue) for each lock independently. The Chinese Remainder Theorem is the locksmith that forges a single master key (the unique number modulo m) that works with all locks simultaneously.

03Formal Definition

Let p be a prime and write the base-p expansions n =

\sum_{i = 0}^{t}

n_{i}

p^{i}

and k =

\sum_{i = 0}^{t}

k_{i}

p^{i}

with 0

\leq

n_{i}

k_{i} < p

. Lucas' Theorem states: C(n, k)

\equiv

\prod_{i = 0}^{t}

n_{i}

k_{i}

)

(mod p)

. Consequently, C(n, k)

\equiv

(mod p)

if and only if there exists i with

k_{i} > n_{i}

. This has a p-adic interpretation via Kummer’s theorem: the exponent

v_{p}

(C(n, k)) equals the number of carries when adding k and n-k in base p (equivalently, borrows when subtracting k from n). For a prime power

p^{e}

, write

v = v_{p} (n!

) -

v_{p}

(k!) -

v_{p}

((n-k)!) where

v_{p}

(m!) =

\sum_{j = 1}^{\infty}

\left\lfloor

\frac{m}{p ^{j}}

\right\rfloor

(Legendre’s formula). Define n!_p as the product of numbers from 1 to n with all factors of p removed (so n! =

p^{v_{p} (n!)}

n!_p). Then C(n, k)

\equiv

p^{v}

\cdot

\frac{n ! _{p}}{k ! _{p} ( n - k ) ! _{p}}

(mod p^{e})

. If v

\geq

e, the residue is 0. The values n!_p modulo

p^{e}

can be computed using a periodic decomposition with period

p^{e}

over numbers coprime to p. For a general modulus m with prime power factorization m =

\prod

i^{e_{i}}

, we compute C(n, k) modulo each p_

i^{e_{i}}

and combine the results using the Chinese Remainder Theorem to obtain C(n, k) modulo m.

04When to Use

Use Lucas' Theorem when:

The modulus is a prime p and n, k can be very large (even up to 10^{18} or beyond), but you can afford O(p) precomputation and O(\log_p n) per query.
You need many queries of C(n, k) modulo a fixed small or medium prime p; precomputing factorials up to p-1 makes each query fast. Use the prime-power extension when:
The modulus is p^e with small p and modest e (so p^e is manageable to precompute prefix products). This commonly appears when m is a small composite and you plan to apply CRT. Use the CRT combination when:
The modulus m is composite but factorizable into reasonably small prime powers. You compute residues modulo each p_i^{e_i} and combine them. This is especially helpful in problems where the final modulus is, say, 10^9+6, 10^9, 998244353^2, or other composite values. Do not use plain Lucas when p is not prime or when p is extremely large (e.g., near 10^9), because precomputing up to p-1 becomes infeasible. In those cases, consider other techniques or restructure the problem.

⚠️Common Mistakes

Using Lucas with a non-prime modulus: The classic Lucas formula requires a prime p. For composite moduli, factor the modulus and use the prime-power extension plus CRT.
Forgetting the zero condition: If any base-p digit k_i exceeds n_i, then C(n, k) \equiv 0 \pmod{p}. A frequent bug is to multiply small binomials without this check, yielding incorrect nonzero answers.
Precomputing factorials up to n instead of p-1: Lucas only needs factorials modulo p up to p-1. Precomputing up to n (when n is huge) is impossible and unnecessary.
Wrong modular inverses: For prime p, use a^{p-2} \bmod p (Fermat’s little theorem). For p^e, you must ensure the number is coprime to p^e to have an inverse (only numbers not divisible by p). In our construction, k!_p and (n-k)!_p are coprime to p, so their inverses modulo p^e exist.
Ignoring p-adic valuation in prime powers: If v = v_p(C(n, k)) \ge e, the answer is 0 modulo p^e. Forgetting this check leads to nonzero residues when the true answer is zero.
Overflow in modular multiplication: When moduli are up to 64-bit (like p^e), use 128-bit intermediates (e.g., __int128 in C++) to avoid overflow.
Not caching precomputations: For multiple queries with the same p (or p^e), precompute factorials or prefix products once. Recomputing them per query slows the solution drastically.
Misapplying Kummer’s theorem: The number of carries relates to divisibility by p, not the exact residue. You still need Lucas products (mod p) or the prime-power machinery to get the residue value.

Key Formulas

Lucas' Theorem

C (n, k) \equiv i = 0 \prod t C (n_{i}, k_{i}) (mod p)

Explanation: Write n and k in base p with digits $n_{i}$ and $k_{i}$ . The binomial modulo a prime p equals the product of small binomials at each digit. If any $k_{i} > n_{i}$ , the corresponding term is zero.

Base-p Expansion

n = i = 0 \sum t n_{i} p^{i}, k = i = 0 \sum t k_{i} p^{i}, 0 \leq n_{i}, k_{i} < p

Explanation: Every integer has a unique base-p representation. Lucas works by processing these digits independently.

Legendre's Formula

v_{p} (n!) = j = 1 \sum \infty ⌊ \frac{n}{p ^{j}} ⌋

Explanation: This counts how many times p divides n!. It is central to determining the power of p in C(n, k) for prime powers.

p-adic valuation of C(n,k)

v_{p} (C (n, k)) = v_{p} (n!) - v_{p} (k!) - v_{p} ((n - k)!)

Explanation: The exponent of p in the binomial coefficient equals the difference of exponents in the factorials. If this is at least e, C(n, k) is 0 modulo $p^{e}$ .

Factorial Decomposition

n! = p^{v_{p} (n!)} \cdot n!_{p}

Explanation: Split n! into all the p-factors and a part coprime to p. The latter has an inverse modulo $p^{e}$ .

Binomial Mod Prime Power

C (n, k) \equiv p^{v} \cdot \frac{n ! _{p}}{k ! _{p} ( n - k ) ! _{p}} (mod p^{e})

Explanation: For $v = v_{p} (C (n$ ,k)), multiply the reduced factorial ratio by $p^{v}$ to get the residue modulo $p^{e}$ when $v < e;$ otherwise it is zero.

Reduced Factorial Recurrence

F (n) \equiv F (⌊ \frac{n}{p} ⌋) \cdot pref (n mod p^{e}) \cdot cyc^{⌊ \frac{n}{p ^{e}} ⌋} (mod p^{e})

Explanation: Here F(n) = n!_p mod $p^{e}$ , pref(x) is the product of 1..x skipping multiples of p, and cyc is the product over one full block 1.. $p^{e}$ skipping multiples of p. This enables fast computation of n!_p modulo $p^{e}$ .

Chinese Remainder Theorem

If m = i = 1 \prod r m_{i}, g cd (m_{i}, m_{j}) = 1, \exists! x (mod m) : x \equiv a_{i} (mod m_{i})

Explanation: Given pairwise coprime moduli $m_{i}$ and residues $a_{i}$ , there is a unique solution modulo their product. We use this to combine prime-power residues.

Fermat's Little Theorem (Inverse)

a^{p - 1} \equiv 1 (mod p) (g cd (a, p) = 1) \Rightarrow a^{- 1} \equiv a^{p - 2} (mod p)

Explanation: Provides a fast way to compute modular inverses modulo a prime using exponentiation.

Digit Count

# digits in base p for n = 1 + ⌊ lo g_{p} n ⌋

Explanation: Lucas' running time depends on the number of base-p digits, which grows logarithmically in n.

Kummer's Theorem

v_{p} (C (n, k)) = number of carries when adding k and (n - k) in base p

Explanation: A combinatorial way to see when C(n, k) is divisible by p, aligning with the digit-wise condition $k_{i} \leq n_{i}$ .

Complexity Analysis

For a prime modulus p, we precompute factorials and inverse factorials up to p−1 in O(p) time and O(p) space. Each query decomposes n and k into base-p digits, requiring O(lo

g_{p}

n) digits. For each digit, computing C(

n_{i}

k_{i}

) uses O(1) time via table lookups and a couple of modular multiplications. Thus a single query runs in O(lo

g_{p}

n) time. For Q queries under a fixed p, the total is O(p + Q lo

g_{p}

n). For a prime power modulus

p^{e}

, we precompute a prefix table pref[0..

p^{e}

] of products skipping multiples of p and the one-block product cyc in O(

p^{e}

) time and space. Evaluating n!_p modulo

p^{e}

via the recurrence requires O(lo

g_{p}

n) recursive levels; at each level we perform O(1) table lookups and a few modular exponentiations of cyc with exponent ⌊n/

p^{e}

⌋. Using fast exponentiation, powmod costs O(log

p^{e}

) = O(e log p), but the exponent is small (quotients decrease rapidly). In practice, each query is O(lo

g_{p}

n · log

p^{e}

). Computing the p-adic valuations

v_{p}

(n!),

v_{p}

(k!),

v_{p}

(n−k)! via Legendre’s formula takes O(lo

g_{p}

n) time. Overall a query is roughly O(lo

g_{p}

n · log

p^{e}

). For a general composite modulus m = ∏ p_

i^{e_{i}}

, we factor m (e.g., trial division up to √m) in O(√m) time, which is usually a one-time cost. Then we answer per prime power as above and combine residues with CRT. Pairwise CRT merges two congruences in O(log m) time using the extended Euclidean algorithm with 64-bit-safe arithmetic. Space usage is dominated by the largest

p^{e}

table among the factors. These methods are suitable for many competitive programming tasks when moduli’s prime powers are not too large.

Code Examples

Lucas' Theorem modulo a prime p (many queries, huge n and k)

1 #include <bits/stdc++.h>
2 using namespace std;
3 
4 using int64 = long long;
5 using i128 = __int128_t;
6 
7 // Fast modular exponentiation: computes a^e mod m
8 static inline uint64_t powmod_u64(uint64_t a, uint64_t e, uint64_t m) {
9     uint64_t r = 1 % m;
10     while (e) {
11         if (e & 1) r = (uint64_t)((i128)r * a % m);
12         a = (uint64_t)((i128)a * a % m);
13         e >>= 1;
14     }
15     return r;
16 }
17 
18 struct LucasPrime {
19     uint64_t p;                 // prime modulus
20     vector<uint64_t> fact;      // factorials mod p up to p-1
21     vector<uint64_t> invfact;   // inverse factorials mod p up to p-1
22 
23     explicit LucasPrime(uint64_t prime): p(prime) {
24         // Precompute factorial and inverse factorial modulo prime p
25         fact.resize(p);
26         invfact.resize(p);
27         fact[0] = 1;
28         for (uint64_t i = 1; i < p; ++i) fact[i] = (fact[i-1] * i) % p;
29         // Fermat's little theorem for inverse factorial of (p-1)!
30         invfact[p-1] = powmod_u64(fact[p-1], p-2, p);
31         for (uint64_t i = p-1; i > 0; --i) invfact[i-1] = (invfact[i] * i) % p;
32     }
33 
34     // Small binomial C(n, k) for 0 <= n, k < p
35     uint64_t C_small(uint64_t n, uint64_t k) const {
36         if (k > n) return 0;
37         return (((fact[n] * invfact[k]) % p) * invfact[n - k]) % p;
38     }
39 
40     // Lucas: C(n, k) mod p for arbitrarily large n, k
41     uint64_t C(uint64_t n, uint64_t k) const {
42         if (k > n) return 0;
43         uint64_t res = 1;
44         while (n > 0 || k > 0) {
45             uint64_t ni = n % p;
46             uint64_t ki = k % p;
47             if (ki > ni) return 0; // one digit invalid → product is zero mod p
48             res = (res * C_small(ni, ki)) % p;
49             n /= p;
50             k /= p;
51         }
52         return res;
53     }
54 };
55 
56 int main() {
57     ios::sync_with_stdio(false);
58     cin.tie(nullptr);
59 
60     // Example usage:
61     // Compute many C(n, k) modulo a fixed prime p efficiently.
62     // Input format:
63     // p Q\n then Q lines of n k
64     // Example:
65     // 5 4
66     // 12 7
67     // 1000000000000 3
68     // 20 10
69     // 7 9
70 
71     uint64_t p; int Q;
72     if (!(cin >> p >> Q)) return 0;
73     LucasPrime lucas(p);
74     while (Q--) {
75         uint64_t n, k; cin >> n >> k;
76         cout << lucas.C(n, k) << '\n';
77     }
78     return 0;
79 }
80

This program precomputes factorials and inverse factorials modulo a prime p just once and answers each query using the Lucas digit product. It works even for very large n and k because it only processes O(log_p n) digits per query.

Time: Preprocessing: O(p). Per query: O(log_p n).Space: O(p) for factorial and inverse factorial tables.

Binomial coefficient modulo a prime power p^e

1 #include <bits/stdc++.h>
2 using namespace std;
3 using i128 = __int128_t;
4 using u64 = unsigned long long;
5 
6 static inline u64 mul_mod(u64 a, u64 b, u64 m) {
7     return (u64)((i128)a * b % m);
8 }
9 
10 static inline u64 powmod(u64 a, u64 e, u64 m) {
11     u64 r = 1 % m;
12     while (e) {
13         if (e & 1) r = mul_mod(r, a, m);
14         a = mul_mod(a, a, m);
15         e >>= 1;
16     }
17     return r;
18 }
19 
20 // Compute v_p(n!) using Legendre's formula
21 static inline unsigned long long vp_fact(unsigned long long n, unsigned long long p) {
22     unsigned long long v = 0;
23     while (n) { n /= p; v += n; }
24     return v;
25 }
26 
27 // Precompute prefix products skipping multiples of p up to pe
28 struct PrimePowerData {
29     u64 p, e, pe;          // prime p, exponent e, modulus pe = p^e
30     vector<u64> pref;      // pref[i] = product_{1..i, gcd(j,p)=1} j mod pe
31     u64 cyc;               // product over one full block 1..pe skipping multiples of p
32 
33     PrimePowerData(u64 p_, u64 e_) : p(p_), e(e_) {
34         pe = 1; for (u64 i = 0; i < e; ++i) pe *= p;
35         pref.assign(pe + 1, 1);
36         for (u64 i = 1; i <= pe; ++i) {
37             if (i % p != 0) pref[i] = mul_mod(pref[i-1], i % pe, pe);
38             else pref[i] = pref[i-1];
39         }
40         cyc = pref[pe];
41     }
42 
43     // Compute n!_p modulo pe using a standard recurrence
44     u64 fact_red(u64 n) const {
45         if (n == 0) return 1 % pe;
46         u64 res = fact_red(n / p);
47         // multiply by product over remainder block (skipping multiples of p)
48         res = mul_mod(res, pref[n % pe], pe);
49         // multiply by full cycles of size pe (numbers coprime to p repeat every pe)
50         res = mul_mod(res, powmod(cyc, n / pe, pe), pe);
51         return res;
52     }
53 
54     // Compute C(n, k) mod p^e
55     u64 C_mod_pe(u64 n, u64 k) const {
56         if (k > n) return 0;
57         u64 v = vp_fact(n, p) - vp_fact(k, p) - vp_fact(n - k, p);
58         if (v >= e) return 0; // too many factors of p
59         u64 num = fact_red(n);
60         u64 den = fact_red(k);
61         den = mul_mod(den, fact_red(n - k), pe);
62         // den is coprime to p, hence invertible modulo p^e
63         u64 inv_den = 0;
64         // Inverse via Euler's theorem works only for phi(pe) and gcd=1, but we prefer extended GCD.
65         // Implement extended GCD-based inverse since gcd(den, pe) = 1.
66         auto egcd = [](u64 a, u64 b) {
67             // returns (g, x, y) with ax + by = g
68             long long x0 = 1, y0 = 0, x1 = 0, y1 = 1;
69             long long aa = (long long)a, bb = (long long)b;
70             while (bb) {
71                 long long q = aa / bb;
72                 long long aa2 = aa - q * bb; aa = bb; bb = aa2;
73                 long long x2 = x0 - q * x1; x0 = x1; x1 = x2;
74                 long long y2 = y0 - q * y1; y0 = y1; y1 = y2;
75             }
76             return tuple<long long,long long,long long>(aa, x0, y0);
77         };
78         auto [g, x, y] = egcd(den % pe, pe);
79         // gcd should be 1
80         (void)y;
81         assert(g == 1);
82         inv_den = (x % (long long)pe + (long long)pe) % (long long)pe;
83 
84         u64 res = mul_mod(num, (u64)inv_den, pe);
85         res = mul_mod(res, powmod(p, v, pe), pe);
86         return res;
87     }
88 };
89 
90 int main() {
91     ios::sync_with_stdio(false);
92     cin.tie(nullptr);
93 
94     // Example: compute several C(n, k) modulo p^e
95     // Input: p e Q then Q lines of n k
96     // Example:
97     // 5 2 3
98     // 12 7
99     // 100 30
100     // 7 3
101 
102     u64 p, e; int Q;
103     if (!(cin >> p >> e >> Q)) return 0;
104     PrimePowerData pp(p, e);
105     while (Q--) {
106         u64 n, k; cin >> n >> k;
107         cout << pp.C_mod_pe(n, k) << '\n';
108     }
109     return 0;
110 }
111

This program computes C(n, k) modulo p^e. It precomputes prefix products that skip multiples of p and uses a recursive reduced factorial to evaluate n! with p-factors removed. The p-adic valuation decides if the answer is zero; otherwise we multiply by p^v and divide by the reduced factorials modulo p^e using an extended-GCD modular inverse.

Time: Preprocessing: O(p^e). Per query: O(log_p n · log(p^e)).Space: O(p^e) for prefix products and constants.

C(n, k) modulo a composite m via CRT (factor to prime powers, then combine)

1 #include <bits/stdc++.h>
2 using namespace std;
3 using u64 = unsigned long long;
4 using i128 = __int128_t;
5 
6 static inline u64 mul_mod(u64 a, u64 b, u64 m) { return (u64)((i128)a * b % m); }
7 static inline u64 powmod(u64 a, u64 e, u64 m) { u64 r=1%m; while(e){ if(e&1) r=mul_mod(r,a,m); a=mul_mod(a,a,m); e>>=1;} return r; }
8 
9 // Legendre v_p(n!)
10 static inline u64 vp_fact(u64 n, u64 p){ u64 v=0; while(n){ n/=p; v+=n; } return v; }
11 
12 // Data for one prime power p^e
13 struct PrimePowerData {
14     u64 p,e,pe; vector<u64> pref; u64 cyc;
15     PrimePowerData(u64 p_, u64 e_): p(p_), e(e_) {
16         pe=1; for(u64 i=0;i<e;i++) pe*=p;
17         pref.assign(pe+1,1);
18         for(u64 i=1;i<=pe;i++) pref[i] = (i%p? mul_mod(pref[i-1], i%pe, pe): pref[i-1]);
19         cyc = pref[pe];
20     }
21     u64 fact_red(u64 n) const { if(!n) return 1%pe; u64 r=fact_red(n/p); r=mul_mod(r, pref[n%pe], pe); r=mul_mod(r, powmod(cyc, n/pe, pe), pe); return r; }
22     u64 inv_mod(u64 a) const { // inverse modulo pe, gcd(a,pe)=1
23         long long x0=1,y0=0,x1=0,y1=1; long long aa=(long long)(a%pe), bb=(long long)pe;
24         while(bb){ long long q=aa/bb; long long aa2=aa-q*bb; aa=bb; bb=aa2; long long x2=x0-q*x1; x0=x1; x1=x2; long long y2=y0-q*y1; y0=y1; y1=y2; }
25         // aa is gcd, should be 1
26         long long x = (x0 % (long long)pe + (long long)pe) % (long long)pe;
27         return (u64)x;
28     }
29     u64 C_mod_pe(u64 n, u64 k) const {
30         if(k>n) return 0;
31         u64 v = vp_fact(n,p) - vp_fact(k,p) - vp_fact(n-k,p);
32         if(v>=e) return 0;
33         u64 num=fact_red(n);
34         u64 den=fact_red(k); den=mul_mod(den, fact_red(n-k), pe);
35         u64 res = mul_mod(num, inv_mod(den), pe);
36         res = mul_mod(res, powmod(p, v, pe), pe);
37         return res;
38     }
39 };
40 
41 // Factor m into prime powers (trial division)
42 vector<pair<u64,u64>> factor_prime_powers(u64 m){
43     vector<pair<u64,u64>> fac; 
44     for(u64 p=2; p*p<=m; ++p){ if(m%p==0){ u64 e=0; while(m%p==0){ m/=p; ++e; } fac.push_back({p,e}); } }
45     if(m>1) fac.push_back({m,1});
46     return fac;
47 }
48 
49 // Combine x ≡ a1 (mod m1), x ≡ a2 (mod m2), with gcd(m1,m2)=1
50 pair<u64,u64> crt_combine(u64 a1, u64 m1, u64 a2, u64 m2){
51     // Solve: m1*t ≡ (a2 - a1) (mod m2)
52     long long x0=1,y0=0,x1=0,y1=1; long long aa=(long long)m1, bb=(long long)m2;
53     while(bb){ long long q=aa/bb; long long aa2=aa-q*bb; aa=bb; bb=aa2; long long x2=x0-q*x1; x0=x1; x1=x2; long long y2=y0-q*y1; y0=y1; y1=y2; }
54     // gcd should be 1: aa==1
55     long long inv_m1_mod_m2 = (x0 % (long long)m2 + (long long)m2) % (long long)m2;
56     u64 t = mul_mod((u64)((a2 + m2 - (a1 % m2)) % m2), (u64)inv_m1_mod_m2, m2);
57     u64 res = a1 + (u64)((i128)m1 * t % (i128)(m1*m2));
58     u64 mod = m1 * m2;
59     res %= mod;
60     return {res, mod};
61 }
62 
63 int main(){
64     ios::sync_with_stdio(false);
65     cin.tie(nullptr);
66 
67     // Input: m Q then Q lines: n k
68     // For each query, compute C(n,k) mod m via prime-power computations and CRT.
69     // Example:
70     // 1000 3
71     // 12 7
72     // 100 30
73     // 7 3
74 
75     u64 m; int Q; if(!(cin>>m>>Q)) return 0;
76     auto fac = factor_prime_powers(m);
77 
78     // Build data per prime power
79     vector<PrimePowerData> blocks; blocks.reserve(fac.size());
80     vector<u64> moduli; moduli.reserve(fac.size());
81     for(auto [p,e]: fac){ blocks.emplace_back(p,e); u64 pe=1; for(u64 i=0;i<e;i++) pe*=p; moduli.push_back(pe); }
82 
83     while(Q--){
84         u64 n,k; cin>>n>>k;
85         // Compute residues per block
86         vector<u64> rems; rems.reserve(blocks.size());
87         for(size_t i=0;i<blocks.size();++i){ rems.push_back(blocks[i].C_mod_pe(n,k)); }
88         // Combine via CRT
89         u64 x = rems[0], M = moduli[0];
90         for(size_t i=1;i<rems.size();++i){ auto [nx, nM] = crt_combine(x, M, rems[i], moduli[i]); x=nx; M=nM; }
91         cout << (x % m) << '\n';
92     }
93     return 0;
94 }
95

This program computes C(n, k) modulo an arbitrary composite m. It factors m into prime powers, evaluates each residue using the prime-power method, and then recombines them using CRT. It is a robust template for many competitive programming tasks where the modulus is not prime.

Time: One-time factorization: O(√m). Precompute per prime power: O(p^e). Per query: sum over blocks O(log_p n · log(p^e)) plus O(#blocks · log m) for CRT merging.Space: Sum of O(p^e) over all prime-power factors.

1	#include <bits/stdc++.h>
2	using namespace std;
3
4	using int64 = long long;
5	using i128 = __int128_t;
6
7	// Fast modular exponentiation: computes a^e mod m
8	static inline uint64_t powmod_u64(uint64_t a, uint64_t e, uint64_t m) {
9	uint64_t r = 1 % m;
10	while (e) {
11	if (e & 1) r = (uint64_t)((i128)r * a % m);
12	a = (uint64_t)((i128)a * a % m);
13	e >>= 1;
14	}
15	return r;
16	}
17
18	struct LucasPrime {
19	uint64_t p; // prime modulus
20	vector<uint64_t> fact; // factorials mod p up to p-1
21	vector<uint64_t> invfact; // inverse factorials mod p up to p-1
22
23	explicit LucasPrime(uint64_t prime): p(prime) {
24	// Precompute factorial and inverse factorial modulo prime p
25	fact.resize(p);
26	invfact.resize(p);
27	fact[0] = 1;
28	for (uint64_t i = 1; i < p; ++i) fact[i] = (fact[i-1] * i) % p;
29	// Fermat's little theorem for inverse factorial of (p-1)!
30	invfact[p-1] = powmod_u64(fact[p-1], p-2, p);
31	for (uint64_t i = p-1; i > 0; --i) invfact[i-1] = (invfact[i] * i) % p;
32	}
33
34	// Small binomial C(n, k) for 0 <= n, k < p
35	uint64_t C_small(uint64_t n, uint64_t k) const {
36	if (k > n) return 0;
37	return (((fact[n] * invfact[k]) % p) * invfact[n - k]) % p;
38	}
39
40	// Lucas: C(n, k) mod p for arbitrarily large n, k
41	uint64_t C(uint64_t n, uint64_t k) const {
42	if (k > n) return 0;
43	uint64_t res = 1;
44	while (n > 0 \|\| k > 0) {
45	uint64_t ni = n % p;
46	uint64_t ki = k % p;
47	if (ki > ni) return 0; // one digit invalid → product is zero mod p
48	res = (res * C_small(ni, ki)) % p;
49	n /= p;
50	k /= p;
51	}
52	return res;
53	}
54	};
55
56	int main() {
57	ios::sync_with_stdio(false);
58	cin.tie(nullptr);
59
60	// Example usage:
61	// Compute many C(n, k) modulo a fixed prime p efficiently.
62	// Input format:
63	// p Q\n then Q lines of n k
64	// Example:
65	// 5 4
66	// 12 7
67	// 1000000000000 3
68	// 20 10
69	// 7 9
70
71	uint64_t p; int Q;
72	if (!(cin >> p >> Q)) return 0;
73	LucasPrime lucas(p);
74	while (Q--) {
75	uint64_t n, k; cin >> n >> k;
76	cout << lucas.C(n, k) << '\n';
77	}
78	return 0;
79	}
80

1	#include <bits/stdc++.h>
2	using namespace std;
3	using u64 = unsigned long long;
4	using i128 = __int128_t;
5
6	static inline u64 mul_mod(u64 a, u64 b, u64 m) { return (u64)((i128)a * b % m); }
7	static inline u64 powmod(u64 a, u64 e, u64 m) { u64 r=1%m; while(e){ if(e&1) r=mul_mod(r,a,m); a=mul_mod(a,a,m); e>>=1;} return r; }
8
9	// Legendre v_p(n!)
10	static inline u64 vp_fact(u64 n, u64 p){ u64 v=0; while(n){ n/=p; v+=n; } return v; }
11
12	// Data for one prime power p^e
13	struct PrimePowerData {
14	u64 p,e,pe; vector<u64> pref; u64 cyc;
15	PrimePowerData(u64 p_, u64 e_): p(p_), e(e_) {
16	pe=1; for(u64 i=0;i<e;i++) pe*=p;
17	pref.assign(pe+1,1);
18	for(u64 i=1;i<=pe;i++) pref[i] = (i%p? mul_mod(pref[i-1], i%pe, pe): pref[i-1]);
19	cyc = pref[pe];
20	}
21	u64 fact_red(u64 n) const { if(!n) return 1%pe; u64 r=fact_red(n/p); r=mul_mod(r, pref[n%pe], pe); r=mul_mod(r, powmod(cyc, n/pe, pe), pe); return r; }
22	u64 inv_mod(u64 a) const { // inverse modulo pe, gcd(a,pe)=1
23	long long x0=1,y0=0,x1=0,y1=1; long long aa=(long long)(a%pe), bb=(long long)pe;
24	while(bb){ long long q=aa/bb; long long aa2=aa-qbb; aa=bb; bb=aa2; long long x2=x0-qx1; x0=x1; x1=x2; long long y2=y0-q*y1; y0=y1; y1=y2; }
25	// aa is gcd, should be 1
26	long long x = (x0 % (long long)pe + (long long)pe) % (long long)pe;
27	return (u64)x;
28	}
29	u64 C_mod_pe(u64 n, u64 k) const {
30	if(k>n) return 0;
31	u64 v = vp_fact(n,p) - vp_fact(k,p) - vp_fact(n-k,p);
32	if(v>=e) return 0;
33	u64 num=fact_red(n);
34	u64 den=fact_red(k); den=mul_mod(den, fact_red(n-k), pe);
35	u64 res = mul_mod(num, inv_mod(den), pe);
36	res = mul_mod(res, powmod(p, v, pe), pe);
37	return res;
38	}
39	};
40
41	// Factor m into prime powers (trial division)
42	vector<pair<u64,u64>> factor_prime_powers(u64 m){
43	vector<pair<u64,u64>> fac;
44	for(u64 p=2; p*p<=m; ++p){ if(m%p==0){ u64 e=0; while(m%p==0){ m/=p; ++e; } fac.push_back({p,e}); } }
45	if(m>1) fac.push_back({m,1});
46	return fac;
47	}
48
49	// Combine x ≡ a1 (mod m1), x ≡ a2 (mod m2), with gcd(m1,m2)=1
50	pair<u64,u64> crt_combine(u64 a1, u64 m1, u64 a2, u64 m2){
51	// Solve: m1*t ≡ (a2 - a1) (mod m2)
52	long long x0=1,y0=0,x1=0,y1=1; long long aa=(long long)m1, bb=(long long)m2;
53	while(bb){ long long q=aa/bb; long long aa2=aa-qbb; aa=bb; bb=aa2; long long x2=x0-qx1; x0=x1; x1=x2; long long y2=y0-q*y1; y0=y1; y1=y2; }
54	// gcd should be 1: aa==1
55	long long inv_m1_mod_m2 = (x0 % (long long)m2 + (long long)m2) % (long long)m2;
56	u64 t = mul_mod((u64)((a2 + m2 - (a1 % m2)) % m2), (u64)inv_m1_mod_m2, m2);
57	u64 res = a1 + (u64)((i128)m1 * t % (i128)(m1*m2));
58	u64 mod = m1 * m2;
59	res %= mod;
60	return {res, mod};
61	}
62
63	int main(){
64	ios::sync_with_stdio(false);
65	cin.tie(nullptr);
66
67	// Input: m Q then Q lines: n k
68	// For each query, compute C(n,k) mod m via prime-power computations and CRT.
69	// Example:
70	// 1000 3
71	// 12 7
72	// 100 30
73	// 7 3
74
75	u64 m; int Q; if(!(cin>>m>>Q)) return 0;
76	auto fac = factor_prime_powers(m);
77
78	// Build data per prime power
79	vector<PrimePowerData> blocks; blocks.reserve(fac.size());
80	vector<u64> moduli; moduli.reserve(fac.size());
81	for(auto [p,e]: fac){ blocks.emplace_back(p,e); u64 pe=1; for(u64 i=0;i<e;i++) pe*=p; moduli.push_back(pe); }
82
83	while(Q--){
84	u64 n,k; cin>>n>>k;
85	// Compute residues per block
86	vector<u64> rems; rems.reserve(blocks.size());
87	for(size_t i=0;i<blocks.size();++i){ rems.push_back(blocks[i].C_mod_pe(n,k)); }
88	// Combine via CRT
89	u64 x = rems[0], M = moduli[0];
90	for(size_t i=1;i<rems.size();++i){ auto [nx, nM] = crt_combine(x, M, rems[i], moduli[i]); x=nx; M=nM; }
91	cout << (x % m) << '\n';
92	}
93	return 0;
94	}
95