Stirling Numbers of Second Kind
Key Points
- â˘Stirling numbers of the second kind S(n,k) count how many ways to split n labeled items into k non-empty, unlabeled groups.
- â˘They satisfy the recurrence S(n,k) = k S(n-1,k) + S(n-1,k-1) with base cases S(0,0)=1 and S(n,0)=0 for n>0.
- â˘A closed form using inclusionâexclusion is S(n,k) = (1/k!) (-1)^i (k-i)^n.
- â˘Bell numbers B(n) = S(n,k) count all set partitions of n elements.
- â˘The number of surjections from an n-element set onto a k-element set equals k! S(n,k).
- â˘Exponential generating functions package these numbers neatly: S(n,k) /n! = ( - 1)^k/k!.
- â˘In C++, you can compute S(n,k) with O(nk) DP, or use the closed form in O(k MOD) modulo a prime.
- â˘Common pitfalls include mixing first vs. second kind Stirling numbers and forgetting boundary cases like S(n,n)=1.
Prerequisites
- âBasic combinatorics (counting, binomial coefficients) â Understanding partitions, combinations, and factorials is essential to follow the recurrence and closed form.
- âInclusionâexclusion principle â The closed form for S(n,k) is derived by inclusionâexclusion over empty blocks.
- âModular arithmetic and modular inverses â Efficient computation modulo a prime requires Fermat inverses for factorial division.
- âBinary exponentiation (fast power) â The closed form uses (kâi)^n, which must be computed efficiently for large n.
- âDynamic programming â The standard O(nk) method to compute S(n,k) relies on a DP recurrence.
Detailed Explanation
Tap terms for definitions01Overview
Stirling numbers of the second kind, denoted S(n,k), answer a very natural counting question: in how many ways can you partition a set of n distinct (labeled) elements into exactly k non-empty, unlabeled subsets? For example, if n=4 and k=2, we count the different ways to split {1,2,3,4} into two groups, ignoring the order of the groups but respecting which elements are together. These numbers appear throughout combinatorics, probability, and computer science, especially in problems about grouping, hashing, and distributing tasks or keys. They connect to surjections (onto functions), to Bell numbers (the total number of partitions), and to generating functions that compress entire sequences into succinct analytic objects. Algorithmically, S(n,k) can be computed via a simple dynamic programming recurrence or by an inclusionâexclusion closed form that is often efficient modulo a prime. Beyond raw counting, they serve as change-of-basis coefficients between ordinary powers x^n and falling factorials x^{\underline{k}}, a key tool in finite calculus and polynomial interpolation. Understanding Stirling numbers of the second kind equips you with a versatile lens for partition-type problems and a bridge between combinatorial reasoning and algebraic identities.
02Intuition & Analogies
Imagine you are organizing students into study groups. You have n distinct students and want exactly k study groups, with no empty group. How many different groupings are possible if the groups themselves have no names? That number is S(n,k). The groups are like unlabeled boxes; only which students share a box matters. A useful way to build intuition is to add students one at a time. Suppose you already placed nâ1 students into k non-empty groups. When the nth student arrives, there are two types of choices: either join one of the existing k groups (k options), or start a brand-new group. But since we must end with exactly k groups, the second option is only valid if we previously had kâ1 groups; this motivates the recurrence S(n,k) = k S(nâ1,k) + S(nâ1,kâ1). Another viewpoint comes from functions. Consider all functions from an n-element set to a k-element set of labels (say, colors). If you only care which color classes are non-empty and forget the colorsâ names, then every surjective function corresponds to a partition into k blocks, while different permutations of colors donât change the partition. Counting surjections gives k! S(n,k), so dividing by k! returns S(n,k). Finally, the inclusionâexclusion formula says: start by treating color classes as distinguishable, count all functions (k^n), then subtract those where at least one color class is empty, and so on. This alternating sum neatly collapses into the clean closed form for S(n,k). These storiesâgradually placing elements or transforming labeled colorings into unlabeled groupsâare the everyday metaphors behind the algebra.
03Formal Definition
04When to Use
Use S(n,k) when the core structure is âpartition n labeled items into k non-empty, unlabeled groups.â Typical scenarios include distributing distinct tasks among k identical servers so that each server gets at least one task; grouping test cases into k batches without batch names; or analyzing hash functions where you care about which buckets are non-empty, not their identities. In counting surjections, S(n,k) is central because the number of onto functions from an n-set to a k-set is k! S(n,k). This appears in occupancy problems, coupon-collector variants, and probability of no-empty-bucket events in hashing. For algorithmic purposes: if you need all S(n,\cdot) for a fixed n (e.g., to compute Bell numbers), build the DP table in O(nk). If you only need a single S(n,k) for moderately large k under a prime modulus, the inclusionâexclusion closed form with fast exponentiation and precomputed factorials is typically faster. In algebraic manipulation of sequences and finite calculus, use the identity x^n = \sum_k S(n,k) x^{\underline{k}} to move between standard powers and falling factorials, enabling simpler difference operators and Newton series expansions. In generating function techniques, the EGF (e^xâ1)^k/k! often simplifies convolution-like sums involving partitions.
â ď¸Common Mistakes
⢠Confusing first vs. second kind: The first kind counts permutations by cycles; the second kind counts set partitions. Check whether your blocks are cycles (ordered around a circle) or plain groups. ⢠Forgetting boundaries: S(n,0)=0 for n>0, S(n,n)=1, and S(0,0)=1. Off-by-one errors here ripple through DP tables. ⢠Mixing labeled vs. unlabeled groups: S(n,k) uses unlabeled blocks. If your groups are labeled, multiply by k! (e.g., surjections count is k! S(n,k)). ⢠Misusing the recurrence: In S(n,k) = k S(nâ1,k) + S(nâ1,kâ1), the term k S(nâ1,k) comes from inserting the nth element into one of the k existing blocks, not from any binomial coefficient. ⢠Inclusionâexclusion sign errors: The closed form alternates signs with (â1)^i and uses (kâi)^n; a common bug is to use i^n or to skip the factorial division by k!. ⢠Overflow in plain integers: S(n,k) grows quickly. Use 64-bit integers cautiously or compute modulo a prime (e.g., 10^9+7) with modular arithmetic. ⢠Building full DP when unnecessary: If you only need one S(n,k) for large n and k, prefer the closed form with precomputed factorials to avoid O(nk) memory/time. ⢠Confusing surjections with partitions: Surjections count functions onto labeled codomain; partitions ignore label identities. Convert via k! when needed.
Key Formulas
Stirling Recurrence (Second Kind)
Explanation: Insert the nth element into one of k existing blocks (k S(nâ1,k)) or start a new block (S(nâ1,kâ1)). This underlies the standard DP.
Boundary Conditions
Explanation: There is exactly one way to partition the empty set, no way to use zero blocks for non-empty sets, and only one way to split n elements into n singletons.
InclusionâExclusion Closed Form
Explanation: Start from all functions and subtract assignments with empty colors. Dividing by k! removes color labels to yield unlabeled blocks.
Bell Number Sum
Explanation: Summing over all valid k counts all possible partitions of an n-element set. This defines the Bell numbers.
Surjection Count
Explanation: Every surjection induces a partition with labeled colors; for each unlabeled partition there are k! labelings, so multiply by k!.
EGF for Fixed k
Explanation: Blocks are sets of labeled atoms with EGF â1; k blocks combined and unlabeled yield division by k!.
OGF for Fixed k
Explanation: The ordinary generating function factorizes over block counts, revealing poles at x=1/j that govern growth.
Change of Basis
Explanation: Monomials expand in the falling-factorial basis with coefficients S(n,k). This is central in finite calculus and difference equations.
Count Functions by Image Size
Explanation: Partition functions by the size of their image: choose i used labels, count surjections to them (i! S(n,i)), and sum over i.
Useful Special Cases
Explanation: Quick checks and closed forms for edge values that help validate implementations or compute small k efficiently.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Compute S(n,k) for all k up to n using DP recurrence with rolling array. 5 // MOD is prime; we work modulo MOD to avoid overflow. 6 // Time: O(n*k) overall when called for all k up to n. Space: O(k). 7 8 static const int MOD = 1'000'000'007; 9 10 int addmod(long long a, long long b){ a += b; if(a >= MOD) a -= MOD; return (int)a; } 11 int mulmod(long long a, long long b){ return (int)((a * b) % MOD); } 12 13 // Returns vector S[0..k] containing S(n,0..k) modulo MOD. 14 vector<int> stirling2_row(int n, int k){ 15 vector<int> prev(k+1, 0), cur(k+1, 0); 16 // Base: S(0,0)=1 17 prev[0] = 1; 18 for(int i = 1; i <= n; ++i){ 19 cur.assign(k+1, 0); 20 int upto = min(i, k); 21 for(int j = 1; j <= upto; ++j){ 22 // S(i,j) = j*S(i-1,j) + S(i-1,j-1) 23 int term1 = mulmod(j, prev[j]); 24 int term2 = prev[j-1]; 25 cur[j] = addmod(term1, term2); 26 } 27 // S(i,0)=0 already; S(i,i)=1 when i<=k will be set by recurrence naturally 28 swap(prev, cur); 29 } 30 return prev; // now prev holds S(n,0..k) 31 } 32 33 int main(){ 34 ios::sync_with_stdio(false); 35 cin.tie(nullptr); 36 37 int n = 10, K = 10; 38 auto row = stirling2_row(n, K); 39 40 cout << "S(" << n << ",k) for k=0.." << K << " modulo " << MOD << "\n"; 41 for(int k = 0; k <= K; ++k){ 42 cout << "S(" << n << "," << k << ") = " << row[k] << '\n'; 43 } 44 45 // Bell number B(n) = sum_k S(n,k) 46 long long bell = 0; 47 for(int k = 0; k <= K; ++k) bell = (bell + row[k]) % MOD; 48 cout << "Bell B(" << n << ") mod MOD = " << bell << "\n"; 49 return 0; 50 } 51
Fills S(i,j) row by row using S(i,j) = j S(iâ1,j) + S(iâ1,jâ1) under modulo arithmetic. Rolling arrays reduce memory to O(k). We also show how to sum a row to get the Bell number modulo MOD.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Computes S(n,k) mod prime MOD using the closed form: 5 // S(n,k) = 1/k! * sum_{i=0..k} (-1)^i * C(k,i) * (k-i)^n. 6 // Precompute factorials and inverse factorials up to k once. 7 8 static const int MOD = 1'000'000'007; // must be prime for inv via Fermat 9 10 long long modpow(long long a, long long e){ 11 long long r = 1 % MOD; a %= MOD; 12 while(e){ if(e & 1) r = (r * a) % MOD; a = (a * a) % MOD; e >>= 1; } 13 return r; 14 } 15 16 long long inv(long long x){ return modpow(x, MOD-2); } 17 18 struct Comb { 19 vector<long long> fact, invfact; 20 Comb(int N=0){ init(N); } 21 void init(int N){ 22 fact.assign(N+1, 1); 23 for(int i=1;i<=N;++i) fact[i] = fact[i-1]*i % MOD; 24 invfact.assign(N+1, 1); 25 invfact[N] = inv(fact[N]); 26 for(int i=N;i>0;--i) invfact[i-1] = invfact[i]*i % MOD; 27 } 28 long long C(int n,int k) const{ 29 if(k<0||k>n) return 0; 30 return (((fact[n]*invfact[k])%MOD)*invfact[n-k])%MOD; 31 } 32 }; 33 34 long long stirling2_closed_form(long long n, int k, const Comb &cb){ 35 if(k<0 || n<0) return 0; 36 if(k==0) return (n==0); // S(0,0)=1 else 0 37 if(n==0) return 0; // k>0 and n==0 => 0 38 long long sum = 0; 39 for(int i=0;i<=k;++i){ 40 long long ways = cb.C(k, i) * modpow(k - i, n) % MOD; 41 if(i & 1) sum = (sum - ways + MOD) % MOD; else sum = (sum + ways) % MOD; 42 } 43 long long inv_k_fact = cb.invfact[k]; // equals (k!)^{-1} 44 return sum * inv_k_fact % MOD; 45 } 46 47 int main(){ 48 ios::sync_with_stdio(false); 49 cin.tie(nullptr); 50 51 long long n = 1000000000000LL; // large n supported via fast pow 52 int k = 5; 53 54 Comb cb(k); // factorials up to k 55 long long S = stirling2_closed_form(n, k, cb); 56 cout << "S(" << n << "," << k << ") mod " << MOD << " = " << S << "\n"; 57 58 // Surjections count: k! * S(n,k) mod MOD 59 long long surj = (cb.fact[k] * S) % MOD; 60 cout << "#Surjections [n]->[k] mod MOD = " << surj << "\n"; 61 return 0; 62 } 63
Uses the inclusionâexclusion closed form. We precompute factorials and inverse factorials up to k for binomial coefficients modulo a prime. Fast exponentiation handles (kâi)^n even for very large n efficiently, making this ideal when only one S(n,k) is needed.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Compute Bell numbers B(0)..B(N) using Bell triangle modulo MOD. 5 // Bell triangle rules: 6 // Row 0: T[0][0] = 1 => B(0)=1 7 // For r>0: T[r][0] = T[r-1][r-1]; and T[r][c] = (T[r][c-1] + T[r-1][c-1]) for 1<=c<=r 8 // Then B(r) = T[r][0]. 9 10 static const int MOD = 1'000'000'007; 11 12 int addmod(int a, int b){ a += b; if(a>=MOD) a-=MOD; return a; } 13 14 vector<int> bell_numbers(int N){ 15 vector<vector<int>> T(N+1); 16 T[0] = vector<int>(1, 1); // row 0 17 for(int r=1; r<=N; ++r){ 18 T[r].assign(r+1, 0); 19 // first element of row r equals last element of previous row 20 T[r][0] = T[r-1].back(); 21 for(int c=1; c<=r; ++c){ 22 T[r][c] = addmod(T[r][c-1], T[r-1][c-1]); 23 } 24 } 25 vector<int> B(N+1,0); 26 for(int r=0;r<=N;++r) B[r] = T[r][0]; 27 return B; 28 } 29 30 int main(){ 31 ios::sync_with_stdio(false); 32 cin.tie(nullptr); 33 34 int N = 10; 35 auto B = bell_numbers(N); 36 for(int n=0; n<=N; ++n){ 37 cout << "B(" << n << ") = " << B[n] << '\n'; 38 } 39 return 0; 40 } 41
Computes Bell numbers using the Bell triangle, which avoids constructing the entire S(n,k) table explicitly. This is convenient if you only need B(n) values. The recurrence structure mirrors the combinatorial growth of partitions.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Demonstrates the relationship between surjections and S(n,k): 5 // #surjections [n]->[k] = k! * S(n,k) 6 // #ways to distribute n labeled balls into k unlabeled non-empty boxes = S(n,k) 7 // We use DP to compute a single row S(n,0..n) and then compute both quantities. 8 9 static const int MOD = 998244353; // NTT prime (just as an example) 10 11 int addmod(long long a, long long b){ a += b; if(a>=MOD) a-=MOD; return (int)a; } 12 int mulmod(long long a, long long b){ return (int)((a*b)%MOD); } 13 14 vector<int> stirling2_row_dp(int n){ 15 vector<int> prev(n+1, 0), cur(n+1, 0); 16 prev[0] = 1; // S(0,0)=1 17 for(int i=1;i<=n;++i){ 18 cur.assign(n+1, 0); 19 for(int k=1;k<=i;++k){ 20 int term1 = mulmod(k, prev[k]); 21 int term2 = prev[k-1]; 22 cur[k] = addmod(term1, term2); 23 } 24 swap(prev, cur); 25 } 26 return prev; // S(n,0..n) 27 } 28 29 long long fact_mod(int n){ long long f=1; for(int i=2;i<=n;++i) f=(f*i)%MOD; return f; } 30 31 int main(){ 32 ios::sync_with_stdio(false); 33 cin.tie(nullptr); 34 35 int n = 12; // number of labeled balls / domain size 36 auto row = stirling2_row_dp(n); 37 38 for(int k=1; k<=n; ++k){ 39 long long Snk = row[k]; 40 long long distributions = Snk; // unlabeled non-empty boxes 41 long long surjections = (fact_mod(k) * Snk) % MOD; // onto functions [n]->[k] 42 cout << "k=" << k 43 << ": S(n,k)=" << distributions 44 << ", surjections=" << surjections << '\n'; 45 } 46 return 0; 47 } 48
Builds S(n,¡) once via DP, then reports both S(n,k) (unlabeled non-empty box distributions) and k! S(n,k) (surjections) modulo MOD. This ties together the two central interpretations in one program.