⚙️ Algorithm · Intermediate

Bitmask DP

Key Points

  • Bitmask DP compresses the state of a subset of n elements into an integer mask, enabling elegant dynamic programming over all subsets.
  • It is ideal when n is small (typically n ≤ 20–24) and the logic depends on which items are included, not their order.
  • Classic examples include the Held–Karp TSP, assignment, Steiner Tree on small terminal sets, and SOS DP (sum over submasks).
  • Subset iteration tricks like for (int s = mask; s; s = (s - 1) & mask) let you loop over all submasks in O(2^{k}) for a mask with k set bits.
  • The total work of iterating submasks for all masks is O(3^{n}), while many optimized DPs run in O(n 2^{n}) or O(n^2 2^{n}).
  • Space is often O(n 2^{n}) or O(2^{n}); be mindful of memory limits, using 64-bit types for costs and unsigned shifts for masks.
  • Bit operations (set, clear, test, lsb) make transitions efficient and expressive.
  • Plan states and transitions first on paper; good state design is the key to correct and fast Bitmask DP.

Prerequisites

  • Bitwise operations: Mask manipulation relies on set, clear, toggle, and test operations as well as shifts.
  • Dynamic programming fundamentals: You must understand states, transitions, base cases, and optimal substructure to design DP over masks.
  • Combinatorics of subsets: Knowing there are 2^n subsets and how to iterate them informs feasibility and complexity estimates.
  • Graph shortest paths (optional for TSP variants): Precomputing pairwise distances often precedes TSP-like Bitmask DP.
  • Time and space complexity analysis: To judge if n is small enough and pick O(n 2^n) vs O(3^n) designs.

Detailed Explanation


01 Overview

Bitmask DP is a dynamic programming technique where the state represents a subset of elements using a bitmask. A bitmask is an integer whose binary representation encodes which items from a set of n elements are included (bit 1) or excluded (bit 0). This compression turns set-based problems into array-indexable states, enabling transitions that add or remove elements in O(1) time using bit operations. Because there are 2^{n} subsets, algorithms that iterate over all subsets typically run in O(2^{n} × poly(n)) time. The technique shines when n is small (roughly up to 20–24) but the search space over combinations is otherwise exponential and hard to manage. Classic problems include the Traveling Salesman Problem (TSP) via Held–Karp, assignment problems, and various subset transforms like SOS DP (sum over submasks). The power of Bitmask DP comes from two ingredients: a well-designed state capturing exactly what matters (e.g., which cities have been visited and the current city), and efficient transitions expressed with bit operations. With careful design, you can often reduce a factorial brute force (n!) to 2^{n} × poly(n), which is a dramatic improvement for small n.
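
For instance (a throwaway C++ sketch, not tied to any particular problem), the subset {0, 2, 3} of a 5-element ground set is the integer 0b01101 = 13, and membership tests, insertions, and removals are each a single O(1) bit operation:

#include <bits/stdc++.h>
using namespace std;

int main() {
    int mask = (1 << 0) | (1 << 2) | (1 << 3); // subset {0, 2, 3} -> binary 01101 -> 13

    bool hasOne = (mask >> 1) & 1;             // membership test: is element 1 included? (no)
    mask |= (1 << 4);                          // insert element 4 -> {0, 2, 3, 4}
    mask &= ~(1 << 0);                         // remove element 0 -> {2, 3, 4}

    cout << "mask = " << mask
         << ", contains 1? " << hasOne
         << ", size = " << __builtin_popcount(mask) << "\n";
    return 0;
}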

02 Intuition & Analogies

Imagine packing a suitcase with n different items. At any moment, you only care which items are already in the suitcase—not the order you placed them. If you took a photo of the suitcase, you could describe its contents by a row of n lights: a light is on if the item is inside, off otherwise. This row of lights is your bitmask, and each unique setting of lights is a subset of items. Now suppose you’re planning a trip where the cost depends on which cities you have visited so far and where you are currently standing. You can summarize your history with that same row of lights (visited cities) plus the city you’re currently at. When you move to a new city, you flip that city’s light on. Each light flip is a constant-time bit operation. Bitmask DP is about turning these intuitive “lights on/off” states into a systematic table of minimal (or maximal) costs, counts, or feasibility flags. Another metaphor: think of a combination lock with n switches. Each configuration encodes a partial solution. Dynamic programming then says: if I know the best solution for one configuration, I can quickly get the best for a neighboring configuration by toggling a single switch. Subset iteration tricks are like saying: for a fixed configuration, I can effortlessly scan through all its simpler configurations by turning off lights one by one, following a well-known pattern. This gives us smooth, local moves across an otherwise enormous space, letting us compute globally optimal answers without enumerating permutations.

03 Formal Definition

Let U = {0, 1, ..., n-1} be a ground set. A subset S ⊆ U is represented by an integer mask M ∈ [0, 2^{n}) with bit i set iff i ∈ S. A Bitmask DP is a dynamic program where states are indexed by masks and possibly additional small indices. Typical forms include: (1) One-parameter DP: dp[M] stores an optimal value (min, max, count) for subset M. Transitions add or remove one element: dp[M ∪ {i}] ← combine(dp[M], cost(M, i)). (2) Two-parameter DP: dp[M][j] stores an optimal value for subset M with distinguished element j (e.g., last city). Transitions often are dp[M ∪ {k}][k] ← min(dp[M ∪ {k}][k], dp[M][j] + w(j, k)). The computational complexity follows from the number of masks (2^{n}) times the per-state transitions (often O(n)), yielding O(n 2^{n}). Common transforms such as sum-over-subsets (SOS) define F(M) = Σ_{S ⊆ M} f(S) and admit O(n 2^{n}) dynamic programming via dimension-wise updates over bits. Correctness rests on substructure and monotonicity: each state’s optimal value is derived by considering how to extend or reduce smaller subsets, ensuring no state is recomputed redundantly. Implementation uses integer bit operations: test (M >> i) & 1, set M | (1 << i), clear M & ~(1 << i), iterate submasks s = (s - 1) & M.
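
The one-parameter form can be written as a generic skeleton like the following sketch, where costOfAdding(M, i) is a placeholder for whatever the problem charges for adding element i to subset M (the demo cost here is arbitrary):

#include <bits/stdc++.h>
using namespace std;

// Placeholder cost of adding element i to the subset encoded by mask M (assumption for this sketch)
long long costOfAdding(int M, int i) {
    return __builtin_popcount((unsigned)M) + i + 1; // arbitrary demo cost
}

int main() {
    int n = 4;
    int N = 1 << n;
    const long long INF = (1LL << 60);
    vector<long long> dp(N, INF);
    dp[0] = 0; // base case: the empty subset costs nothing

    // dp[M | (1 << i)] <- min(dp[M | (1 << i)], dp[M] + cost(M, i)), O(n * 2^n) overall
    for (int M = 0; M < N; ++M) {
        if (dp[M] == INF) continue;                   // skip unreachable states
        for (int i = 0; i < n; ++i) {
            if (M & (1 << i)) continue;               // element i already in the subset
            int nm = M | (1 << i);
            dp[nm] = min(dp[nm], dp[M] + costOfAdding(M, i));
        }
    }
    cout << "best value for the full set: " << dp[N - 1] << "\n";
    return 0;
}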

04 When to Use

Use Bitmask DP when your problem depends only on which subset of items has been chosen so far, not the order in which they were chosen, and when n is small (usually n ≤ 20–24). Go-to scenarios: (1) Path planning over small node sets, like TSP/Hamiltonian path with precomputed distances—dp[mask][last] is standard. (2) Assignment or scheduling where each task is assigned to exactly one agent—dp[mask] where mask encodes tasks already taken. (3) Exact cover, set cover variants, or selecting subsets under compatibility constraints—dp over masks with feasibility checks via bit operations. (4) Subset transforms such as counting or summing over all submasks (SOS DP) or supersets, for convolution-like computations on the hypercube. (5) DP on small terminal sets in graphs, like Steiner Tree with terminals encoded by a mask while graph nodes are handled in an outer loop. Avoid using Bitmask DP when n is large (e.g., n > 25) because 2^{n} states become infeasible in time or memory; consider greedy, flow, matching, tree/graph DP, or approximation/heuristics instead. Also, if the history matters beyond membership (like exact order with many permutations), rethink the state or precompute equivalences that allow collapsing order into a few parameters.
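
As a small sketch of scenario (3), the compatibility check at the heart of such DPs is a single mask test. The conflict[i] data below (the mask of items that clash with item i) is invented purely for illustration:

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n = 4;
    // conflict[i]: bitmask of items incompatible with item i
    // (demo data: items 0 and 1 clash, items 2 and 3 clash)
    vector<int> conflict = {0b0010, 0b0001, 0b1000, 0b0100};

    int N = 1 << n, feasible = 0;
    for (int mask = 0; mask < N; ++mask) {
        bool ok = true;
        for (int i = 0; i < n && ok; ++i)
            if ((mask >> i) & 1)
                ok = (mask & conflict[i]) == 0; // no chosen item may conflict with item i
        feasible += ok;
    }
    // 3 choices for {0,1} times 3 choices for {2,3} = 9 compatible subsets (empty set included)
    cout << "compatible subsets: " << feasible << "\n";
    return 0;
}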

⚠️ Common Mistakes

  1. Blowing memory: dp of size n × 2^{n} can exceed RAM (e.g., 20 × 2^{20} ≈ 20M entries; at 8 bytes each, ≈ 160 MB). Use int8/int16 when possible, compress to 1D dp, or prune states.
  2. Wrong mask transitions: forgetting to test whether a bit is already on before setting it, or mixing up set/clear operations. Always use if ((mask & (1 << i)) == 0) when adding i, or mask ^ (1 << i) only when you are sure bit i is set.
  3. Using signed 1 << n for shifts: 1 << n overflows for n ≥ 31 on 32-bit int. Use 1u << n or size_t(1) << n.
  4. Not initializing dp with INF or impossible sentinels, causing accidental minima. Set INF to a large 64-bit value and check dp-state validity before transitions.
  5. Forgetting base cases: e.g., dp[1 << s][s] = 0 in TSP.
  6. Incorrect submask iteration: for (int s = mask; s; s = (s - 1) & mask) misses s = 0; include it separately if needed (see the sketch after this list).
  7. Double counting in counting DPs: ensure transitions are mutually exclusive (e.g., assign the k-th person next, not any person twice).
  8. Misplaced complexity expectations: submask-of-mask loops are O(2^{k}) for a fixed mask with k bits, but summing over all masks is O(3^{n}); design transitions that avoid an extra submask loop when possible (e.g., use SOS DP).
  9. Using int for costs that can overflow; prefer long long.
  10. Path reconstruction not stored: if you need the actual subset sequence or assignment, keep parent pointers.
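
Several of these pitfalls (notably 3, 4, and 6) collected into one defensive snippet; treat it as a checklist in code form rather than a solution to any particular problem:

#include <bits/stdc++.h>
using namespace std;

int main() {
    // Pitfall 3: use a 64-bit literal when the shift amount can reach 31 or more
    int bits = 40;
    unsigned long long full = (1ULL << bits) - 1; // 1 << bits would overflow a 32-bit int

    // Pitfall 4: initialize with a large sentinel and skip invalid states before relaxing
    const long long INF = (1LL << 60);
    vector<long long> dp(1 << 10, INF);
    dp[0] = 0;

    // Pitfall 6: submask loop that also visits the empty submask exactly once
    unsigned mask = 0b1011;
    int count = 0;
    for (unsigned s = mask; ; s = (s - 1) & mask) {
        ++count;            // process submask s here
        if (s == 0) break;  // handle s = 0, then stop
    }

    cout << "full = " << full
         << ", submasks of 0b1011 = " << count   // 2^popcount(0b1011) = 8
         << ", dp[0] = " << dp[0] << "\n";
    return 0;
}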

Key Formulas

Number of subsets
|{S : S ⊆ U}| = 2^n, where |U| = n

Explanation: There are 2^n different subsets of an n-element set. Each element can be either included or excluded.

Held–Karp TSP Recurrence
dp[M][j] = min_{i ∈ M \ {j}} ( dp[M \ {j}][i] + w(i, j) )

Explanation: The best way to end at city j having visited exactly M is to come from some i already in M \ {j} and pay the edge cost w(i, j). Base case: dp[{s}][s] = 0 for start s.

TSP Time Complexity
O(n^2 2^n)

Explanation: For each of 2^n masks and n choices of last city, we try up to n transitions. This yields n^2 2^n up to constant factors.

Sum Over Subsets (SOS)
F(M) = Σ_{S ⊆ M} f(S)

Explanation: Defines F(M) as the sum of f over all submasks S of M. This can be computed for all M in O(n 2^n) using a bitwise DP transform.

SOS DP Complexity
O(n 2^n)

Explanation: The standard SOS transform iterates over n bit positions and all 2^n masks, updating in O(1) per iteration.

Total Submask Iterations
Σ_{k=0}^{n} C(n, k) 2^k = 3^n

Explanation: Summing the number of submasks (2^k) over all masks with k set bits yields 3^n. This bounds algorithms that iterate all submasks of all masks.

Popcount via Bits
popcount(M) = Σ_{i=0}^{n-1} ((M >> i) & 1)

Explanation: The popcount is the total number of 1-bits in M. Many DPs use popcount(M) to infer how many items have been chosen so far.

Assignment DP Transition
dp[M | (1 << j)] = min( dp[M | (1 << j)], dp[M] + cost[popcount(M)][j] ), for j ∉ M

Explanation: Assign the next person (inferred from popcount) to task j not yet taken, updating the minimal cost for the new mask.

Submask Loop Update
S ← (S - 1) & M, starting from S = M

Explanation: Given a fixed mask M, this expression iterates S over all submasks of M. It is used to traverse submasks in O(2^{k}) time where k = popcount(M).

Exponential but Better than Factorial
2^n · poly(n) ≪ n! for moderate n (e.g., 2^20 ≈ 10^6 while 20! ≈ 2.4 × 10^18)

Explanation: For many combinatorial problems, reducing factorial complexity to exponential 2^n via Bitmask DP is a major improvement when n is small.

Complexity Analysis

Bitmask DP algorithms typically have time complexity proportional to the number of masks (2^n) times the cost of transitions per mask. If each update examines O(n) possibilities (e.g., adding one of n items), the total time is O(n 2^n). When transitions also depend on a distinguished element (like the TSP last city), an extra factor of n appears, leading to O(n^2 2^n). Transforms like SOS DP achieve O(n 2^n) by performing dimension-wise updates over bits instead of naively summing all submasks in O(3^n). Submask-iteration-based algorithms can degrade to O(3^n) if they loop over all masks and all submasks per mask; careful redesign to use transforms can restore O(n 2^n). Space complexity is often O(2^n) for single-parameter DPs (dp[mask]) or O(n 2^n) for two-parameter DPs (dp[mask][last]). Memory can be the bottleneck: for n = 22, storing a 64-bit dp array of size n·2^n requires roughly 22 × 4,194,304 × 8 ≈ 738 MB, which is too large for typical contest limits. Techniques to mitigate space include rolling arrays over the mask dimension, storing only feasible masks, using 32-bit integers when values fit, or compressing dimensions (e.g., keeping only dp for the current last instead of all last when possible). Bit operations are O(1), so arithmetic on masks is not the dominant cost; rather, the exponential number of states is. Always profile the intended n and pick the DP form (O(n 2^n) vs O(n^2 2^n) vs O(3^n)) that fits the time and memory constraints.
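
Before committing to a state design, a quick estimate like the following throwaway sketch shows why dp[mask][last] with 64-bit values stops fitting around n ≈ 22–24:

#include <bits/stdc++.h>
using namespace std;

int main() {
    cout << fixed << setprecision(1);
    for (int n : {16, 20, 22, 24}) {
        unsigned long long states = (unsigned long long)n << n; // n * 2^n entries for dp[mask][last]
        cout << "n = " << n
             << ": entries = " << states
             << ", 8-byte dp ~ " << states * 8 / 1e6 << " MB"
             << ", 4-byte dp ~ " << states * 4 / 1e6 << " MB\n";
    }
    return 0;
}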

Code Examples

Held–Karp TSP: Minimum Hamiltonian Cycle with Path Reconstruction
#include <bits/stdc++.h>
using namespace std;

// Held–Karp TSP DP: O(n^2 * 2^n) time, O(n * 2^n) space
// Input:
//   n (number of cities), followed by n x n cost matrix (long long)
//   Assumes a complete graph and cost[i][i] = 0. Returns to start city 0.
// Output:
//   Minimum tour cost and one optimal tour starting at 0.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    if (!(cin >> n)) return 0;
    const long long INF = (1LL << 60);
    vector<vector<long long>> w(n, vector<long long>(n));
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) cin >> w[i][j];
    }

    int N = 1 << n; // total number of masks
    // dp[mask][j] = minimum cost to start at 0, visit exactly 'mask', and end at city j
    vector<vector<long long>> dp(N, vector<long long>(n, INF));
    vector<vector<int>> parent(N, vector<int>(n, -1)); // to reconstruct path

    // Base: only city 0 visited and we're at 0
    dp[1 << 0][0] = 0;

    // Iterate over all masks and last city j in mask
    for (int mask = 0; mask < N; ++mask) {
        if ((mask & 1) == 0) continue; // every reachable state includes start city 0
        for (int j = 0; j < n; ++j) {
            if (!(mask & (1 << j))) continue; // j must be in mask
            long long cur = dp[mask][j];
            if (cur == INF) continue;
            // Try to go to a next city not in mask yet
            int rem = ((1 << n) - 1) ^ mask; // cities not yet visited
            for (int k = rem; k; k &= (k - 1)) {
                int bit = k & -k;             // least significant not-yet-visited bit
                int nxt = __builtin_ctz(bit); // index of that bit
                int nmask = mask | (1 << nxt);
                long long cand = cur + w[j][nxt];
                if (cand < dp[nmask][nxt]) {
                    dp[nmask][nxt] = cand;
                    parent[nmask][nxt] = j;
                }
            }
        }
    }

    // Close the cycle: return to 0 from some last city j
    long long ans = INF;
    int last = -1;
    int full = N - 1;
    for (int j = 1; j < n; ++j) {
        long long cand = dp[full][j] + w[j][0];
        if (cand < ans) {
            ans = cand;
            last = j;
        }
    }

    if (n == 1) {
        cout << 0 << "\n" << 0 << "\n"; // trivial case
        return 0;
    }

    cout << ans << "\n";

    // Reconstruct path backwards from (full, last); the walk ends at the base state,
    // so the start city 0 is already included in 'tour'
    vector<int> tour;
    int mask = full;
    int j = last;
    while (j != -1) {
        tour.push_back(j);
        int pj = parent[mask][j];
        mask ^= (1 << j);
        j = pj;
    }
    reverse(tour.begin(), tour.end());

    // Print a Hamiltonian cycle starting and ending at 0
    // We'll print n+1 nodes to show the cycle explicitly
    for (int i = 0; i < n; ++i) cout << tour[i] << ' ';
    cout << 0 << '\n';
    return 0;
}

This is the classic Held–Karp DP for TSP. The state (mask, j) means we have visited exactly the cities in mask and are currently at city j. We expand by trying to visit an unvisited city nxt. After filling dp, we minimize over the last city j and add the edge back to 0. We also store parents to reconstruct an optimal tour.

Time: O(n^2 2^n) · Space: O(n 2^n)

Assignment Problem via Bitmask DP (n up to ~20)
#include <bits/stdc++.h>
using namespace std;

// dp[mask] = minimal cost to assign the first popcount(mask) people to the set of tasks in 'mask'
// Transition: pick next person k = popcount(mask), assign them to any task j not in mask
// Input: n, then n x n cost matrix cost[k][j]

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    if (!(cin >> n)) return 0;
    const long long INF = (1LL << 60);
    vector<vector<long long>> cost(n, vector<long long>(n));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            cin >> cost[i][j];

    int N = 1 << n;
    vector<long long> dp(N, INF);
    vector<int> parentTask(N, -1); // which task was assigned to reach this mask
    vector<int> parentMask(N, -1); // previous mask
    dp[0] = 0;

    for (int mask = 0; mask < N; ++mask) {
        long long cur = dp[mask];
        if (cur == INF) continue;
        int k = __builtin_popcount((unsigned)mask); // next person index
        if (k == n) continue; // all assigned
        int rem = ((1 << n) - 1) ^ mask; // tasks not yet taken
        for (int s = rem; s; s &= (s - 1)) {
            int bit = s & -s;
            int j = __builtin_ctz(bit);
            int nmask = mask | (1 << j);
            long long cand = cur + cost[k][j];
            if (cand < dp[nmask]) {
                dp[nmask] = cand;
                parentTask[nmask] = j;
                parentMask[nmask] = mask;
            }
        }
    }

    long long ans = dp[N - 1];
    cout << ans << '\n';

    // Reconstruct assignment: person i assigned to task assign[i]
    vector<int> assign(n, -1);
    int mask = N - 1;
    for (int i = n - 1; i >= 0; --i) {
        int j = parentTask[mask];
        assign[i] = j;
        mask = parentMask[mask];
    }
    for (int i = 0; i < n; ++i) cout << "person " << i << " -> task " << assign[i] << '\n';
    return 0;
}

The dp[mask] stores the minimum cost after assigning tasks in 'mask' to the first popcount(mask) people. The next person k is determined by the number of set bits. We try each available task j and update dp[mask | (1<<j)]. Parent arrays allow reconstructing the assignment.

Time: O(n 2^n) · Space: O(2^n)

SOS DP: Sum Over All Submasks for Every Mask
#include <bits/stdc++.h>
using namespace std;

// Computes F[mask] = sum_{S subseteq mask} f[S] for all masks in O(n * 2^n)
// Input:
//   n (bits), then array f of length 2^n (long long)
// Output:
//   F array of length 2^n

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    if (!(cin >> n)) return 0;
    int N = 1 << n;
    vector<long long> f(N), F(N);
    for (int m = 0; m < N; ++m) cin >> f[m];
    F = f; // initialize

    // Standard SOS DP: for each bit i, add contributions from masks with i-th bit off
    for (int i = 0; i < n; ++i) {
        for (int mask = 0; mask < N; ++mask) {
            if (mask & (1 << i)) {
                F[mask] += F[mask ^ (1 << i)];
            }
        }
    }

    for (int m = 0; m < N; ++m) {
        cout << F[m] << (m + 1 == N ? '\n' : ' ');
    }
    return 0;
}

We compute F[mask] = Σ_{S ⊆ mask} f[S] by sweeping each bit i and adding the value from the same mask with bit i cleared. This is the classic O(n 2^n) SOS transform, much faster than explicitly iterating all submasks for every mask (which would be O(3^n)).

Time: O(n 2^n) · Space: O(2^n)

Enumerating Submasks and Including the Empty Submask
#include <bits/stdc++.h>
using namespace std;

// Demonstrates submask iteration: for a fixed mask M, iterate all submasks S of M
// including S = 0. Also shows how many submasks there are (2^{popcount(M)}).

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n; unsigned M;
    if (!(cin >> n >> M)) return 0; // read n (bits) and a mask M (as an unsigned integer)

    vector<unsigned> subs;
    // Non-empty submasks
    for (unsigned S = M; S; S = (S - 1) & M) subs.push_back(S);
    subs.push_back(0); // include empty submask explicitly

    sort(subs.begin(), subs.end());
    subs.erase(unique(subs.begin(), subs.end()), subs.end());

    cout << "Submasks of M:" << '\n';
    for (unsigned S : subs) {
        cout << S << " (bits set: " << __builtin_popcount(S) << ")\n";
    }
    cout << "Total submasks: " << subs.size() << "\n";
    cout << "Expected: 2^{popcount(M)} = " << (1u << __builtin_popcount(M)) << "\n";
    return 0;
}

This small program shows the canonical loop S = (S - 1) & M to enumerate all non-empty submasks of M, and then adds S = 0 separately. It also verifies that the number of submasks equals 2^{popcount(M)}.

Time: O(2^{k}) where k = popcount(M) · Space: O(2^{k}) to store and print submasks