Meet in the Middle
Key Points
- Meet-in-the-middle splits a hard exponential search into two halves, enumerates each half, and then combines results efficiently.
- This technique typically reduces time from O(2^n) to O(n · 2^{n/2}) (enumerate each half, sort one, search from the other) while using O(2^{n/2}) memory.
- Classic uses include subset sum, small-n knapsack, and k-sum problems such as 4-sum via pair sums.
- The core idea is to precompute all possibilities of each half, then search for complements using sorting, two-pointers, or hashing.
- Memory is the main constraint because you must store all partial results from at least one half.
- Careful handling of duplicates, negative numbers, and overflow is essential for correctness.
- Pruning dominated states (e.g., in knapsack) keeps the stored frontier small and speeds up queries.
- It shines when n is around 30–50, weights are large, or DP over values/weights is infeasible.
Prerequisites
- Bitmasks and subset enumeration — Enumerating all subsets of a half uses bitmask loops over 2^{m} states.
- Sorting and binary search — Combining halves efficiently requires sorted arrays and upper_bound/lower_bound.
- Two-pointer technique — An alternative to binary search when both halves are sorted and the predicate is monotone.
- Hash maps — Counting complements can be done via frequency maps with expected O(1) lookups.
- Asymptotic notation — Understanding the O(2^{n/2}) vs O(2^n) improvement and trade-offs.
- Knapsack problem basics — MitM knapsack uses Pareto frontiers and dominance pruning on (weight, value) states.
- Integer overflow handling — Subset and pair sums may exceed the 32-bit range; using 64-bit is critical.
Detailed Explanation
01 Overview
Meet-in-the-middle (MitM) is a design pattern for algorithms that face a combinatorial explosion when exploring all possibilities. Instead of brute-forcing all 2^n subsets or all n^k k-tuples directly, we split the input (or decision sequence) into two halves, enumerate all possibilities of each half separately, and then combine the two precomputed sets to build full solutions. Because each half has about n/2 elements, each enumeration has roughly 2^{n/2} possibilities instead of 2^n. After generating the partial results, we use efficient lookups—typically sorting plus binary search/two-pointers, or hashing—to find complementary pairs that satisfy the overall condition (e.g., sums up to a target). This pattern turns an otherwise intractable brute force into something feasible for moderate n while keeping code simple and robust. The trade-off is memory: we store O(2^{n/2}) partial results from at least one half, which can still be large but is far smaller than O(2^n). MitM is especially powerful for subset sum variants, small-n knapsack with large weights, and k-sum problems (like 4-sum) where we can precompute partial sums/products on each side and match them in near-linear time over the precomputed arrays.
02 Intuition & Analogies
Imagine you need to check a receipt for a party: you purchased items from two stores, and your total budget is T. Instead of trying every combination of all items at once, you first write down every possible subtotal from store A and every possible subtotal from store B. Then, for each subtotal x from A, you look for a complement y from B so that x + y is as close as possible to T (or exactly T). It’s much easier to match two smaller lists than to consider every combination of all items together. Another analogy: suppose you’re trying to unlock a safe with a 6-dial code. Brute force is 10^6 possibilities. But if you had a gadget that could precompute all effects of the first 3 dials (1,000 states) and all effects of the last 3 dials (1,000 states), you could then meet in the middle by matching the halves that cancel out to the correct final state. This reduces the hard search to two smaller enumerations and a fast merge step. In programming contests, this manifests as splitting an array into two halves, enumerating all subset sums for each half, sorting one list, and for each element in the other list, using binary search/two-pointers or a hash map to find complements. The power comes from the exponential shrinking: 2^{n/2} is the square root of 2^n, which is an enormous savings even for n around 40.
03 Formal Definition
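Formally: given items a_1, ..., a_n and a predicate over subsets (for example, the subset sum equals or does not exceed a target T), split the index set into halves A and B with |A| = ⌈n/2⌉ and |B| = ⌊n/2⌋. Every subset S ⊆ {1, ..., n} decomposes uniquely as S = S_A ∪ S_B with S_A ⊆ A and S_B ⊆ B, so enumerating all 2^{|A|} partial results of one half and all 2^{|B|} of the other covers the entire 2^n search space. The method pays off when the combine step, matching each partial result from one half against its best or counted complements from the other, can be done with sorting plus binary search, two pointers, or hashing, for O(n · 2^{n/2}) total time and O(2^{n/2}) memory.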
04 When to Use
Use meet-in-the-middle when n is too large for full 2^n enumeration but small enough that 2^{n/2} is manageable (e.g., n ≈ 30–46). It is ideal when: (1) You need subset sums or knapsack with large weights where DP over weight or value is infeasible; (2) You face a k-sum problem and can reduce it to pair sums—4-sum becomes O(n^2) by building all pair sums and matching complements; (3) The constraints are separable across a partition, allowing independent enumeration of halves whose results can be merged via a simple predicate (sum, XOR, equality); (4) You can prune partial results using dominance (knapsack) to keep memory smaller; (5) You require exact counts of solutions (e.g., number of quadruples summing to T) where frequency maps over pair sums provide a clean counting mechanism. Avoid it when memory for O(2^{n/2}) items is still too large, when inputs are huge and hashing/sorting becomes a bottleneck, or when the problem is not easily decomposable into independent halves with a simple merge condition.
⚠️ Common Mistakes
- Forgetting about memory: 2^{20} ≈ 1,048,576 items already take ~8 MB stored as 64-bit integers; keeping multiple vectors or maps multiplies this. Estimate memory before coding.
- Mishandling duplicates: when counting solutions (e.g., 4-sum), pair sums can repeat. Use equal_range on sorted arrays or a hash map with frequencies to count all occurrences correctly.
- Incorrect two-pointer use with negatives: two-pointer merging requires both arrays sorted and a monotone predicate (like sums). With negatives present, two pointers still work, but you must move them correctly; otherwise you will skip valid pairs.
- Integer overflow: subset sums or pair sums can exceed 32-bit limits. Use 64-bit integers (long long), or __int128 when intermediate values may exceed 64 bits.
- Not pruning dominated states in knapsack: if you keep all (weight, value) pairs, the second-half vector may explode. Sort by weight and keep only states whose value beats every lighter state, discarding dominated entries.
- Off-by-one in binary searches: when searching for the largest y such that x + y ≤ T, use upper_bound(T - x) and step back one position, guarding the begin/end boundaries.
- Splitting poorly: uneven halves (e.g., 5 and 35) inflate one side's enumeration and hurt performance. Aim for balanced halves whenever possible.
- Costly recomputation: reuse precomputed arrays across multiple queries if the input is static.
Key Formulas
Exponential Split
2^n = 2^{n/2} · 2^{n/2}
Explanation: The full 2^n search factors into two halves of size 2^{n/2} each. Enumerating both halves and combining pairs covers the whole search space.
Typical MitM Time
O(2^{n/2} log 2^{n/2}) = O(n · 2^{n/2})
Explanation: Time is dominated by sorting one half and performing binary searches for each element of the other half. The logs simplify to O(n) because log_2 2^{n/2} = n/2 = O(n).
Memory Footprint
O(2^{n/2})
Explanation: You must store all partial results from at least one half. Memory grows exponentially with n/2, which is the main practical limitation.
Subset Count
2^m subsets of an m-element set
Explanation: The number of subsets of m items is 2^m. This underlies the enumeration size for each half when computing all subset sums.
Bounded Best Sum
max { x + y : x ∈ S_L, y ∈ S_R, x + y ≤ T }
Explanation: This expression captures the target of maximizing the sum not exceeding T by pairing elements from the two halves.
4-Sum via Pair Sums
count = Σ_s f_{AB}(s) · f_{CD}(T − s)
Explanation: The number of quadruples summing to T equals the convolution of the frequencies of sums from A+B and C+D. We implement this with sorting + equal_range or a hash map.
Dominance in Knapsack
(w1, v1) dominates (w2, v2) ⟺ w1 ≤ w2 and v1 ≥ v2
Explanation: A state (w2, v2) is dominated if there exists (w1, v1) that is no heavier and at least as valuable. Dominated states can be removed without affecting optimality.
Two-Pointer Move Rule
if x_i + y_j < T then i ← i + 1 else j ← j − 1
Explanation: On sorted lists, if the current sum is too small, move the lower pointer up to increase it; if too large, move the upper pointer down to decrease it.
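As a concrete sketch of this rule, the code below (illustrative only; it assumes two already sorted arrays rather than any specific problem in this section) counts pairs (x, y) with x + y = T, consuming runs of duplicates in blocks so repeated values are counted correctly even when negatives are present.

#include <bits/stdc++.h>
using namespace std;

// Count pairs (x, y) with x from xs, y from ys, and x + y == T.
// Both arrays must be sorted; duplicate runs are counted in blocks.
long long count_pairs(const vector<long long>& xs,
                      const vector<long long>& ys, long long T) {
    long long cnt = 0;
    size_t i = 0;         // scans xs upward from the smallest element
    size_t j = ys.size(); // scans ys downward from the largest element
    while (i < xs.size() && j > 0) {
        long long s = xs[i] + ys[j - 1];
        if (s < T) ++i;      // sum too small: raise the lower side
        else if (s > T) --j; // sum too large: lower the upper side
        else {
            // Count the whole run of equal values on each side at once
            long long x = xs[i], y = ys[j - 1];
            long long rx = 0, ry = 0;
            while (i < xs.size() && xs[i] == x) { ++rx; ++i; }
            while (j > 0 && ys[j - 1] == y) { ++ry; --j; }
            cnt += rx * ry;
        }
    }
    return cnt;
}

int main() {
    vector<long long> xs = {-3, -3, 1, 4}, ys = {-1, 2, 2, 6};
    // Pairs summing to 3: two (-3, 6), two (1, 2), one (4, -1) -> 5
    cout << count_pairs(xs, ys, 3) << '\n';
    return 0;
}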
Complexity Analysis
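Enumerating all subsets of a half with m = n/2 items costs O(m · 2^m) when each mask's sum is rebuilt from scratch, as in the code below. Sorting one half costs O(2^{n/2} log 2^{n/2}) = O(n · 2^{n/2}), and each of the 2^{n/2} binary-search lookups costs O(n), so the overall time is O(n · 2^{n/2}) with O(2^{n/2}) memory. For 4-sum, building and sorting the n^2 pair sums costs O(n^2 log n) time and O(n^2) memory, and the n^2 equal_range queries add another O(n^2 log n).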
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Enumerate all subset sums of a vector in O(2^m)
static vector<long long> subset_sums(const vector<long long>& a) {
    int m = (int)a.size();
    vector<long long> sums;
    sums.reserve(1 << min(m, 22)); // reserve some space to reduce reallocations
    // Iterate masks from 0..(2^m - 1)
    for (int mask = 0; mask < (1 << m); ++mask) {
        long long s = 0;
        // Accumulate sum of elements where the bit is set
        for (int i = 0; i < m; ++i) {
            if (mask & (1 << i)) s += a[i];
        }
        sums.push_back(s);
    }
    return sums;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Input: n, T, followed by n numbers (can be negative as well)
    int n; long long T;
    if (!(cin >> n >> T)) return 0;
    vector<long long> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i];

    // Split into two halves L and R
    int mid = n / 2;
    vector<long long> L(a.begin(), a.begin() + mid);
    vector<long long> R(a.begin() + mid, a.end());

    // Enumerate subset sums of both halves
    vector<long long> SL = subset_sums(L);
    vector<long long> SR = subset_sums(R);

    // Sort SR for binary searches
    sort(SR.begin(), SR.end());

    long long best = LLONG_MIN; // best sum <= T

    // For each x in SL, find the largest y in SR with x + y <= T
    for (long long x : SL) {
        long long need = T - x;
        auto it = upper_bound(SR.begin(), SR.end(), need); // first element > need
        if (it == SR.begin()) continue; // no y with y <= need
        --it; // now *it <= need, so x + *it <= T
        best = max(best, x + *it);
    }

    if (best == LLONG_MIN) {
        cout << "No subset with sum <= T" << '\n';
    } else {
        cout << best << '\n';
    }

    return 0;
}
We split the array into two halves, compute all subset sums for each half, and sort the right sums. For each left sum x, we binary search the right sums for the largest y such that x + y ≤ T. The maximum over all pairs is the answer. This also handles negative numbers correctly because we only rely on sorting and upper_bound.
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Input format:
    // n, target
    // A[0..n-1]
    // B[0..n-1]
    // C[0..n-1]
    // D[0..n-1]
    int n; long long T;
    if (!(cin >> n >> T)) return 0;
    vector<long long> A(n), B(n), C(n), D(n);
    for (int i = 0; i < n; ++i) cin >> A[i];
    for (int i = 0; i < n; ++i) cin >> B[i];
    for (int i = 0; i < n; ++i) cin >> C[i];
    for (int i = 0; i < n; ++i) cin >> D[i];

    // Build all pair sums for A+B and C+D
    vector<long long> AB; AB.reserve((size_t)n * n);
    vector<long long> CD; CD.reserve((size_t)n * n);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            AB.push_back(A[i] + B[j]);
            CD.push_back(C[i] + D[j]);
        }
    }

    // Sort one side (CD) to perform binary searches / frequency counts
    sort(CD.begin(), CD.end());

    // For each s in AB, count how many t in CD have t == T - s using equal_range.
    // The count is kept in 64 bits since it can grow as large as n^4.
    long long ans = 0;
    for (long long s : AB) {
        long long need = T - s;
        auto range = equal_range(CD.begin(), CD.end(), need);
        ans += range.second - range.first; // add frequency of 'need'
    }

    cout << ans << '\n';
    return 0;
}
We precompute all n^2 pair sums from A+B and C+D. After sorting C+D, for each sum s in A+B we binary search for the count of values equal to T − s using equal_range. This exactly implements the convolution-style counting without hash maps and handles duplicates correctly.
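A hash-map variant achieves the same counting without sorting; the sketch below assumes the same input format as above and builds an unordered_map from each C+D sum to its frequency, giving expected O(1) lookups at the cost of higher memory overhead than a sorted vector.

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Same input format as the sorted-vector version: n, T, then A, B, C, D
    int n; long long T;
    if (!(cin >> n >> T)) return 0;
    vector<long long> A(n), B(n), C(n), D(n);
    for (auto* v : {&A, &B, &C, &D})
        for (int i = 0; i < n; ++i) cin >> (*v)[i];

    // Frequency map over all C+D pair sums
    unordered_map<long long, long long> freq;
    freq.reserve((size_t)n * n);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            ++freq[C[i] + D[j]];

    // For each A+B pair sum s, add the frequency of T - s
    long long ans = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            auto it = freq.find(T - (A[i] + B[j]));
            if (it != freq.end()) ans += it->second;
        }

    cout << ans << '\n';
    return 0;
}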
#include <bits/stdc++.h>
using namespace std;

struct Item { long long w, v; };

// Enumerate (weight, value) pairs for all subsets of items
static vector<pair<long long,long long>> enumerate_pairs(const vector<Item>& arr) {
    int m = (int)arr.size();
    vector<pair<long long,long long>> res;
    res.reserve(1 << min(m, 22));
    for (int mask = 0; mask < (1 << m); ++mask) {
        long long W = 0, V = 0;
        for (int i = 0; i < m; ++i) {
            if (mask & (1 << i)) { W += arr[i].w; V += arr[i].v; }
        }
        res.emplace_back(W, V);
    }
    return res;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Input: n, capacity W; then n lines of (weight, value)
    int n; long long Wcap;
    if (!(cin >> n >> Wcap)) return 0;
    vector<Item> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i].w >> a[i].v;

    int mid = n / 2;
    vector<Item> L(a.begin(), a.begin() + mid);
    vector<Item> R(a.begin() + mid, a.end());

    auto PL = enumerate_pairs(L); // all (w, v) on the left
    auto PR = enumerate_pairs(R); // all (w, v) on the right

    // Sort PR by weight, then prune dominated states to form a Pareto frontier
    sort(PR.begin(), PR.end()); // sorts by (w asc, v asc)

    vector<pair<long long,long long>> frontier;
    long long bestV = LLONG_MIN; // running maximum value seen so far
    for (auto [w, v] : PR) {
        if (w > Wcap) continue;  // unusable: alone it exceeds capacity
        if (v > bestV) {         // keep only states that beat every lighter state
            frontier.emplace_back(w, v);
            bestV = v;
        }
        // Otherwise dominated: heavier and not more valuable -> skip
    }

    // The frontier now has increasing weights and strictly increasing values.
    // Extract parallel arrays for binary search by remaining capacity.
    vector<long long> FW, FV;
    FW.reserve(frontier.size()); FV.reserve(frontier.size());
    for (auto &p : frontier) { FW.push_back(p.first); FV.push_back(p.second); }

    long long ans = 0;

    // For each left pair, find the best complement from the right frontier
    // within the remaining capacity
    for (auto [wL, vL] : PL) {
        if (wL > Wcap) continue; // this left subset cannot be used
        long long remain = Wcap - wL;
        // Find the right-most frontier weight <= remain
        auto it = upper_bound(FW.begin(), FW.end(), remain);
        if (it == FW.begin()) {
            ans = max(ans, vL); // no right subset fits; take the left alone
        } else {
            int idx = int(it - FW.begin()) - 1;
            ans = max(ans, vL + FV[idx]);
        }
    }

    cout << ans << '\n';
    return 0;
}
We split items into two halves and enumerate all (weight, value) pairs for each side. On the right side, we sort by weight and prune dominated states to maintain a Pareto frontier with strictly increasing values. For each left subset, we binary search the right frontier for the heaviest weight that fits within remaining capacity and combine values. This yields the optimal knapsack value for n ≤ 40 even when weights are large and classic DP is infeasible.