⚙️ Algorithm · Intermediate

Meet in the Middle

Key Points

  • Meet-in-the-middle splits a hard exponential search into two halves, enumerates each half, and then combines results efficiently.
  • This technique typically reduces time from O(2^n) to roughly O(2^{n/2} log 2^{n/2}) while using O(2^{n/2}) memory.
  • Classic uses include subset sum, small-n knapsack, and k-sum problems such as 4-sum via pair sums.
  • The core idea is to precompute all possibilities of each half, then search for complements using sorting, two-pointers, or hashing.
  • Memory is the main constraint because you must store all partial results from at least one half.
  • Careful handling of duplicates, negative numbers, and overflow is essential for correctness.
  • Pruning dominated states (e.g., in knapsack) keeps the stored frontier small and speeds up queries.
  • It shines when n is around 30–50, weights are large, or DP over values/weights is infeasible.

Prerequisites

  • Bitmasks and subset enumeration: Enumerating all subsets of a half uses bitmask loops over 2^{m} states.
  • Sorting and binary search: Combining halves efficiently requires sorted arrays and upper_bound/lower_bound.
  • Two-pointer technique: An alternative to binary search when both halves are sorted and the predicate is monotone.
  • Hash maps: Counting complements can be done via frequency maps with expected O(1) lookups.
  • Asymptotic notation: Understanding the O(2^{n/2}) vs O(2^n) improvement and trade-offs.
  • Knapsack problem basics: MitM knapsack uses Pareto frontiers and dominance pruning on (weight, value) states.
  • Integer overflow handling: Subset and pair sums may exceed the 32-bit range; using 64-bit arithmetic is critical.

Detailed Explanation


01 Overview

Meet-in-the-middle (MitM) is a design pattern for algorithms that face a combinatorial explosion when exploring all possibilities. Instead of brute-forcing all 2^n subsets or all n^k k-tuples directly, we split the input (or decision sequence) into two halves, enumerate all possibilities of each half separately, and then combine the two precomputed sets to build full solutions. Because each half has about n/2 elements, each enumeration has roughly 2^{n/2} possibilities instead of 2^n. After generating the partial results, we use efficient lookups—typically sorting plus binary search/two-pointers, or hashing—to find complementary pairs that satisfy the overall condition (e.g., sums up to a target). This pattern turns an otherwise intractable brute force into something feasible for moderate n while keeping code simple and robust. The trade-off is memory: we store O(2^{n/2}) partial results from at least one half, which can still be large but is far smaller than O(2^n). MitM is especially powerful for subset sum variants, small-n knapsack with large weights, and k-sum problems (like 4-sum) where we can precompute partial sums/products on each side and match them in near-linear time over the precomputed arrays.

02 Intuition & Analogies

Imagine you need to check a receipt for a party: you purchased items from two stores, and your total budget is T. Instead of trying every combination of all items at once, you first write down every possible subtotal from store A and every possible subtotal from store B. Then, for each subtotal x from A, you look for a complement y from B so that x + y is as close as possible to T (or exactly T). It’s much easier to match two smaller lists than to consider every combination of all items together. Another analogy: suppose you’re trying to unlock a safe with a 6-dial code. Brute force is 10^6 possibilities. But if you had a gadget that could precompute all effects of the first 3 dials (1,000 states) and all effects of the last 3 dials (1,000 states), you could then meet in the middle by matching the halves that cancel out to the correct final state. This reduces the hard search to two smaller enumerations and a fast merge step. In programming contests, this manifests as splitting an array into two halves, enumerating all subset sums for each half, sorting one list, and for each element in the other list, using binary search/two-pointers or a hash map to find complements. The power comes from the exponential shrinking: 2^{n/2} is the square root of 2^n, which is an enormous savings even for n around 40.

03 Formal Definition

Meet-in-the-middle is an algorithmic paradigm where a problem on n variables (or items) is decomposed into two subproblems on roughly n/2 variables each. Let L be the set of all partial results from the left half and R the set from the right half. We construct L and R in O(2^{n/2}) time and memory per side (for subset-like enumeration). The solution set is then obtained by combining pairs (x, y), x ∈ L and y ∈ R, that satisfy a global predicate P(x, y). The combination step typically uses: (1) sorting R and performing binary searches for complements of elements in L, (2) two-pointer merging if both L and R are sorted and the predicate is monotone (e.g., sums), or (3) hashing R into a frequency map for O(1) average-time queries. For subset sum with target T, we generate all subset sums of L and R, sort R, then for each x in L, search for y ∈ R with x + y = T (exact) or x + y ≤ T (bounded). Asymptotically, time is O(2^{n/2} log 2^{n/2}) for the sort plus O(2^{n/2} log 2^{n/2}) for the queries, and memory is O(2^{n/2}). In knapsack variants, we enumerate (weight, value) pairs per half and prune dominated points to a Pareto frontier before combining with binary search over capacities.
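As a minimal sketch of the combination step under the exact-match predicate (assuming SL and SR already hold each half's subset sums; count_exact_pairs is a hypothetical helper, not part of the examples below), sorting plus equal_range counts the pairs with x + y = T:

#include <bits/stdc++.h>
using namespace std;

// Minimal sketch of the MitM combine step for exact subset sum:
// count pairs (x, y), x from SL and y from SR, with x + y == T.
// SL and SR are assumed to already contain all subset sums of each half.
long long count_exact_pairs(const vector<long long>& SL, vector<long long> SR, long long T) {
    sort(SR.begin(), SR.end());                                 // sort one side once
    long long count = 0;
    for (long long x : SL) {
        auto range = equal_range(SR.begin(), SR.end(), T - x);  // all y == T - x
        count += range.second - range.first;                    // duplicates counted correctly
    }
    return count;
}

int main() {
    vector<long long> SL = {0, 2, 5, 7};    // hypothetical left-half subset sums
    vector<long long> SR = {0, 3, 8, 11};   // hypothetical right-half subset sums
    cout << count_exact_pairs(SL, SR, 10) << '\n'; // pairs summing to 10: (2,8), (7,3) -> prints 2
    return 0;
}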

04 When to Use

Use meet-in-the-middle when n is too large for full 2^n enumeration but small enough that 2^{n/2} is manageable (e.g., n ≈ 30–46). It is ideal when: (1) You need subset sums or knapsack with large weights where DP over weight or value is infeasible; (2) You face a k-sum problem and can reduce it to pair sums—4-sum becomes O(n^2) by building all pair sums and matching complements; (3) The constraints are separable across a partition, allowing independent enumeration of halves whose results can be merged via a simple predicate (sum, XOR, equality); (4) You can prune partial results using dominance (knapsack) to keep memory smaller; (5) You require exact counts of solutions (e.g., number of quadruples summing to T) where frequency maps over pair sums provide a clean counting mechanism. Avoid it when memory for O(2^{n/2}) items is still too large, when inputs are huge and hashing/sorting becomes a bottleneck, or when the problem is not easily decomposable into independent halves with a simple merge condition.

⚠️ Common Mistakes

• Forgetting about memory: 2^{20} ≈ 1,048,576 items already take ~8 MB if stored as 64-bit integers; keeping multiple vectors or maps can multiply this. Estimate memory before coding.
• Mishandling duplicates: when counting solutions (e.g., 4-sum), pair sums can repeat. Use equal_range in sorted arrays or a hash map with frequencies to count all occurrences correctly.
• Incorrect two-pointer use with negatives: two-pointer merging requires both arrays sorted and a monotone predicate (like sums). With negatives present, two-pointer still works, but you must move pointers correctly; otherwise, you’ll skip valid pairs.
• Integer overflow: subset sums or pair sums can exceed 32-bit limits. Use 64-bit (long long) or even wider types where available.
• Not pruning dominated states in knapsack: if you keep all (weight, value) pairs, the second-half vector may explode. Sort by weight and keep a non-decreasing maximum-value prefix to discard dominated entries.
• Off-by-one in binary searches: when searching for the largest y such that x + y ≤ T, use upper_bound(T - x) - 1 and guard for begin/end conditions (see the sketch after this list).
• Splitting poorly: uneven halves (e.g., 5 and 35) inflate one side’s enumeration and hurt performance. Aim for balanced halves whenever possible.
• Costly recomputation: reuse precomputed arrays across multiple queries if the input is static.
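A small hedged sketch of the guarded bounded lookup mentioned above (best_leq is a hypothetical helper name): it returns the largest element of a sorted vector that does not exceed a limit, or reports that none exists instead of stepping before begin().

#include <bits/stdc++.h>
using namespace std;

// Largest element of sorted vector 'v' that is <= limit.
// Returns false if no such element exists (guards the it == begin() case).
bool best_leq(const vector<long long>& v, long long limit, long long& out) {
    auto it = upper_bound(v.begin(), v.end(), limit); // first element > limit
    if (it == v.begin()) return false;                // every element exceeds limit
    out = *prev(it);                                  // largest element <= limit
    return true;
}

int main() {
    vector<long long> SR = {-4, 0, 3, 9, 15};          // already sorted
    long long y;
    if (best_leq(SR, 10, y)) cout << y << '\n';        // prints 9
    if (!best_leq(SR, -10, y)) cout << "none" << '\n'; // prints none
    return 0;
}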

Key Formulas

Exponential Split

2^n = 2^{n/2} · 2^{n/2}

Explanation: The full 2^n search factors into two halves of size 2^{n/2} each. Enumerating both halves and combining pairs covers the whole search space.

Typical MitM Time

O(2^{n/2} log 2^{n/2}) = O(2^{n/2} · n)

Explanation: Time is dominated by sorting one half and performing binary searches for each element of the other half. The logs simplify to O(n) because log 2^{n/2} = n/2 = O(n).

Memory Footprint

O(2^{n/2})

Explanation: You must store all partial results from at least one half. Memory grows exponentially with n/2, which is the main practical limitation.

Subset Count

number of subsets of m items = 2^m

Explanation: The number of subsets of m items is 2^m. This underlies the enumeration size for each half when computing all subset sums.

Bounded Best Sum

best = max{ x + y : x ∈ L, y ∈ R, x + y ≤ T }

Explanation: This expression captures the target of maximizing the sum not exceeding T by pairing elements from the two halves.

4-Sum via Pair Sums

count(T) = Σ_s f_AB(s) · f_CD(T − s)

Explanation: The number of quadruples summing to T equals the convolution of the frequencies of sums from A+B and C+D. We implement this with sorting + equal_range or a hash map.

Dominance in Knapsack

(w1, v1) dominates (w2, v2)  ⇔  w1 ≤ w2 and v1 ≥ v2

Explanation: A state (w2, v2) is dominated if there exists (w1, v1) that is no heavier and at least as valuable. Dominated states can be removed without affecting optimality.

Two-Pointer Move Rule

if L[i] + R[j] < T: i ← i + 1;  if L[i] + R[j] > T: j ← j − 1

Explanation: On sorted lists, if the current sum is too small, move the lower pointer up to increase it; if too large, move the upper pointer down to decrease it.
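The code examples below rely on binary search; as a hedged sketch of the two-pointer alternative described above (best_sum_leq is a hypothetical helper), the following finds the best x + y ≤ T over two sorted subset-sum lists in a single linear sweep:

#include <bits/stdc++.h>
using namespace std;

// Two-pointer variant of the combine step: given both halves' subset sums,
// find the maximum x + y that does not exceed T. Returns LLONG_MIN if no pair fits.
long long best_sum_leq(vector<long long> SL, vector<long long> SR, long long T) {
    sort(SL.begin(), SL.end());
    sort(SR.begin(), SR.end());
    long long best = LLONG_MIN;
    int i = 0;                      // walks SL from the smallest sum upward
    int j = (int)SR.size() - 1;     // walks SR from the largest sum downward
    while (i < (int)SL.size() && j >= 0) {
        long long s = SL[i] + SR[j];
        if (s <= T) {               // feasible: record it, then try a larger SL element
            best = max(best, s);
            ++i;
        } else {                    // too large: move the SR pointer down
            --j;
        }
    }
    return best;
}

int main() {
    vector<long long> SL = {0, 2, 5, 7};   // hypothetical left-half subset sums
    vector<long long> SR = {0, 3, 8, 11};  // hypothetical right-half subset sums
    cout << best_sum_leq(SL, SR, 10) << '\n'; // prints 10 (2 + 8)
    return 0;
}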

Complexity Analysis

The hallmark of meet-in-the-middle is cutting an exponential search into two smaller exponentials. For subset-style problems on n items, each half has n/2 items, generating 2^{n/2} partial results. Building both arrays is O(2^{n/2}) time. If we sort one side (say R), sorting costs O(2^{n/2} log 2^{n/2}). Then for each element x in the other side (L), we either binary search in O(log 2^{n/2}) or conduct a two-pointer linear merge across the entire arrays in O(2^{n/2}). Overall, common patterns yield O(2^{n/2} log 2^{n/2}) or O(2^{n/2}) post-sorting merge time, plus the initial enumeration. Memory is O(2^{n/2}) to store at least one half’s results; in counting problems using hash maps, the constant factors may be higher than with vectors. For 4-sum on one array of size n (or for pair sums across two arrays of size n), the dominant cost is forming all O(n^2) pair sums and sorting or hashing them; this is O(n^2 log n) with sorting or expected O(n^2) with hashing. In knapsack with n ≤ 40 and large weights, enumerating (weight, value) pairs per half is O(2^{n/2}); after sorting the second half by weight and pruning dominated points, we can binary search for each first-half state, giving O(2^{n/2} log 2^{n/2}) query time. Space is O(2^{n/2}) for storing either sums or Pareto frontiers. Practically, 2^{20} ≈ 10^6 elements (about 8 MB for 64-bit integers) is manageable, while 2^{22}–2^{24} can be borderline depending on environment and extra structures. The net improvement from O(2^n) to roughly O(2^{n/2} n) is what makes MitM viable for contest constraints.
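As a rough worked example of these magnitudes (assuming n = 40 and 8-byte sums):

2^40 ≈ 1.1 × 10^12 states for the full search
2 · 2^20 ≈ 2.1 × 10^6 enumerated half-states
2^20 · 8 bytes ≈ 8.4 MB to store one half’s sums
2^20 · log2(2^20) = 2^20 · 20 ≈ 2.1 × 10^7 comparison steps for sorting and searching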

Code Examples

Subset Sum (maximize sum ≤ T) using Meet-in-the-Middle
#include <bits/stdc++.h>
using namespace std;

// Enumerate all subset sums of a vector in O(2^m)
static vector<long long> subset_sums(const vector<long long>& a) {
    int m = (int)a.size();
    vector<long long> sums;
    sums.reserve(1 << min(m, 22)); // reserve some to reduce reallocs
    // Iterate masks from 0..(2^m - 1)
    for (int mask = 0; mask < (1 << m); ++mask) {
        long long s = 0;
        // Accumulate sum of elements where bit is set
        for (int i = 0; i < m; ++i) {
            if (mask & (1 << i)) s += a[i];
        }
        sums.push_back(s);
    }
    return sums;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Input: n, T, followed by n numbers (can be negative as well)
    int n; long long T;
    if (!(cin >> n >> T)) return 0;
    vector<long long> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i];

    // Split into two halves L and R
    int mid = n / 2;
    vector<long long> L(a.begin(), a.begin() + mid);
    vector<long long> R(a.begin() + mid, a.end());

    // Enumerate subset sums of both halves
    vector<long long> SL = subset_sums(L);
    vector<long long> SR = subset_sums(R);

    // Sort SR for binary searches
    sort(SR.begin(), SR.end());

    long long best = LLONG_MIN; // best sum <= T

    // For each x in SL, find largest y in SR with x + y <= T
    for (long long x : SL) {
        long long need = T - x;
        auto it = upper_bound(SR.begin(), SR.end(), need); // first > need
        if (it == SR.begin()) {
            // No y with y <= need
            continue;
        }
        --it; // now *it <= need
        long long cand = x + *it;
        if (cand <= T) best = max(best, cand);
    }

    if (best == LLONG_MIN) {
        cout << "No subset with sum <= T" << '\n';
    } else {
        cout << best << '\n';
    }

    return 0;
}

We split the array into two halves, compute all subset sums for each half, and sort the right sums. For each left sum x, we binary search the right sums for the largest y such that x + y ≤ T. The maximum over all pairs is the answer. This also handles negative numbers correctly because we only rely on sorting and upper_bound.
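For example, with the hypothetical input

4 15
5 2 8 9

the left half yields sums {0, 5, 2, 7} and the right half {0, 8, 9, 17}; the best pair is 7 + 8 = 15, so the program prints 15.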

Time: O(2^{n/2} n + 2^{n/2} log 2^{n/2})
Space: O(2^{n/2})

Counting 4-Sum Tuples from Four Arrays (A+B vs. C+D)
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Input format:
    // n, target
    // A[0..n-1]
    // B[0..n-1]
    // C[0..n-1]
    // D[0..n-1]
    int n; long long T;
    if (!(cin >> n >> T)) return 0;
    vector<long long> A(n), B(n), C(n), D(n);
    for (int i = 0; i < n; ++i) cin >> A[i];
    for (int i = 0; i < n; ++i) cin >> B[i];
    for (int i = 0; i < n; ++i) cin >> C[i];
    for (int i = 0; i < n; ++i) cin >> D[i];

    // Build all pair sums for A+B and C+D
    vector<long long> AB; AB.reserve(1LL * n * n);
    vector<long long> CD; CD.reserve(1LL * n * n);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            AB.push_back(A[i] + B[j]);
            CD.push_back(C[i] + D[j]);
        }
    }

    // Sort one side (CD) to perform binary searches / frequency counts
    sort(CD.begin(), CD.end());

    // For each s in AB, count how many t in CD have t = T - s using equal_range
    // ('ans' is 64-bit because the number of quadruples can exceed the 32-bit range)
    long long ans = 0;
    for (long long s : AB) {
        long long need = T - s;
        auto range = equal_range(CD.begin(), CD.end(), need);
        ans += (range.second - range.first); // add frequency of 'need'
    }

    cout << ans << '\n';
    return 0;
}

We precompute all n^2 pair sums from A+B and C+D. After sorting C+D, for each sum s in A+B we binary search for the count of values equal to T − s using equal_range. This exactly implements the convolution-style counting without hash maps and handles duplicates correctly.
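An alternative combine step, mentioned in the prerequisites, uses a frequency map instead of sorting. This is a hedged sketch (count_quadruples is a hypothetical helper) that trades the O(n^2 log n) sort for expected O(n^2) lookups, at the cost of hash-map overhead:

#include <bits/stdc++.h>
using namespace std;

// Hash-map variant of the 4-sum count: store frequencies of all C+D sums,
// then, for each A+B sum s, add the frequency of T - s.
long long count_quadruples(const vector<long long>& A, const vector<long long>& B,
                           const vector<long long>& C, const vector<long long>& D,
                           long long T) {
    unordered_map<long long, long long> freqCD;
    freqCD.reserve(C.size() * D.size());
    for (long long c : C)
        for (long long d : D) ++freqCD[c + d];

    long long ans = 0;
    for (long long a : A)
        for (long long b : B) {
            auto it = freqCD.find(T - (a + b));
            if (it != freqCD.end()) ans += it->second;
        }
    return ans;
}

int main() {
    // Tiny hypothetical arrays: two quadruples sum to 0
    vector<long long> A = {1, 2}, B = {-2, -1}, C = {-1, 2}, D = {0, 2};
    cout << count_quadruples(A, B, C, D, 0) << '\n'; // prints 2
    return 0;
}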

Time: O(n^2 log n + n^2)
Space: O(n^2)

0/1 Knapsack (n ≤ 40) via Meet-in-the-Middle with Pareto Pruning
#include <bits/stdc++.h>
using namespace std;

struct Item { long long w, v; };

// Enumerate (weight, value) pairs for all subsets of items
static vector<pair<long long,long long>> enumerate_pairs(const vector<Item>& arr) {
    int m = (int)arr.size();
    vector<pair<long long,long long>> res;
    res.reserve(1 << min(m, 22));
    for (int mask = 0; mask < (1 << m); ++mask) {
        long long W = 0, V = 0;
        for (int i = 0; i < m; ++i) if (mask & (1 << i)) {
            W += arr[i].w;
            V += arr[i].v;
        }
        res.emplace_back(W, V);
    }
    return res;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Input: n, capacity W; then n lines of (weight, value)
    int n; long long Wcap;
    if (!(cin >> n >> Wcap)) return 0;
    vector<Item> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i].w >> a[i].v;

    int mid = n / 2;

    vector<Item> L(a.begin(), a.begin() + mid);
    vector<Item> R(a.begin() + mid, a.end());

    auto PL = enumerate_pairs(L); // all (w, v) on left
    auto PR = enumerate_pairs(R); // all (w, v) on right

    // Sort PR by weight, then prune dominated states to form a Pareto frontier
    sort(PR.begin(), PR.end()); // sort by (w asc, v asc)

    vector<pair<long long,long long>> frontier;
    long long bestV = LLONG_MIN; // track running max value
    for (auto [w, v] : PR) {
        if (w > Wcap) continue; // never usable if alone exceeds capacity
        if (frontier.empty()) {
            frontier.emplace_back(w, v);
            bestV = max(bestV, v);
        } else {
            // If current value improves the best seen so far, keep it
            if (v > bestV) {
                frontier.emplace_back(w, v);
                bestV = v;
            }
            // Else dominated: heavier (or equal) and not more valuable -> skip
        }
    }

    // Now frontier has non-decreasing weights and strictly increasing values
    // For binary search by remaining capacity, extract weights and their best values
    vector<long long> FW, FV; FW.reserve(frontier.size()); FV.reserve(frontier.size());
    for (auto &p : frontier) { FW.push_back(p.first); FV.push_back(p.second); }

    long long ans = 0;

    // For each left pair, find the best complement from the right frontier within remaining capacity
    for (auto [wL, vL] : PL) {
        if (wL > Wcap) continue; // cannot use this left subset
        long long remain = Wcap - wL;
        // Find the right-most weight <= remain
        auto it = upper_bound(FW.begin(), FW.end(), remain);
        if (it == FW.begin()) {
            // No right subset fits, take left alone
            ans = max(ans, vL);
        } else {
            int idx = int(it - FW.begin()) - 1;
            ans = max(ans, vL + FV[idx]);
        }
    }

    cout << ans << '\n';
    return 0;
}

We split items into two halves and enumerate all (weight, value) pairs for each side. On the right side, we sort by weight and prune dominated states to maintain a Pareto frontier with strictly increasing values. For each left subset, we binary search the right frontier for the heaviest weight that fits within remaining capacity and combine values. This yields the optimal knapsack value for n ≤ 40 even when weights are large and classic DP is infeasible.
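As a quick sanity check on a hypothetical input

4 10
6 30
3 14
4 16
2 9

the optimal choice is the items of weight 6 and 4 (value 30 + 16), so the program prints 46.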

Time: O(2^{n/2} + 2^{n/2} log 2^{n/2})
Space: O(2^{n/2})