Small-to-Large Merging
Key Points
- Small-to-large merging is a technique where you always merge the smaller container into the larger one to guarantee low total work.
- Each element can move at most O(log n) times because the container it belongs to at least doubles in size every time it moves.
- This yields O(n log n) total time for a sequence of merges over n total elements, assuming O(1) average-time inserts (e.g., unordered_set).
- The technique powers DSU on tree (a.k.a. sack), where children’s data structures are merged into the parent to answer subtree queries.
- It works with sets, maps, frequency tables, multisets, and even custom containers that support inserts and size queries; a generic sketch follows this list.
- Typical uses include counting distinct values in subtrees, tracking frequency modes, and aggregating properties across merged components.
- Careful implementation details matter: choose the big container by current size, and never iterate the large container to insert into the small one.
- Amortized analysis, not per-operation bounds, explains why this is fast; worst-case per merge may be large, but total cost stays O(n log n).
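To make the container-agnostic claim concrete, here is a minimal sketch of the core rule as a generic C++ helper; the name mergeSmallToLarge and the small driver are illustrative, not part of any standard API.

#include <bits/stdc++.h>
using namespace std;

// Minimal sketch: always merge the smaller container into the larger one.
// Works for any container with size(), begin()/end(), insert(), and clear(),
// e.g., set<int>, unordered_set<int>, multiset<int>.
template <class Container>
void mergeSmallToLarge(Container& a, Container& b) {
    if (a.size() < b.size()) swap(a, b); // make a the larger container (O(1) swap)
    a.insert(b.begin(), b.end());        // move every element of the smaller one
    b.clear();                           // the union now lives in a; b is emptied
}

int main() {
    set<int> a = {1, 2, 3}, b = {3, 4};
    mergeSmallToLarge(a, b);
    cout << a.size() << '\n'; // prints 4 (elements 1, 2, 3, 4)
}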
Prerequisites
- Tree traversal (DFS) — DSU on tree relies on post-order processing of children before merging into the parent.
- Hash maps and ordered maps — Understanding insertion complexity and trade-offs between unordered_map and std::map is essential for time guarantees.
- Amortized analysis — The O(n log n) guarantee comes from amortized bounds using the doubling argument.
- Sets and multisets — Small-to-large merging applies to deduplication and frequency tracking containers.
- Pointers and dynamic allocation in C++ — The DSU on tree pattern often reuses a child’s container via pointers to avoid extra copying.
- Recursion and stack management — Implementations usually use recursive DFS; knowing limits helps avoid stack overflow on deep trees.
- Graph basics — Modeling trees as undirected graphs with adjacency lists is required for DSU on tree.
Detailed Explanation
01 Overview
Small-to-large merging is a powerful algorithmic pattern used when repeatedly merging collections such as sets, maps, or frequency tables. The rule is simple: whenever you merge two containers, always merge the smaller one into the larger one. This ensures that any individual element can only be moved a logarithmic number of times—each time it moves, the size of the container holding it at least doubles—leading to an overall O(n log n) time complexity for a sequence of merges on n elements. This idea underlies several competitive programming techniques, most famously DSU on tree (also called the sack technique), where you compute answers for every node’s subtree by merging children’s data into the parent, keeping the biggest child’s structure as the base. The approach is container-agnostic: it works with unordered_set/unordered_map (average O(1) inserts), std::set/std::map (O(log n) inserts, thus O(n log^2 n) overall), or custom structures, as long as you can measure size and insert elements. Small-to-large is particularly attractive when you need to maintain aggregates like distinct counts, frequency-based answers (e.g., mode, sum of colors achieving max frequency), or set unions across many merges. Its correctness is straightforward, and its efficiency is guaranteed by amortized analysis using the doubling argument.
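As a minimal sketch of a single merge step for frequency tables, assuming std::map for deterministic bounds (the helper name mergeFreq is illustrative):

#include <bits/stdc++.h>
using namespace std;

// Merge frequency table b into a, always iterating the smaller table.
// With std::map each insert costs O(log n), so the amortized O(n log n)
// element moves give O(n log^2 n) total across all merges.
void mergeFreq(map<int,int>& a, map<int,int>& b) {
    if (a.size() < b.size()) swap(a, b);      // iterate only the smaller table
    for (auto& [key, cnt] : b) a[key] += cnt; // counts add up on collision
    b.clear();
}

int main() {
    map<int,int> a = {{1, 2}, {2, 1}}, b = {{2, 3}, {5, 1}};
    mergeFreq(a, b);
    for (auto& [key, cnt] : a) cout << key << ":" << cnt << ' '; // 1:2 2:4 5:1
    cout << '\n';
}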
02 Intuition & Analogies
Imagine you have many piles of stones and you want to combine them into fewer piles. If you always dump the smaller pile into the larger one, each individual stone doesn’t get moved many times. Why? Because after each move, the pile it belongs to is at least twice as big as before. A single stone might move from a pile of size 1 to size 2, then to size 4, then to size 8, and so on—there are only about log2(n) such doublings before we reach n. This means every stone changes hands only a handful of times, even if you perform a large number of merges overall. Contrast this with merging arbitrarily: if you sometimes pour a large pile into a small one, you could end up moving many stones repeatedly, causing the total work to balloon. Small-to-large merging locks in the good case by design. In programming terms, the stone is a key/value in a map or an element of a set, and a pile is the container holding it. When you merge two containers, you iterate the smaller and insert its contents into the larger. Each element insertion is the equivalent of moving a stone. The key insight is not how expensive a single merge might look, but how many times any one element can be forced to move. The doubling intuition makes it clear: only logarithmically many times, which is why the total is fast.
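The doubling argument can even be checked empirically. The hypothetical experiment below merges 2^16 singleton piles small-into-large and counts how many times elements move; the total stays within the n·log2(n) bound.

#include <bits/stdc++.h>
using namespace std;

// Count element moves while merging n singleton piles into one pile.
int main() {
    const int n = 1 << 16; // 65536 "stones", one per pile
    vector<vector<int>> piles(n);
    for (int i = 0; i < n; ++i) piles[i] = {i};

    long long moves = 0;
    // Merge piles pairwise in rounds, always pouring the smaller into the larger.
    for (int step = 1; step < n; step *= 2) {
        for (int i = 0; i + step < n; i += 2 * step) {
            auto &a = piles[i], &b = piles[i + step];
            if (a.size() < b.size()) swap(a, b);
            moves += (long long)b.size(); // each element of b moves exactly once
            a.insert(a.end(), b.begin(), b.end());
            b.clear();
        }
    }
    // Here moves == (n/2)*log2(n) = 524288; the bound n*log2(n) = 1048576 holds.
    cout << "moves = " << moves << ", bound n*log2(n) = " << (long long)n * 16 << '\n';
}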
03 Formal Definition
Consider collections S_1, ..., S_k holding n elements in total, and any sequence of merge operations, each replacing two collections A and B with their union. Small-to-large merging implements every union by iterating the smaller collection and inserting its elements into the larger one: if |A| >= |B|, insert each element of B into A. When an element moves, it leaves a container of size |B| and lands in one of size |A| + |B| >= 2|B|, so the size of the container holding it at least doubles. Since container sizes cannot exceed n, each element moves at most log2(n) times, and the total number of element moves across the entire sequence is at most n·log2(n) = O(n log n). Multiplying by the per-insert cost gives O(n log n) expected total time with hash containers and O(n log^2 n) with balanced-tree containers.
04 When to Use
Use small-to-large merging when you must repeatedly merge sets or maps and you care about total throughput rather than per-merge worst cases. Classic scenarios include: (1) DSU on tree (sack) to compute subtree answers like the number of distinct colors, the mode frequency, or the sum of values achieving the mode, where each node’s container aggregates its children’s data. (2) Merging many buckets/groups in offline processing, such as combining frequency tables after union operations or collapsing communities in a graph while maintaining statistics. (3) Polynomial or histogram merges: adding sparse polynomials or histograms stored as maps where many merges occur in arbitrary orders. (4) Incremental clustering or divide-and-conquer routines where partial results need to be unified efficiently. Prefer small-to-large when: the number of merges is large; elements may be moved multiple times; and the container supports size comparison and insertion. If deterministic logarithmic bounds are essential and hash instability is a concern, choose balanced BST containers (std::map/set) and accept the O(n log^2 n) trade-off. For tree problems with per-node aggregation, this technique is often simpler than heavy-light decomposition or Mo’s algorithm and avoids complex index structures.
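As a concrete instance of scenario (3), sparse polynomials stored as exponent-to-coefficient maps can be added with the same size check; the helper name addPoly is illustrative.

#include <bits/stdc++.h>
using namespace std;

// Add sparse polynomial q into p; both map exponent -> coefficient.
void addPoly(map<int,long long>& p, map<int,long long>& q) {
    if (p.size() < q.size()) swap(p, q); // iterate only the smaller polynomial
    for (auto& [e, c] : q) {
        p[e] += c;
        if (p[e] == 0) p.erase(e); // keep the representation sparse
    }
    q.clear();
}

int main() {
    map<int,long long> p = {{0, 1}, {3, 2}};  // 1 + 2x^3
    map<int,long long> q = {{3, -2}, {5, 7}}; // -2x^3 + 7x^5
    addPoly(p, q);
    for (auto& [e, c] : p) cout << c << "x^" << e << ' '; // 1x^0 7x^5
    cout << '\n';
}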
⚠️ Common Mistakes
Common pitfalls include: (1) Merging in the wrong direction—iterating the large container and inserting into the small one negates the doubling argument and can lead to quadratic behavior. Always check sizes and swap so you iterate the smaller. (2) Forgetting to reuse the largest child’s container in DSU on tree. If you allocate a fresh container at each node and merge everything into it, you lose the benefit of keeping the big bag and pay extra copies. Instead, adopt the biggest child’s bag as the parent’s bag. (3) Using data structures with poor average-case inserts (e.g., unordered_map with pathological hashing). Mitigate by reserving capacity, using good hash functions (e.g., custom hash for 64-bit), or falling back to std::map/set for strict guarantees. (4) Double counting or failing to maintain aggregate answers (like mode sums) when frequencies change. Ensure you correctly update maxima and associated sums during each insertion. (5) Memory mismanagement when using pointers to containers in DSU on tree—don’t free the big child’s container you just adopted, and clear/delete only the truly small temporary containers after merging. (6) Deep recursion on large trees causing stack overflows. Consider iterative DFS or increase recursion limits where possible. (7) Assuming per-merge O(log n) without amortization: the guarantee is on total work; single merges can still be heavy if the small set is large at that step.
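For pitfall (3), a widely used remedy in competitive programming is a splitmix64-based hash with a per-run random offset; the sketch below shows this standard pattern (the struct name SafeHash is illustrative).

#include <bits/stdc++.h>
using namespace std;

// splitmix64-based hash: resists adversarial inputs that drive the default
// unordered_map hash into O(n)-per-operation collision chains.
struct SafeHash {
    static uint64_t splitmix64(uint64_t x) {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }
    size_t operator()(uint64_t x) const {
        // A random per-run offset defeats precomputed collision sets.
        static const uint64_t r = chrono::steady_clock::now().time_since_epoch().count();
        return (size_t)splitmix64(x + r);
    }
};

int main() {
    unordered_map<long long, int, SafeHash> freq;
    freq.reserve(1 << 16); // pre-size to reduce rehashing during merges
    freq[42]++;
    cout << freq[42] << '\n'; // prints 1
}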
Key Formulas
Doubling Bound
moves(x) ≤ log2(n)
Explanation: Every time an element moves, the size of its new container at least doubles, so it can move only logarithmically many times up to n. This is the core reason small-to-large merging is efficient.
Total Time with O(1) Inserts
T = O(n log n)
Explanation: If each insertion into the destination container is O(1) on average (e.g., unordered_map), the total time over all merges is proportional to n log n, where n is the total number of elements ever inserted.
Total Time with O(log n) Inserts
T = O(n log^2 n)
Explanation: If each insertion costs O(log n) (e.g., std::map/set), the extra log factor multiplies the amortized O(n log n) element moves, giving O(n log^2 n) total time.
Sum of Moves Bound
Σ_x moves(x) ≤ n·log2(n)
Explanation: Summing the per-element bound over all n elements yields the overall O(n log n) limit on element movements. This is a worst-case upper bound.
Container Size Growth
s_{k+1} ≥ 2·s_k
Explanation: The sequence of container sizes s_1, s_2, ... experienced by any element at least doubles with each move, bounding the number of moves by a logarithm in n. This formalizes the doubling argument.
DSU on Tree Recurrence
T(u) = Σ_{v ∈ children(u)} T(v) + Σ_{v ≠ big(u)} |bag(v)|
Explanation: The cost at node u is the cost of its children plus the cost to merge all small children’s bags into the big child’s bag. Summed over the tree, the merge term is O(n log n).
Complexity Analysis
Time: the doubling argument bounds the total number of element moves over any merge sequence by n·log2(n). With unordered_set/unordered_map (expected O(1) per insert), the total expected time is O(n log n); with std::set/std::map (O(log n) per insert), it is a deterministic O(n log^2 n). Space: O(n), since each element resides in exactly one container at a time and merged-out containers are cleared. These bounds are amortized totals, not per-operation guarantees: a single merge can still move up to n/2 elements.
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Merge set B into set A using small-to-large.
// After the call, B is cleared to free memory.
void mergeSets(unordered_set<int>& A, unordered_set<int>& B) {
    if (A.size() < B.size()) swap(A, B); // ensure A is the larger set
    for (int x : B) A.insert(x);         // insert all elements of the smaller set
    B.clear();                           // optional: free memory from B
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, q;
    // Input format:
    // n q
    // s1 a11 a12 ...
    // s2 a21 a22 ...
    // ...
    // Then q lines: u v (merge set v into set u, print |u|)
    if (!(cin >> n >> q)) return 0;
    vector<unordered_set<int>> S(n + 1);

    for (int i = 1; i <= n; ++i) {
        int sz; cin >> sz;
        S[i].reserve(sz * 2 + 1); // small optimization to reduce rehashing
        for (int j = 0; j < sz; ++j) {
            int x; cin >> x;
            S[i].insert(x);
        }
    }

    while (q--) {
        int u, v; cin >> u >> v;
        if (u == v) { // nothing to do
            cout << S[u].size() << '\n';
            continue;
        }
        mergeSets(S[u], S[v]);
        cout << S[u].size() << '\n';
    }

    return 0;
}
We maintain n sets and process q merge operations. For each merge, we always merge the smaller set into the larger one, ensuring each element is moved only O(log n) times over the whole sequence. Clearing the smaller set after merging is optional but frees memory. Using unordered_set yields expected O(1) inserts; the total expected time is O(n log n) across all merges.
#include <bits/stdc++.h>
using namespace std;

// Compute for each node u the number of distinct colors in its subtree.
// We use DSU on tree (sack): keep a pointer to a bag (unordered_map color->count) per node.

int n;
vector<int> color;
vector<vector<int>> adj;
vector<int> ans;
vector<unordered_map<int,int>*> bag; // bag[u] points to the map for subtree u

void dfs(int u, int p) {
    int big = -1; size_t best = 0;

    // Process children first
    for (int v : adj[u]) if (v != p) {
        dfs(v, u);
        if (bag[v] && bag[v]->size() > best) {
            best = bag[v]->size();
            big = v;
        }
    }

    if (big != -1) {
        bag[u] = bag[big];  // adopt the largest child's bag
        bag[big] = nullptr; // prevent accidental reuse
    } else {
        bag[u] = new unordered_map<int,int>();
        bag[u]->reserve(4);
    }

    // Merge all small children into u's bag
    for (int v : adj[u]) if (v != p && v != big) {
        if (!bag[v]) continue;
        auto &src = *bag[v];
        auto &dst = *bag[u];
        if (dst.bucket_count() < src.size() * 2) dst.reserve(dst.size() + src.size() * 2);
        for (auto &kv : src) dst[kv.first] += kv.second;
        delete bag[v]; bag[v] = nullptr; // free the merged bag
    }

    // Add u's own color
    (*bag[u])[color[u]]++;

    // Distinct colors equals number of keys in the map
    ans[u] = (int)bag[u]->size();
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    cin >> n;
    color.assign(n+1, 0);
    for (int i = 1; i <= n; ++i) cin >> color[i];
    adj.assign(n+1, {});
    for (int i = 0; i < n-1; ++i) {
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    ans.assign(n+1, 0);
    bag.assign(n+1, nullptr);

    dfs(1, 0);

    for (int i = 1; i <= n; ++i) cout << ans[i] << (i==n?'\n':' ');

    // Clean up remaining allocated bags (only the root's bag should remain non-null)
    for (int i = 1; i <= n; ++i) if (bag[i]) { delete bag[i]; bag[i] = nullptr; }

    return 0;
}
We compute distinct colors in each node’s subtree by maintaining, for each node, a frequency map keyed by color. We choose the largest child’s bag as the base for the parent and small-to-large merge all other children’s bags into it, then add the parent’s own color. The number of distinct colors equals the number of keys. The doubling argument guarantees O(n log n) expected time with unordered_map.
#include <bits/stdc++.h>
using namespace std;

// For each node u, compute the sum of colors that appear with maximal frequency in u's subtree.
// This is a classic "Lomsat Gelral"-style problem solvable via small-to-large merging.

int n;
vector<int> color;
vector<vector<int>> adj;

vector<unordered_map<int,int>*> bag; // color -> count for subtree
vector<int> mxFreq;                  // maximal frequency for subtree
vector<long long> sumWithMx;         // sum of colors achieving mxFreq

void dfs(int u, int p) {
    int big = -1; size_t best = 0;

    for (int v : adj[u]) if (v != p) {
        dfs(v, u);
        if (bag[v] && bag[v]->size() > best) { best = bag[v]->size(); big = v; }
    }

    if (big != -1) {
        bag[u] = bag[big]; bag[big] = nullptr;
        mxFreq[u] = mxFreq[big];
        sumWithMx[u] = sumWithMx[big];
    } else {
        bag[u] = new unordered_map<int,int>();
        bag[u]->reserve(4);
        mxFreq[u] = 0;
        sumWithMx[u] = 0;
    }

    // Merge other children into u's bag and update mode info incrementally
    for (int v : adj[u]) if (v != p && v != big) {
        if (!bag[v]) continue;
        auto &src = *bag[v];
        auto &dst = *bag[u];
        if (dst.bucket_count() < src.size() * 2) dst.reserve(dst.size() + src.size() * 2);
        for (auto &kv : src) {
            int c = kv.first;
            int add = kv.second;
            int after = (dst[c] += add);
            if (after > mxFreq[u]) { mxFreq[u] = after; sumWithMx[u] = c; }
            else if (after == mxFreq[u]) { sumWithMx[u] += c; }
        }
        delete bag[v]; bag[v] = nullptr;
    }

    // Add u's own color and update mode info
    int after = (++(*bag[u])[color[u]]);
    if (after > mxFreq[u]) { mxFreq[u] = after; sumWithMx[u] = color[u]; }
    else if (after == mxFreq[u]) { sumWithMx[u] += color[u]; }
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    cin >> n;
    color.assign(n+1, 0);
    for (int i = 1; i <= n; ++i) cin >> color[i];
    adj.assign(n+1, {});
    for (int i = 0; i < n-1; ++i) {
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    bag.assign(n+1, nullptr);
    mxFreq.assign(n+1, 0);
    sumWithMx.assign(n+1, 0);

    dfs(1, 0);

    for (int i = 1; i <= n; ++i) cout << sumWithMx[i] << (i==n?'\n':' ');

    for (int i = 1; i <= n; ++i) if (bag[i]) { delete bag[i]; bag[i] = nullptr; }
    return 0;
}
We maintain, for each subtree, a color frequency map plus two aggregates: mxFreq (the maximum frequency) and sumWithMx (the sum of colors that achieve that frequency). We adopt the largest child’s bag and merge all other children’s entries into it, updating the aggregates incrementally as counts increase. Finally, we add the current node’s color. Small-to-large ensures that each map entry is moved only O(log n) times across the whole tree, giving O(n log n) expected total time with unordered_map.