⚙️ Algorithm · Intermediate

Small-to-Large Merging

Key Points

  • Small-to-large merging is a technique where you always merge the smaller container into the larger one to guarantee low total work.
  • Each element can move at most O(log n) times because the container it belongs to at least doubles in size every time it moves.
  • This yields O(n log n) total time for a sequence of merges over n total elements, assuming O(1) average-time inserts (e.g., unordered_set/unordered_map).
  • The technique powers DSU on tree (a.k.a. sack), where children’s data structures are merged into the parent to answer subtree queries.
  • It works with sets, maps, frequency tables, multisets, and even custom containers that support inserts and size queries.
  • Typical uses include counting distinct values in subtrees, tracking frequency modes, and aggregating properties across merged components.
  • Careful implementation details matter: choose the big container by current size, and never iterate the large container to insert into the small one.
  • Amortized analysis, not per-operation bounds, explains why this is fast; worst-case per merge may be large, but total cost stays O(n log n).

Prerequisites

  • Tree traversal (DFS): DSU on tree relies on post-order processing of children before merging into the parent.
  • Hash maps and ordered maps: understanding insertion complexity and the trade-offs between unordered_map and std::map is essential for time guarantees.
  • Amortized analysis: the O(n log n) guarantee comes from amortized bounds using the doubling argument.
  • Sets and multisets: small-to-large merging applies to deduplication and frequency-tracking containers.
  • Pointers and dynamic allocation in C++: the DSU on tree pattern often reuses a child’s container via pointers to avoid extra copying.
  • Recursion and stack management: implementations usually use recursive DFS; knowing its limits helps avoid stack overflow on deep trees.
  • Graph basics: modeling trees as undirected graphs with adjacency lists is required for DSU on tree.

Detailed Explanation


01 Overview

Small-to-large merging is a powerful algorithmic pattern used when repeatedly merging collections such as sets, maps, or frequency tables. The rule is simple: whenever you merge two containers, always merge the smaller one into the larger one. This ensures that any individual element can only be moved a logarithmic number of times—each time it moves, the size of the container holding it at least doubles—leading to an overall O(n log n) time complexity for a sequence of merges on n elements. This idea underlies several competitive programming techniques, most famously DSU on tree (also called the sack technique), where you compute answers for every node’s subtree by merging children’s data into the parent, keeping the biggest child’s structure as the base. The approach is container-agnostic: it works with unordered_set/unordered_map (average O(1) inserts), std::set/std::map (O(log n) inserts, thus O(n log^2 n) overall), or custom structures, as long as you can measure size and insert elements. Small-to-large is particularly attractive when you need to maintain aggregates like distinct counts, frequency-based answers (e.g., mode, sum of colors achieving max frequency), or set unions across many merges. Its correctness is straightforward, and its efficiency is guaranteed by amortized analysis using the doubling argument.
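
To make the rule concrete, a minimal generic helper could look like the following sketch (the function name smallToLargeMerge is ours, not a standard API; any set-like container with size(), a range insert, and clear() works):

#include <bits/stdc++.h>
using namespace std;

// Minimal sketch of the core rule: always iterate the smaller container
// and insert into the larger one; the destination keeps the merged result.
template <class Set>
void smallToLargeMerge(Set& into, Set& from) {
    if (into.size() < from.size()) swap(into, from); // keep the big one as the destination
    into.insert(from.begin(), from.end());           // move the few, not the many
    from.clear();                                    // the source is now empty
}

int main() {
    unordered_set<int> a = {1, 2, 3, 4}, b = {3, 5};
    smallToLargeMerge(a, b);  // iterates b (the smaller set), inserts into a
    cout << a.size() << '\n'; // prints 5: a is now {1, 2, 3, 4, 5}
}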

02 Intuition & Analogies

Imagine you have many piles of stones and you want to combine them into fewer piles. If you always dump the smaller pile into the larger one, each individual stone doesn’t get moved many times. Why? Because after each move, the pile it belongs to is at least twice as big as before. A single stone might move from a pile of size 1 to size 2, then to size 4, then to size 8, and so on—there are only about log2(n) such doublings before we reach n. This means every stone changes hands only a handful of times, even if you perform a large number of merges overall. Contrast this with merging arbitrarily: if you sometimes pour a large pile into a small one, you could end up moving many stones repeatedly, causing the total work to balloon. Small-to-large merging locks in the good case by design. In programming terms, the stone is a key/value in a map or an element of a set, and a pile is the container holding it. When you merge two containers, you iterate the smaller and insert its contents into the larger. Each element insertion is the equivalent of moving a stone. The key insight is not how expensive a single merge might look, but how many times any one element can be forced to move. The doubling intuition makes it clear: only logarithmically many times, which is why the total is fast.
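
The doubling argument is easy to check empirically. The following self-contained experiment (purely illustrative; the pairwise bracket scheme is an arbitrary choice of ours) merges 65536 singleton piles under the small-to-large rule and reports the maximum number of times any stone moved — it prints 16, exactly log2(65536):

#include <bits/stdc++.h>
using namespace std;

int main() {
    int n = 1 << 16;                 // 65536 stones, one per initial pile
    vector<vector<int>> pile(n);
    vector<int> moves(n, 0);         // how many times each stone changed piles
    for (int i = 0; i < n; ++i) pile[i] = {i};

    // Merge pile[i + step] into pile[i] pairwise, like a tournament bracket.
    for (int step = 1; step < n; step <<= 1) {
        for (int i = 0; i + step < n; i += 2 * step) {
            auto &a = pile[i], &b = pile[i + step];
            if (a.size() < b.size()) swap(a, b);               // small-to-large rule
            for (int stone : b) { a.push_back(stone); moves[stone]++; }
            b.clear();
        }
    }
    cout << "max moves: " << *max_element(moves.begin(), moves.end()) << '\n';
    // Prints "max moves: 16" — no stone ever moves more than log2(n) times.
}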

03 Formal Definition

Consider a family of containers C = {C_1, C_2, ..., C_k} with total element count n. A merge operation takes two indices i, j and combines C_j into C_i by inserting every element of C_j into C_i. The small-to-large policy enforces |C_i| ≥ |C_j| at the start of each merge; if not, swap roles before merging. Let an element x experience a sequence of container sizes s_0, s_1, ..., s_t as it is moved between containers due to merges. Because we always merge smaller into larger, we have s_{i+1} ≥ 2·s_i. Since s_0 ≥ 1 and s_t ≤ n, it follows that t ≤ ⌊log2 n⌋. Therefore, each element is copied/integrated at most O(log n) times. If each insertion is O(1) amortized (e.g., std::unordered_map/unordered_set under average-case hashing), then total time across all merges is O(n log n). When inserts are O(log n) (e.g., std::map/std::set), total time becomes O(n log^2 n). In DSU on tree, we keep at each node u a container B(u) aggregating its subtree information. After processing all children v of u, we choose the child with the largest |B(v)| to initialize B(u), then small-to-large merge all other children’s B(v) into B(u), and finally incorporate u’s own contribution. The same doubling argument bounds total merging cost over the entire tree.
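
The same chain of inequalities, written out as a worked derivation (a restatement of the argument above, in LaTeX):

% s_i = size of the container holding x just after its i-th move; s_0 >= 1.
\begin{aligned}
  s_{i+1} &\ge 2\,s_i                  && \text{(small is merged into large)} \\
  s_t     &\ge 2^{t} s_0 \ge 2^{t}     && \text{(unrolling the recurrence)} \\
  2^{t}   &\le s_t \le n \;\Longrightarrow\; t \le \lfloor \log_2 n \rfloor
\end{aligned}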

04 When to Use

Use small-to-large merging when you must repeatedly merge sets or maps and you care about total throughput rather than per-merge worst cases. Classic scenarios include: (1) DSU on tree (sack) to compute subtree answers like the number of distinct colors, the mode frequency, or the sum of values achieving the mode, where each node’s container aggregates its children’s data. (2) Merging many buckets/groups in offline processing, such as combining frequency tables after union operations or collapsing communities in a graph while maintaining statistics. (3) Polynomial or histogram merges: adding sparse polynomials or histograms stored as maps where many merges occur in arbitrary orders. (4) Incremental clustering or divide-and-conquer routines where partial results need to be unified efficiently. Prefer small-to-large when: the number of merges is large; elements may be moved multiple times; and the container supports size comparison and insertion. If deterministic logarithmic bounds are essential and hash instability is a concern, choose balanced BST containers (std::map/set) and accept the O(n log^2 n) trade-off. For tree problems with per-node aggregation, this technique is often simpler than heavy-light decomposition or Mo’s algorithm and avoids complex index structures.
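
As an illustration of scenario (3), here is a sketch of merging sparse histograms stored as maps (the alias Hist and the function mergeHist are illustrative names of ours, not from a library); note that counts are summed rather than deduplicated:

#include <bits/stdc++.h>
using namespace std;

using Hist = unordered_map<int, long long>; // value -> count

// Fold the smaller histogram into the larger one; the result ends up in `into`.
void mergeHist(Hist& into, Hist& from) {
    if (into.size() < from.size()) swap(into, from); // small-to-large rule
    for (auto& [value, count] : from) into[value] += count;
    from.clear();
}

int main() {
    Hist a = {{1, 3}, {2, 1}};
    Hist b = {{2, 2}, {5, 7}};
    mergeHist(a, b);
    for (auto& [value, count] : a) cout << value << " -> " << count << '\n';
    // Expected (in some order): 1 -> 3, 2 -> 3, 5 -> 7
}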

⚠️ Common Mistakes

Common pitfalls include: (1) Merging in the wrong direction—iterating the large container and inserting into the small one negates the doubling argument and can lead to quadratic behavior. Always check sizes and swap so you iterate the smaller. (2) Forgetting to reuse the largest child’s container in DSU on tree. If you allocate a fresh container at each node and merge everything into it, you lose the benefit of keeping the big bag and pay extra copies. Instead, adopt the biggest child’s bag as the parent’s bag. (3) Using data structures with poor average-case inserts (e.g., unordered_map with pathological hashing). Mitigate by reserving capacity, using good hash functions (e.g., custom hash for 64-bit), or falling back to std::map/set for strict guarantees. (4) Double counting or failing to maintain aggregate answers (like mode sums) when frequencies change. Ensure you correctly update maxima and associated sums during each insertion. (5) Memory mismanagement when using pointers to containers in DSU on tree—don’t free the big child’s container you just adopted, and clear/delete only the truly small temporary containers after merging. (6) Deep recursion on large trees causing stack overflows. Consider iterative DFS or increase recursion limits where possible. (7) Assuming per-merge O(log n) without amortization: the guarantee is on total work; single merges can still be heavy if the small set is large at that step.
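
For pitfall (3), a common mitigation is an unordered container with a salted splitmix64-style hash (a widely used competitive-programming pattern; the struct name here is ours) plus an upfront reserve. A minimal sketch:

#include <bits/stdc++.h>
using namespace std;

// Randomized hash so adversarial inputs cannot force pathological collisions.
struct SplitMix64Hash {
    static uint64_t splitmix64(uint64_t x) {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }
    size_t operator()(uint64_t x) const {
        // Per-run random salt: the hash differs between executions.
        static const uint64_t salt =
            chrono::steady_clock::now().time_since_epoch().count();
        return splitmix64(x + salt);
    }
};

int main() {
    unordered_map<long long, int, SplitMix64Hash> freq;
    freq.reserve(1 << 12); // reserving capacity also reduces rehashing cost
    freq[42]++;
    cout << freq[42] << '\n'; // prints 1
}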

Key Formulas

Doubling Bound

s_{i+1} ≥ 2·s_i  ⟹  t ≤ ⌊log2 n⌋

Explanation: Every time an element moves, the size of its new container at least doubles, so it can move only logarithmically many times up to n. This is the core reason small-to-large merging is efficient.

Total Time with O(1) Inserts

T_total = O(n log n)

Explanation: If each insertion into the destination container is O(1) on average (e.g., unordered_set/unordered_map), the total time over all merges is proportional to n log n, where n is the total number of elements ever inserted.

Total Time with O(log n) Inserts

T_total = O(n log^2 n)

Explanation: If each insertion costs O(log n) (e.g., std::map/set), the extra log factor multiplies the amortized O(n log n) element moves, giving O(n log^2 n) total time.

Sum of Moves Bound

Σ_x moves(x) ≤ n·⌊log2 n⌋ = O(n log n)

Explanation: Summing the per-element bound over all n elements yields the overall O(n log n) limit on element movements. This is a worst-case upper bound.

Container Size Growth

s_t ≥ 2^t · s_0 ≥ 2^t

Explanation: The sequence of container sizes experienced by any element at least doubles each move, bounding the number of moves by a logarithm in n. This formalizes the doubling argument.

DSU on Tree Recurrence

T(u) = Σ_{v child of u} T(v) + O(Σ_{v ≠ big(u)} |B(v)|)

Explanation: The cost at node u is the cost of its children plus the cost to merge all small children’s bags into the big child’s bag. Summed over the tree, the merge term is O(n log n).

Complexity Analysis

The key complexity fact is amortized: when we always merge the smaller container into the larger one, each element changes containers at most O(log n) times. Specifically, if an element is moved out of a container of size s, the destination has size at least s (since we merge small into big), so immediately after the merge the element sits in a container of size at least 2s; this doubling can happen only O(log n) times before the container reaches size n. Summing over all elements yields O(n log n) total insertions. If the underlying container supports average O(1) insertion (e.g., std::unordered_map/unordered_set with good hashing), the overall time across a sequence of merges is O(n log n). If the container has O(log n) insertion time (e.g., std::map or std::set), the total is O(n log^2 n). For DSU on tree, we perform a DFS; at each node, we adopt the largest child’s bag and small-to-large merge all other children’s bags. The sizes of bags across nodes collectively satisfy the same doubling argument, leading to O(n log n) expected time when using hash maps. Space complexity is O(n) for storing the tree plus the total size of all elements stored across bags; because each element exists in exactly one bag at any time and we reuse the big bag, the total auxiliary memory for bags is O(n). Additional overhead includes the recursion stack O(h), where h is the tree height, and temporary pointers/containers proportional to the number of nodes. In practice, rehashing can add hidden costs; reserving capacity for large bags mitigates this.
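
One practical note on the recursion-stack term: for very deep trees, the recursive DFS used in the examples below can be replaced by an explicit children-before-parents ordering computed iteratively. A minimal sketch, with an illustrative helper name of ours:

#include <bits/stdc++.h>
using namespace std;

// Compute an order in which every child appears before its parent, without
// recursion: take a stack-based preorder and reverse it.
vector<int> childrenFirstOrder(int root, const vector<vector<int>>& adj) {
    vector<int> order, parent(adj.size(), -1), stk = {root};
    while (!stk.empty()) {
        int u = stk.back(); stk.pop_back();
        order.push_back(u);
        for (int v : adj[u])
            if (v != parent[u]) { parent[v] = u; stk.push_back(v); }
    }
    reverse(order.begin(), order.end()); // reversed preorder: children precede parents
    return order;
}

Processing nodes in this order and merging each child’s bag into its parent reproduces the recursive version’s behavior with O(n) auxiliary memory instead of O(h) native stack.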

Code Examples

Generic small-to-large merging of sets with online merges
#include <bits/stdc++.h>
using namespace std;

// Merge set B into set A using small-to-large.
// After the call, B is cleared to free memory.
void mergeSets(unordered_set<int>& A, unordered_set<int>& B) {
    if (A.size() < B.size()) swap(A, B); // ensure A is the larger set
    for (int x : B) A.insert(x);         // insert all elements of the smaller set
    B.clear();                           // optional: free memory from B
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, q;
    // Input format:
    // n q
    // s1 a11 a12 ...
    // s2 a21 a22 ...
    // ...
    // Then q lines: u v (merge set v into set u, print |u|)
    if (!(cin >> n >> q)) return 0;
    vector<unordered_set<int>> S(n + 1);

    for (int i = 1; i <= n; ++i) {
        int sz; cin >> sz;
        S[i].reserve(sz * 2 + 1); // small optimization to reduce rehashing
        for (int j = 0; j < sz; ++j) {
            int x; cin >> x;
            S[i].insert(x);
        }
    }

    while (q--) {
        int u, v; cin >> u >> v;
        if (u == v) { // nothing to do
            cout << S[u].size() << '\n';
            continue;
        }
        mergeSets(S[u], S[v]);
        cout << S[u].size() << '\n';
    }

    return 0;
}

We maintain n sets and process q merge operations. For each merge, we always merge the smaller set into the larger one, ensuring each element is moved only O(log n) times over the whole sequence. Clearing the smaller set after merging is optional but frees memory. Using unordered_set yields expected O(1) inserts; the total expected time is O(n log n) across all merges.

Time: amortized O(n log n + total_input) · Space: O(n + total_elements)

DSU on tree: number of distinct colors in each subtree
#include <bits/stdc++.h>
using namespace std;

// Compute for each node u the number of distinct colors in its subtree.
// We use DSU on tree (sack): keep a pointer to a bag (unordered_map color->count) per node.

int n;
vector<int> color;
vector<vector<int>> adj;
vector<int> ans;
vector<unordered_map<int,int>*> bag; // bag[u] points to the map for subtree u

void dfs(int u, int p) {
    int big = -1; size_t best = 0;

    // Process children first
    for (int v : adj[u]) if (v != p) {
        dfs(v, u);
        if (bag[v] && bag[v]->size() > best) {
            best = bag[v]->size();
            big = v;
        }
    }

    if (big != -1) {
        bag[u] = bag[big];  // adopt the largest child's bag
        bag[big] = nullptr; // prevent accidental reuse
    } else {
        bag[u] = new unordered_map<int,int>();
        bag[u]->reserve(4);
    }

    // Merge all small children into u's bag
    for (int v : adj[u]) if (v != p && v != big) {
        if (!bag[v]) continue;
        auto &src = *bag[v];
        auto &dst = *bag[u];
        if (dst.bucket_count() < src.size() * 2) dst.reserve(dst.size() + src.size() * 2);
        for (auto &kv : src) dst[kv.first] += kv.second;
        delete bag[v]; bag[v] = nullptr; // free the merged bag
    }

    // Add u's own color
    (*bag[u])[color[u]]++;

    // Distinct colors equals the number of keys in the map
    ans[u] = (int)bag[u]->size();
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    cin >> n;
    color.assign(n+1, 0);
    for (int i = 1; i <= n; ++i) cin >> color[i];
    adj.assign(n+1, {});
    for (int i = 0; i < n-1; ++i) {
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    ans.assign(n+1, 0);
    bag.assign(n+1, nullptr);

    dfs(1, 0);

    for (int i = 1; i <= n; ++i) cout << ans[i] << (i==n?'\n':' ');

    // Clean up remaining allocated bags (only the root's bag should remain non-null)
    for (int i = 1; i <= n; ++i) if (bag[i]) { delete bag[i]; bag[i] = nullptr; }

    return 0;
}

We compute distinct colors in each node’s subtree by maintaining, for each node, a frequency map keyed by color. We choose the largest child’s bag as the base for the parent and small-to-large merge all other children’s bags into it, then add the parent’s own color. The number of distinct colors equals the number of keys. The doubling argument guarantees O(n log n) expected time with unordered_map.

Time: O(n log n) expected · Space: O(n)

DSU on tree: sum of colors achieving maximum frequency (mode sum) in each subtree
#include <bits/stdc++.h>
using namespace std;

// For each node u, compute the sum of colors that appear with maximal frequency in u's subtree.
// This is a classic "Lomsat Gelral"-style problem solvable via small-to-large merging.

int n;
vector<int> color;
vector<vector<int>> adj;

vector<unordered_map<int,int>*> bag; // color -> count for subtree
vector<int> mxFreq;                  // maximal frequency for subtree
vector<long long> sumWithMx;         // sum of colors achieving mxFreq

void dfs(int u, int p) {
    int big = -1; size_t best = 0;

    for (int v : adj[u]) if (v != p) {
        dfs(v, u);
        if (bag[v] && bag[v]->size() > best) { best = bag[v]->size(); big = v; }
    }

    if (big != -1) {
        bag[u] = bag[big]; bag[big] = nullptr;
        mxFreq[u] = mxFreq[big];
        sumWithMx[u] = sumWithMx[big];
    } else {
        bag[u] = new unordered_map<int,int>();
        bag[u]->reserve(4);
        mxFreq[u] = 0;
        sumWithMx[u] = 0;
    }

    // Merge other children into u's bag and update mode info incrementally
    for (int v : adj[u]) if (v != p && v != big) {
        if (!bag[v]) continue;
        auto &src = *bag[v];
        auto &dst = *bag[u];
        if (dst.bucket_count() < src.size() * 2) dst.reserve(dst.size() + src.size() * 2);
        for (auto &kv : src) {
            int c = kv.first;
            int add = kv.second;
            int after = (dst[c] += add);
            if (after > mxFreq[u]) { mxFreq[u] = after; sumWithMx[u] = c; }
            else if (after == mxFreq[u]) { sumWithMx[u] += c; }
        }
        delete bag[v]; bag[v] = nullptr;
    }

    // Add u's own color and update mode info
    int after = (++(*bag[u])[color[u]]);
    if (after > mxFreq[u]) { mxFreq[u] = after; sumWithMx[u] = color[u]; }
    else if (after == mxFreq[u]) { sumWithMx[u] += color[u]; }
}

int main(){
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    cin >> n;
    color.assign(n+1, 0);
    for (int i = 1; i <= n; ++i) cin >> color[i];
    adj.assign(n+1, {});
    for (int i = 0; i < n-1; ++i) {
        int u, v; cin >> u >> v;
        adj[u].push_back(v);
        adj[v].push_back(u);
    }

    bag.assign(n+1, nullptr);
    mxFreq.assign(n+1, 0);
    sumWithMx.assign(n+1, 0);

    dfs(1, 0);

    for (int i = 1; i <= n; ++i) cout << sumWithMx[i] << (i==n?'\n':' ');

    for (int i = 1; i <= n; ++i) if (bag[i]) { delete bag[i]; bag[i] = nullptr; }
    return 0;
}

We maintain, for each subtree, a color frequency map plus two aggregates: mxFreq (the maximum frequency) and sumWithMx (the sum of colors that achieve that frequency). We adopt the largest child’s bag and merge all other children’s entries into it, updating the aggregates incrementally as counts increase. Finally, we add the current node’s color. Small-to-large ensures that each map entry is moved only O(log n) times across the whole tree.

Time: O(n log n) expected · Space: O(n)