Rollback DSU
Key Points
- •Rollback DSU (Disjoint Set Union with undo) lets you union sets and later revert to any previous state in LIFO order.
- •It uses union-by-size/rank but deliberately avoids path compression so that changes are easy to reverse.
- •All changes made during a merge are pushed onto a stack; rollback simply pops and restores them.
- •Each union/find/rollback runs in O( n) worst case because tree height stays at most O( n).
- •Rollback DSU is ideal for offline divide-and-conquer on queries, especially dynamic connectivity over time.
- •You can take a snapshot (stack size) and rollback to that snapshot later in O(k n) where k merges happened after the snapshot.
- •It can be augmented to maintain global aggregates (e.g., number of components, pairs in same component) and still support rollback.
- •Typical use cases include segment tree over time for dynamic connectivity, Kruskal-like offline processing, and parallel binary search.
Prerequisites
- →Basic DSU / Union-Find — Understanding union, find, union-by-size/rank, and component representation is essential.
- →Stacks and LIFO semantics — Rollback relies on undoing changes in reverse order using a stack of changes.
- →Divide and Conquer on intervals — Commonly used with rollback DSU to apply changes within subranges and undo them afterward.
- →Segment Tree — Implements the time-partitioning structure to store edge lifespans across O(log Q) nodes.
- →Amortized and worst-case analysis — To reason about O(log n) per operation and total bounds across offline traversals.
- →Graph connectivity basics — Rollback DSU is primarily used to maintain connected components in graphs.
- →Offline algorithm design — You must reformulate problems to process all operations/queries in a non-online order.
Detailed Explanation
Tap terms for definitions01Overview
Rollback DSU (also called DSU with undo) is a variant of the Disjoint Set Union (Union-Find) data structure that supports efficient undo operations. Like standard DSU, it maintains a partition of elements into disjoint sets and supports union and find operations. The key twist is that every change made during a union is recorded on a stack so that we can revert (rollback) those changes later. This makes it perfect for offline algorithms where we explore different scenarios (e.g., different time intervals of a dynamic graph) and need to return to earlier states without reconstructing data from scratch. To enable rollback, we use union-by-size or union-by-rank to keep trees shallow, but we do not use path compression (since it makes rolling back very difficult). The result is O(log n) worst-case time per find/union/rollback, because union-by-size/rank guarantees logarithmic height of trees. Rollback DSU is commonly combined with divide-and-conquer over time, a segment tree over time intervals, or parallel binary search to solve dynamic connectivity problems offline. It can also be augmented to track additional global information, like the number of components, total number of connected pairs, or bipartiteness, while keeping the ability to rollback.
02Intuition & Analogies
Imagine building something with LEGO and taking pictures at various stages. A normal DSU is like building and compressing pieces together so tightly that it’s hard to separate them later (path compression). A Rollback DSU is like assembling pieces but keeping a careful log of every connection you made, so that you can go back to a previous picture by undoing the last few steps in reverse order. The log is a stack: last change made is the first one undone. Why avoid path compression? Path compression flattens paths by changing many parent pointers at once, which is great for speed but terrible for undoing—it’s like welding many pieces in one go. Union-by-size/rank is different: you only connect one root under another and update one size counter. That’s just a couple of reversible edits, easy to push onto the stack. The tree might be a bit taller without path compression, but union-by-size/rank still keeps it short enough (logarithmic height). Now, think about dynamic connectivity over time, where edges are added and removed as time passes. If we build a segment tree over time intervals, then as we travel down the tree, we “apply” edges active in that interval. When we return from that interval, we rollback—restoring DSU to the exact state before we descended. This is like exploring alternate timelines in a story: you go down one branch (apply changes), then return to the branching point (rollback) and explore another branch, all without rebuilding from scratch.
03Formal Definition
04When to Use
Use Rollback DSU when you need to explore many versions of a graph’s connectivity offline and you can organize the exploration so that changes are applied and then undone in LIFO order. Classic scenarios include:
- Dynamic connectivity offline: Build a segment tree over time. Each edge’s active interval [l, r) is added to nodes covering that interval. DFS the tree: apply unions when entering a node and rollback when leaving. Answer connectivity queries at leaves (timestamps).
- Divide-and-conquer on queries: When a problem can be split into subranges where each subrange needs a subset of edges applied, a rollback-friendly structure is ideal to move between subproblems cheaply.
- Parallel binary search: For many threshold queries (e.g., minimal weight to connect u and v), you can evaluate midpoints in batches by applying certain edges, answering, and rolling back.
- Kruskal-like offline processes: Variants that need to consider edges in different orders per subproblem and then revert to a base state benefit from rollback. Choose Rollback DSU when rebuild-from-scratch would be too slow, persistent DSU is overkill/complex, and your algorithm’s navigation through states is naturally stack-like (enter subproblem → apply changes → leave subproblem → undo changes).
⚠️Common Mistakes
- Using path compression: It makes find almost O(1) amortized but breaks rollback because a single find can change many parent pointers. In Rollback DSU, never do path compression; rely on union-by-size/rank instead.
- Forgetting to store enough info for rollback: When you union, you must record all mutated fields to restore them: the child root whose parent changed, and the previous size/rank of the new parent root. If you also maintain aggregates (e.g., number of connected pairs), store their previous values or store enough data to recompute them exactly during rollback.
- Pushing changes for unsuccessful unions: If two elements are already connected, do not change anything, and do not push a change unless your rollback logic expects it. The simplest approach uses snapshot = stack size; if no change happened, nothing needs to be popped.
- Not normalizing edges in dynamic connectivity: When mapping edges to activation intervals, always normalize endpoints (u < v) so that pairs can be matched correctly between add/remove events.
- Memory blow-up: Each successful union pushes a change. In segment-tree-over-time solutions, each edge appears in O(log Q) nodes. Ensure total pushes remain within memory (O(M log Q)), and use compact change records.
- Incorrect rollback order: Rollback must pop in exact reverse order of applied changes. Mixing multiple DSU instances or interleaving rollbacks to the wrong snapshot corrupts state. Always save a snapshot on entry and rollback to it on exit of a recursive frame.
Key Formulas
Tree Height under Union-by-Size
Explanation: The height h of any set’s tree is bounded by the base-2 logarithm of the number of elements because each time you move up a level, the size at least doubles. This ensures find and union take O(log n) time.
Per-Operation Time Bounds
Explanation: With union-by-size (no path compression), find, union, and each undo step traverse at most logarithmic height. These are worst-case guarantees, not just amortized.
Space Usage
Explanation: Space is linear in the number of elements n plus the number U of successful unions recorded on the change stack. Each successful merge contributes one change record.
Pairs in Same Component
Explanation: If components are , ..., , the number of unordered pairs within the same component equals the sum of combinations over component sizes. This aggregate can be updated on union and rolled back.
Pairs Delta on Merge
Explanation: Merging components of sizes a and b creates exactly a·b new intra-component pairs. This identity is useful for maintaining aggregates during unions and rollbacks.
Divide-and-Conquer over Time
Explanation: When traversing a segment tree over Q time points, each node applies k unions. The recurrence shows total work across the tree, leading to O((M + Q) Q n) overall for dynamic connectivity.
Edge-to-Segment Expansion
Explanation: Each edge interval [l, r) is stored in O(log Q) segment tree nodes. If there are M intervals, total recorded edge appearances is O(M log Q).
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Rollback DSU with union-by-size and NO path compression. 5 // Supports: find (iterative), unite, snapshot, rollback, components count. 6 struct DSURollback { 7 int n; // number of elements 8 vector<int> parent, sz; // parent pointers and sizes (valid at roots) 9 vector<pair<int,int>> stk; // change stack: (child_root, prev_size_of_parent_root) 10 int comps; // current number of components 11 12 DSURollback(int n = 0) { init(n); } 13 14 void init(int n_) { 15 n = n_; 16 parent.resize(n); 17 sz.assign(n, 1); 18 iota(parent.begin(), parent.end(), 0); 19 stk.clear(); 20 comps = n; 21 } 22 23 // Find root without path compression (iterative to keep it fast and rollback-friendly) 24 int find(int x) { 25 while (parent[x] != x) x = parent[x]; 26 return x; 27 } 28 29 // Returns true if a merge happened. 30 bool unite(int a, int b) { 31 a = find(a); b = find(b); 32 if (a == b) return false; 33 if (sz[a] < sz[b]) swap(a, b); // ensure a is the larger root 34 // Record the change: child root b will point to a; store previous size of a 35 stk.emplace_back(b, sz[a]); 36 parent[b] = a; 37 sz[a] += sz[b]; 38 comps--; // one fewer component 39 return true; 40 } 41 42 // Save current stack size as a snapshot token 43 int snapshot() const { return (int)stk.size(); } 44 45 // Roll back to the state when stack size was s 46 void rollback(int s) { 47 while ((int)stk.size() > s) { 48 auto [b, prev_sz_a] = stk.back(); 49 stk.pop_back(); 50 int a = parent[b]; // currently, b's parent is a 51 parent[b] = b; // detach b to make it a root again 52 sz[a] = prev_sz_a; // restore size of a 53 comps++; // we just undid one successful union 54 } 55 } 56 57 int components() const { return comps; } 58 }; 59 60 int main() { 61 ios::sync_with_stdio(false); 62 cin.tie(nullptr); 63 64 // Demo: basic operations with snapshots and rollbacks 65 int n = 6; 66 DSURollback dsu(n); 67 68 // Snapshot S0 69 int S0 = dsu.snapshot(); 70 dsu.unite(0, 1); 71 dsu.unite(2, 3); 72 cout << "Components after two unions: " << dsu.components() << "\n"; // expect 4 73 74 // Nested snapshot S1 75 int S1 = dsu.snapshot(); 76 dsu.unite(1, 2); // merges {0,1} with {2,3} 77 cout << "Components after merging (1,2): " << dsu.components() << "\n"; // expect 3 78 79 // Rollback to S1: undo last merge only 80 dsu.rollback(S1); 81 cout << "Components after rollback to S1: " << dsu.components() << "\n"; // expect 4 82 83 // Rollback to S0: undo all merges since S0 84 dsu.rollback(S0); 85 cout << "Components after rollback to S0: " << dsu.components() << "\n"; // expect 6 86 87 return 0; 88 } 89
This program implements a rollback-capable DSU with union-by-size and a stack of changes. Each successful union records which root became a child and the previous size of the new parent. A snapshot is simply the current stack size; rolling back pops changes until the stack reaches the snapshot size, restoring parents, sizes, and the component count. The demo shows nested snapshots and selective rollback.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Rollback DSU as before 5 struct DSURollback { 6 int n; vector<int> parent, sz; vector<pair<int,int>> stk; int comps; 7 DSURollback(int n=0){init(n);} void init(int n_){n=n_; parent.resize(n); sz.assign(n,1); iota(parent.begin(),parent.end(),0); stk.clear(); comps=n;} 8 int find(int x){ while(parent[x]!=x) x=parent[x]; return x; } 9 bool unite(int a,int b){ a=find(a); b=find(b); if(a==b) return false; if(sz[a]<sz[b]) swap(a,b); stk.emplace_back(b, sz[a]); parent[b]=a; sz[a]+=sz[b]; comps--; return true; } 10 int snapshot() const { return (int)stk.size(); } 11 void rollback(int s){ while((int)stk.size()>s){ auto [b, prev]=stk.back(); stk.pop_back(); int a=parent[b]; parent[b]=b; sz[a]=prev; comps++; } } 12 }; 13 14 // We support three operations: 15 // 1) add u v 16 // 2) remove u v 17 // 3) query 18 // Input format example: 19 // n q 20 // <q lines, each is: "add u v" or "rem u v" or "qry"> 21 // nodes are 0-indexed or 1-indexed; we will accept 1-indexed in input and convert to 0-indexed. 22 // Output: for each query, print the current number of connected components. 23 24 struct Edge { int u, v; }; 25 26 struct SegTree { 27 int N; // timeline size (number of operations) 28 vector<vector<Edge>> tree; // tree[node] holds edges active in that segment 29 SegTree(int n=0){init(n);} 30 void init(int n){ N=n; tree.assign(4*N, {}); } 31 void add_interval(int node, int l, int r, int ql, int qr, const Edge &e){ 32 if(qr<=l || r<=ql) return; // disjoint 33 if(ql<=l && r<=qr){ tree[node].push_back(e); return; } 34 int mid=(l+r)/2; 35 add_interval(node*2, l, mid, ql, qr, e); 36 add_interval(node*2+1, mid, r, ql, qr, e); 37 } 38 void add_interval(int l, int r, const Edge &e){ add_interval(1,0,N,l,r,e); } 39 }; 40 41 int main(){ 42 ios::sync_with_stdio(false); 43 cin.tie(nullptr); 44 45 int n, q; 46 if(!(cin >> n >> q)) return 0; 47 48 struct Op { string type; int u=-1, v=-1; }; 49 vector<Op> ops(q); 50 51 auto norm = [&](int &u, int &v){ if(u>v) swap(u,v); }; 52 53 // Track activation intervals for edges 54 map<pair<int,int>, int> on; // edge -> start time 55 vector<pair<pair<int,int>, pair<int,int>>> intervals; // ((u,v), (l,r)) 56 57 for(int i=0;i<q;i++){ 58 string t; cin >> t; ops[i].type=t; 59 if(t=="add" || t=="rem"){ 60 int u,v; cin >> u >> v; --u; --v; // convert to 0-index 61 norm(u,v); ops[i].u=u; ops[i].v=v; 62 if(t=="add"){ on[{u,v}] = i; } 63 else{ // rem 64 auto it = on.find({u,v}); 65 if(it==on.end()){ 66 // If input guarantees valid pairs, this won't happen. 67 // Otherwise, ignore or handle error. 68 }else{ 69 intervals.push_back({{u,v},{it->second, i}}); 70 on.erase(it); 71 } 72 } 73 }else{ 74 // query: nothing to store besides type 75 } 76 } 77 // Remaining active edges last until time q 78 for(auto &kv : on){ intervals.push_back({kv.first, {kv.second, q}}); } 79 80 // Build segment tree over [0, q) 81 SegTree seg(q); 82 for(auto &it : intervals){ 83 int u = it.first.first, v = it.first.second; 84 int l = it.second.first, r = it.second.second; 85 seg.add_interval(l, r, {u,v}); 86 } 87 88 DSURollback dsu(n); 89 90 vector<int> answers; answers.reserve(q); 91 92 function<void(int,int,int)> dfs = [&](int node, int l, int r){ 93 int snap = dsu.snapshot(); 94 // Apply edges active in this segment 95 for(const auto &e : seg.tree[node]) dsu.unite(e.u, e.v); 96 if(l+1 == r){ 97 if(ops[l].type == "qry"){ 98 answers.push_back(dsu.components()); 99 } 100 }else{ 101 int mid=(l+r)/2; 102 dfs(node*2, l, mid); 103 dfs(node*2+1, mid, r); 104 } 105 dsu.rollback(snap); 106 }; 107 108 dfs(1, 0, q); 109 110 for(int x : answers) cout << x << '\n'; 111 return 0; 112 } 113
We preprocess add/remove operations to build activation intervals [l, r) for each undirected edge (normalized with u < v). These intervals are added to a segment tree over time, so each edge appears in O(log Q) nodes. A DFS over the segment tree applies edges at entry (using DSU unite) and rollbacks on exit to restore state. We answer queries at leaves (time points). This is the classic offline dynamic connectivity pattern using rollback DSU.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Rollback DSU augmented to maintain total number of unordered pairs (i,j) in the same component. 5 // We store previous pairs value per change to make rollback O(1) per pop. 6 struct DSURollbackPairs { 7 int n; vector<int> parent, sz; 8 struct Change { int child; int prev_size_parent; long long prev_pairs; }; 9 vector<Change> stk; 10 int comps; long long pairs_in_same; 11 12 DSURollbackPairs(int n=0){ init(n); } 13 void init(int n_){ n=n_; parent.resize(n); sz.assign(n,1); iota(parent.begin(), parent.end(), 0); stk.clear(); comps=n; pairs_in_same=0; } 14 15 int find(int x){ while(parent[x]!=x) x=parent[x]; return x; } 16 17 static inline long long C2(long long x){ return x*(x-1)/2; } 18 19 bool unite(int a, int b){ 20 a=find(a); b=find(b); if(a==b) return false; if(sz[a]<sz[b]) swap(a,b); 21 // Before merge, store prev state for rollback 22 Change ch; ch.child=b; ch.prev_size_parent=sz[a]; ch.prev_pairs=pairs_in_same; 23 stk.push_back(ch); 24 // Update DSU links and sizes 25 parent[b]=a; sz[a]+=sz[b]; comps--; 26 // Update aggregate: merge adds sz[a_old]*sz[b] 27 long long a_old = ch.prev_size_parent; long long b_sz = sz[b]; 28 pairs_in_same += a_old * b_sz; // since C2(a_old + b_sz) - C2(a_old) - C2(b_sz) = a_old*b_sz 29 return true; 30 } 31 32 int snapshot() const { return (int)stk.size(); } 33 void rollback(int s){ 34 while((int)stk.size() > s){ 35 Change ch = stk.back(); stk.pop_back(); 36 int b = ch.child; int a = parent[b]; 37 // Restore aggregate value first 38 pairs_in_same = ch.prev_pairs; 39 // Restore DSU links and sizes 40 parent[b]=b; sz[a]=ch.prev_size_parent; comps++; 41 } 42 } 43 }; 44 45 int main(){ 46 ios::sync_with_stdio(false); 47 cin.tie(nullptr); 48 49 int n = 5; // nodes 0..4 50 DSURollbackPairs dsu(n); 51 52 auto show = [&](){ 53 cout << "components=" << dsu.comps << ", pairs_in_same=" << dsu.pairs_in_same << "\n"; 54 }; 55 56 int S0 = dsu.snapshot(); 57 dsu.unite(0,1); show(); // pairs += 1 58 dsu.unite(3,4); show(); // pairs += 1 59 int S1 = dsu.snapshot(); 60 dsu.unite(1,3); show(); // merging sizes 2 and 2 adds 4 pairs (total now 6) 61 dsu.rollback(S1); show(); // back to after two merges (pairs=2) 62 dsu.rollback(S0); show(); // back to initial (pairs=0) 63 64 return 0; 65 } 66
This variant maintains a running total of unordered pairs within the same component. On merge of components with sizes a and b, the number of pairs increases by a·b. We store the previous pairs value and the previous size of the new parent on the stack, so rollback simply restores these fields along with parent pointers. This pattern generalizes to many global aggregates.