🗂️Data StructureIntermediate

DSU with Weighted Edges

Key Points

  • A DSU with weighted edges (also called a potential or difference-constraint union-find) maintains relative values between elements while still supporting near-constant-time merges and finds.
  • Each node stores an offset (potential) to its parent so the difference between any two connected nodes can be computed as a difference of their potentials to the root.
  • When merging sets with a constraint like value[b] − value[a] = w, you attach one root under the other and set the new root’s offset so the new equation holds.
  • Path compression must also update offsets: when a node jumps directly to the root, its stored potential is adjusted to remain the offset to the new parent.
  • If a constraint contradicts existing relationships (e.g., a cycle with nonzero total weight), the structure detects it immediately.
  • This technique generalizes to many Abelian groups (e.g., sums modulo M, 2D vectors) as long as “differences” and addition/subtraction are defined.
  • The amortized time per operation is O( the inverse Ackermann function, essentially constant for all practical input sizes.
  • Common pitfalls include getting signs wrong in the merge formula, forgetting to update potentials during path compression, and mixing 0-based/1-based indices.

Prerequisites

  • Basic DSU (Union-Find)Weighted DSU builds directly on union and find operations with path compression and union by size/rank.
  • Prefix sums / telescoping sumsUnderstanding how path potentials accumulate and cancel clarifies why differences are pot[y] − pot[x].
  • Amortized analysis and inverse Ackermann functionExplains why DSU operations are effectively constant time on average.
  • Signed integer arithmetic and overflowPotentials are additive along paths; values can grow, requiring correct types (e.g., long long).
  • Graphs and connected componentsConstraints define edges; differences are defined only within components.
  • Modular arithmetic (optional)Many problems use parity/mod-M variants of weighted DSU.
  • Abelian groups (optional)Generalizes the structure to work with other additive groups like vectors or modulo spaces.

Detailed Explanation

Tap terms for definitions

01Overview

A Disjoint Set Union (DSU) with weighted edges augments the classic union-find structure so it can track numerical relationships between elements, not just connectivity. Instead of simply asking “Are a and b in the same set?”, we can also ask “What is value[b] − value[a]?” if a and b belong to the same connected component. This is useful in problems that give equations of the form value[b] − value[a] = w and ask us to add constraints, check consistency, or compute differences. The key idea is to store, for each node, an offset (also called potential) relative to its parent. The root of each component is a reference point with potential zero. The difference between any two nodes in the same component is then the difference of their potentials to the root.

The data structure supports three primary operations. First, find(x) returns the root of x and compresses the path to speed up future operations, while also updating potentials so they stay correct relative to the root. Second, unite(a, b, w) merges the components containing a and b while enforcing the constraint value[b] − value[a] = w; if they are already connected, it checks whether the new constraint is consistent. Third, diff(a, b) returns value[b] − value[a] if a and b are connected. With union by size/rank and path compression, all operations run in near-constant amortized time, making the structure practical for online constraint processing and contradiction detection.

02Intuition & Analogies

Imagine a group of friends each standing along a straight line (the number line). We don’t know anyone’s absolute position, but we’re told a chain of statements like “B is 5 meters to the right of A” or “C is −2 relative to B.” If you treat each person as a node and each statement as a rigid connector with a written offset, then the friends in one connected group can all be positioned consistently, except for an overall shift of the entire group. If you pick one friend in a group as the reference point (the “root”) and declare their position to be 0, then everyone else’s position is uniquely determined relative to that root.

The DSU acts like a set of such groups, each with its own hidden reference friend. The offset (potential) stored at a person tells you how far they are from their current parent in the tree structure internal to the DSU. If you walk from one person up the parent chain to the group’s root and add all the offsets you pass, you end up with that person’s position relative to the root. To get the difference between two people in the same group, just subtract their positions relative to the root.

When we merge two groups because we learned a new relation like “B is 7 to the right of A,” we attach one group’s root under the other and set exactly one new offset so that the statement is true. Path compression is like reorganizing the family tree for speed: we connect people directly to the root. But when we shortcut the path, we must also update their stored offsets so they still mean “position relative to the new parent (root).” If at any time a new statement clashes with what we already know (e.g., it creates a loop whose total signed distance is nonzero), we detect a contradiction.

03Formal Definition

We maintain a forest where each set is represented by a rooted tree. For each node x, store parent[x] and a potential (offset) pot[x] such that pot[x] = val[x] − val[parent[x]]. For a root r, parent[r] = r and pot[r] = 0. Define the root potential of x by accumulating along the path to the root: after find(x) with path compression, pot[x] becomes val[x] − val[root(x)]. For any two nodes x and y in the same component with root r, the difference is val[y] − val[x] = (val[y] − val[r]) − (val[x] − val[r]) = pot[y] − pot[x]. To merge with constraint val[b] − val[a] = w, let ), ) after calling find on both. Denote (i.e., val[a] − val[rx]) and (i.e., val[b] − val[ry]). If , the constraint is consistent iff pb − . If and we set parent[rx] = ry, we must define pot[rx] so that the constraint holds. Since val[b] − val[a] = (val[ry] + pb) − (val[rx] + pa) = w, we deduce val[rx] − val[ry] = −w + pb − pa, hence pot[rx] = val[rx] − val[parent[rx]] = val[rx] − val[ry] = −w + pb − pa. Path compression must preserve pot semantics: if and ), then after compressing we set + pot[p], so that pot[x] remains val[x] − val[r].

04When to Use

Use a weighted DSU when your constraints are pairwise additive differences over an undirected relationship graph, and you need to process them online: add constraints, check consistency, and answer difference queries. Typical problems include: (1) equations of the form val[b] − val[a] = w (e.g., distances, balances, potentials), (2) parity constraints (w modulo 2), (3) modular arithmetic relations (mod M), and (4) relative positions in 1D or even multi-dimensional offsets where addition/subtraction is defined component-wise.

It is ideal when you do not care about absolute values, only differences within each connected component. The structure implicitly fixes each component up to an additive constant, which is fine because many tasks (like contradiction detection or difference queries) are invariant under a global shift. If constraints come in a stream and you must answer queries quickly (e.g., competitive programming, online judge problems), this DSU outperforms general graph algorithms because of its near-constant amortized time per operation.

Avoid it if your constraints are not purely additive differences (e.g., inequalities like ≤, or multiplicative relations without a log transform), or if you must compute shortest-path distances in arbitrary weighted graphs; then you likely need Bellman–Ford, Dijkstra, or topological DP on DAGs. Also avoid it if constraints are global and not decomposable into pairwise differences.

⚠️Common Mistakes

  • Wrong sign convention: mixing up whether pot[x] means val[x] − val[parent[x]] or the opposite leads to incorrect merge formulas. Choose one convention and derive all formulas from it consistently.
  • Not updating potentials during path compression: if you set parent[x] directly to the root without adding pot[parent_old] to pot[x], your differences will be wrong. Always do pot[x] += pot[parent_old] when compressing.
  • Forgetting to call find before using pot[x]: pot[x] is only guaranteed to be relative to the current parent; after find it becomes relative to the root, which is needed to compare nodes from different subtrees.
  • Overflow with large weights: use 64-bit integers (long long) or wider if sums can exceed 32-bit range.
  • Missing contradiction checks: when rx == ry after a proposed unite, you must verify pb − pa == w; otherwise you silently accept inconsistent systems.
  • Attaching the wrong root or using the wrong formula when swapping roots: if you attach ry under rx, adjust the formula symmetrically (pot[ry] = w − pb + pa). Also remember to swap sizes/ranks with the corresponding formula.
  • Mixing 0-based and 1-based indices: a common competitive programming source of off-by-one bugs.
  • Assuming absolute values are determined: DSU only determines differences; absolute values are defined up to an additive constant per component unless you explicitly fix a root’s value.

Key Formulas

Potential to parent

Explanation: By definition, each node stores its value minus its parent’s value. For a root r, pot[r] = 0 since parent[r] = r.

Path compression update

Explanation: When we shortcut x to the root, we must add the parent’s potential so pot[x] still equals val[x] − val[root]. This preserves correctness after compression.

Difference via potentials

Explanation: Differences between two nodes equal the difference of their potentials to the common root. If they’re not connected, the difference is undefined.

Merge formula (one direction)

Explanation: To enforce val[b] − val[a] = w when parent[] = , the new potential at must be set so that the equation holds for all nodes in the merged component.

Merge formula (symmetric)

Explanation: This is the symmetric case of the previous formula when we choose the opposite attachment direction. Use exactly one of these based on which root becomes the child.

Cycle consistency

Explanation: For a consistent system, the signed sum of weights along any closed loop must be zero. A nonzero sum indicates a contradiction.

Amortized complexity

Explanation: With union by size/rank and path compression, m operations over n elements cost near-linear time using the inverse Ackermann function, which grows extremely slowly.

Potential accumulation

Explanation: The potential from a node to the root equals the telescoping sum of edge offsets along the path. Path compression makes this sum a direct lookup.

Component anchor

Explanation: Within a component with root r, each absolute value equals an arbitrary component constant plus its potential to the root. Only differences are uniquely defined.

Modular generalization

Explanation: When working modulo M, all formulas hold with arithmetic taken modulo M, allowing parity/mod-M constraint systems.

Complexity Analysis

Let n be the number of elements and m the number of operations (find, unite, diff). A weighted DSU augments a classic DSU with a single additional array of potentials and a few constant-time arithmetic updates per link or compression. With union by size/rank and path compression, the amortized cost of each operation is O( where α is the inverse Ackermann function. In practice, ≤ 5 for any realistic input size, so operations are effectively constant time. Specifically, find(x) walks up the parent chain to the root and performs a constant amount of work per visited node (including one addition to update the potential during path compression). The amortized number of visited nodes per find is bounded by The unite(a,b,w) operation invokes two finds (to obtain roots and updated potentials) and then either attaches one root to the other with a constant number of updates or checks for a contradiction—thus also O( amortized. A diff(a,b) query requires two finds and one subtraction, so it is likewise O( Space usage is O(n) for the parent array, O(n) for the size/rank array, and O(n) for the potentials, totaling O(n). If you store additional metadata (e.g., timestamps, component sizes), the asymptotic space remains linear. In modular or multi-dimensional variants, potentials store bigger types (e.g., pairs), but the asymptotic space is still O(n).

Code Examples

Core Weighted DSU supporting unite (with contradiction check) and diff queries
1#include <bits/stdc++.h>
2using namespace std;
3
4struct WeightedDSU {
5 int n;
6 vector<int> parent, sz;
7 vector<long long> pot; // pot[x] = val[x] - val[parent[x]]; for roots pot[root] = 0
8
9 WeightedDSU(int n = 0) { init(n); }
10
11 void init(int n_) {
12 n = n_;
13 parent.resize(n);
14 sz.assign(n, 1);
15 pot.assign(n, 0);
16 iota(parent.begin(), parent.end(), 0);
17 }
18
19 // Find root of x and compress path, updating pot[x] to be relative to root
20 int find(int x) {
21 if (parent[x] == x) return x;
22 int p = parent[x];
23 int r = find(p);
24 pot[x] += pot[p]; // accumulate offset to the new parent (the root)
25 parent[x] = r;
26 return r;
27 }
28
29 // Return val[x] - val[root(x)] after ensuring compression
30 long long weight_to_root(int x) {
31 find(x);
32 return pot[x];
33 }
34
35 // Return whether a and b are connected
36 bool same(int a, int b) { return find(a) == find(b); }
37
38 // Get val[b] - val[a] if connected; throws if not connected
39 long long diff(int a, int b) {
40 int ra = find(a);
41 int rb = find(b);
42 if (ra != rb) throw runtime_error("Not connected");
43 return pot[b] - pot[a];
44 }
45
46 // Unite with constraint: val[b] - val[a] = w
47 // Returns true if successful, false if contradiction detected
48 bool unite(int a, int b, long long w) {
49 int ra = find(a);
50 int rb = find(b);
51 long long pa = pot[a]; // val[a] - val[ra]
52 long long pb = pot[b]; // val[b] - val[rb]
53
54 if (ra == rb) {
55 // Check consistency: (val[b]-val[a]) must equal w
56 return (pb - pa) == w;
57 }
58 // Union by size: attach smaller under larger
59 if (sz[ra] < sz[rb]) {
60 // Attach ra under rb. Need pot[ra] = -w + pb - pa
61 parent[ra] = rb;
62 pot[ra] = -w + pb - pa;
63 sz[rb] += sz[ra];
64 } else {
65 // Attach rb under ra. Symmetric formula: pot[rb] = w - pb + pa
66 parent[rb] = ra;
67 pot[rb] = w - pb + pa;
68 sz[ra] += sz[rb];
69 }
70 return true;
71 }
72};
73
74int main() {
75 ios::sync_with_stdio(false);
76 cin.tie(nullptr);
77
78 // Demo: 5 nodes (0..4). Add constraints and query differences.
79 WeightedDSU dsu(5);
80
81 // val[1] - val[0] = 3
82 cout << (dsu.unite(0, 1, 3) ? "OK" : "Contradiction") << "\n";
83
84 // val[2] - val[1] = 4 => val[2] - val[0] should be 7
85 cout << (dsu.unite(1, 2, 4) ? "OK" : "Contradiction") << "\n";
86
87 // Query val[2] - val[0]
88 try {
89 cout << "diff(0,2) = " << dsu.diff(0, 2) << "\n"; // Expect 7
90 } catch (...) { cout << "Unknown\n"; }
91
92 // Add a consistent relation: val[2] - val[0] = 7
93 cout << (dsu.unite(0, 2, 7) ? "OK" : "Contradiction") << "\n"; // OK
94
95 // Add a contradictory relation: val[2] - val[0] = 8 (conflicts with 7)
96 cout << (dsu.unite(0, 2, 8) ? "OK" : "Contradiction") << "\n"; // Contradiction
97
98 // Unknown difference (different components)
99 try {
100 cout << "diff(3,4) = " << dsu.diff(3, 4) << "\n";
101 } catch (...) {
102 cout << "Unknown\n"; // not connected yet
103 }
104
105 // Connect 3 and 4: val[4] - val[3] = -2
106 cout << (dsu.unite(3, 4, -2) ? "OK" : "Contradiction") << "\n";
107 cout << "diff(3,4) = " << dsu.diff(3, 4) << "\n"; // -2
108
109 return 0;
110}
111

This program implements a weighted DSU where pot[x] stores val[x] − val[parent[x]]. The unite(a,b,w) operation enforces val[b] − val[a] = w by attaching one root under the other and setting exactly one new potential to make the equation true. Path compression updates potentials to remain relative to the root. The demo shows consistent and contradictory insertions and how to query differences.

Time: O(α(n)) amortized per operationSpace: O(n)
Online processing of constraints and queries: add relation or ask difference
1#include <bits/stdc++.h>
2using namespace std;
3
4struct WeightedDSU {
5 int n; vector<int> parent, sz; vector<long long> pot;
6 WeightedDSU(int n=0){init(n);}
7 void init(int n_){n=n_; parent.resize(n); sz.assign(n,1); pot.assign(n,0); iota(parent.begin(), parent.end(), 0);}
8 int find(int x){ if(parent[x]==x) return x; int p=parent[x]; int r=find(p); pot[x]+=pot[p]; parent[x]=r; return r; }
9 bool same(int a,int b){ return find(a)==find(b);}
10 long long diff(int a,int b){ int ra=find(a), rb=find(b); if(ra!=rb) throw runtime_error("Unknown"); return pot[b]-pot[a]; }
11 bool unite(int a,int b,long long w){ int ra=find(a), rb=find(b); long long pa=pot[a], pb=pot[b]; if(ra==rb) return (pb-pa)==w; if(sz[ra]<sz[rb]){ parent[ra]=rb; pot[ra]=-w+pb-pa; sz[rb]+=sz[ra]; } else { parent[rb]=ra; pot[rb]=w-pb+pa; sz[ra]+=sz[rb]; } return true; }
12};
13
14int main(){
15 ios::sync_with_stdio(false); cin.tie(nullptr);
16
17 // Input format (example):
18 // n q
19 // Then q lines of two types:
20 // 1 a b w -> add constraint val[b] - val[a] = w
21 // 2 a b -> query val[b] - val[a] or print "Unknown" if not connected
22 // Indices are 1-based in input; we convert to 0-based.
23
24 int n, q; if(!(cin >> n >> q)) return 0;
25 WeightedDSU dsu(n);
26 for(int i=0;i<q;i++){
27 int t; cin >> t;
28 if(t==1){
29 int a,b; long long w; cin >> a >> b >> w; --a; --b;
30 bool ok = dsu.unite(a,b,w);
31 if(!ok) cout << "Contradiction\n"; else cout << "OK\n";
32 } else if(t==2){
33 int a,b; cin >> a >> b; --a; --b;
34 if(!dsu.same(a,b)) cout << "Unknown\n";
35 else cout << dsu.diff(a,b) << "\n";
36 } else {
37 // Optional: handle invalid type
38 cout << "Invalid query type\n";
39 }
40 }
41 return 0;
42}
43

This solution processes a typical competitive programming stream: add difference constraints and answer difference queries. For type-1 operations it merges sets or detects contradictions; for type-2 operations it prints the known difference if the nodes are connected or "Unknown" otherwise. It demonstrates careful 1-based to 0-based index conversion, and the exact formulas required for correctness.

Time: O(α(n)) amortized per operation; total O(q α(n))Space: O(n)
2D extension: DSU with vector offsets (relative positions in a plane)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Vec2 {
5 long long x=0, y=0;
6 Vec2(){} Vec2(long long x_, long long y_):x(x_),y(y_){}
7 Vec2 operator+(const Vec2& o) const { return Vec2(x+o.x, y+o.y); }
8 Vec2 operator-(const Vec2& o) const { return Vec2(x-o.x, y-o.y); }
9 Vec2& operator+=(const Vec2& o){ x+=o.x; y+=o.y; return *this; }
10};
11
12struct DSU2D {
13 int n; vector<int> parent, sz; vector<Vec2> pot; // pot[v] = pos[v] - pos[parent[v]] (vector difference)
14 DSU2D(int n=0){init(n);}
15 void init(int n_){ n=n_; parent.resize(n); sz.assign(n,1); pot.assign(n, Vec2()); iota(parent.begin(), parent.end(), 0);}
16 int find(int x){ if(parent[x]==x) return x; int p=parent[x]; int r=find(p); pot[x]+=pot[p]; parent[x]=r; return r; }
17 bool same(int a,int b){ return find(a)==find(b);}
18 Vec2 diff(int a,int b){ int ra=find(a), rb=find(b); if(ra!=rb) throw runtime_error("Unknown"); return Vec2(pot[b].x - pot[a].x, pot[b].y - pot[a].y); }
19 // Enforce pos[b] - pos[a] = w (as a 2D vector)
20 bool unite(int a,int b, Vec2 w){ int ra=find(a), rb=find(b); Vec2 pa=pot[a], pb=pot[b]; if(ra==rb){ return (pb.x - pa.x == w.x) && (pb.y - pa.y == w.y); }
21 if(sz[ra] < sz[rb]){ parent[ra]=rb; // pot[ra] = -w + pb - pa
22 pot[ra] = Vec2(-w.x + pb.x - pa.x, -w.y + pb.y - pa.y); sz[rb]+=sz[ra]; }
23 else { parent[rb]=ra; // pot[rb] = w - pb + pa
24 pot[rb] = Vec2(w.x - pb.x + pa.x, w.y - pb.y + pa.y); sz[ra]+=sz[rb]; }
25 return true; }
26};
27
28int main(){
29 ios::sync_with_stdio(false); cin.tie(nullptr);
30
31 // Suppose we know relative positions (vectors) between pairs of robots on a plane.
32 DSU2D dsu(4); // robots 0..3
33 // pos[1] - pos[0] = (3, 1)
34 cout << (dsu.unite(0,1, Vec2(3,1)) ? "OK" : "Contradiction") << "\n";
35 // pos[2] - pos[1] = (-2, 4)
36 cout << (dsu.unite(1,2, Vec2(-2,4)) ? "OK" : "Contradiction") << "\n";
37 // Query pos[2] - pos[0] -> should be (1, 5)
38 try{
39 Vec2 d = dsu.diff(0,2);
40 cout << "diff(0,2) = (" << d.x << "," << d.y << ")\n";
41 }catch(...){ cout << "Unknown\n"; }
42
43 // Contradictory relation: claim pos[2] - pos[0] = (2,5)
44 cout << (dsu.unite(0,2, Vec2(2,5)) ? "OK" : "Contradiction") << "\n";
45
46 return 0;
47}
48

This example extends the idea from scalars to 2D vectors. The same formulas apply component-wise because (Z^2, +) is an Abelian group. It shows that potentials can represent relative positions in a plane and still support O(α(n)) operations.

Time: O(α(n)) amortized per operationSpace: O(n)