Directed MST (Edmonds/Chu-Liu Algorithm)
Key Points
- •A directed minimum spanning arborescence (MSA) is a minimum-cost set of edges that makes every vertex reachable from a chosen root with exactly one incoming edge per non-root vertex.
- •Edmonds/Chu–Liu’s algorithm solves the MSA by picking the cheapest incoming edge for each vertex, contracting any cycles that appear, adjusting edge weights, and recursing on the contracted graph.
- •If no cycle forms after picking minimum incoming edges, those edges already form the optimal arborescence.
- •Cycle contraction preserves optimality by subtracting the chosen in-edge costs from incoming edges to the cycle, ensuring weights remain nonnegative and enabling recursion.
- •The naive implementation runs in O(VE) time; with advanced data structures (e.g., skew/Fibonacci heaps and lazy propagation) it can be improved to roughly O(E log V).
- •You must check that every vertex is reachable from the root in the original graph; otherwise, a directed MSA does not exist.
- •Edmonds’ algorithm works with negative edge weights and parallel edges, but you must be careful with self-loops and weight adjustments during contraction.
- •Applications include optimal broadcast trees, dependency resolution under direction constraints, and network design with one-way links.
Prerequisites
- →Directed graphs and reachability — Understanding directed edges, paths, and reachability is essential to define an arborescence and to pre-check feasibility.
- →Cycle detection — Edmonds requires detecting and contracting directed cycles formed by chosen in-edges.
- →Disjoint Set Union (DSU) — DSU helps conceptualize/implement cycle contraction and component representation, especially in optimized versions.
- →Priority queues / heaps — Optimized implementations keep per-vertex heaps of incoming edges for faster minimum selection and merges.
- →Asymptotic analysis — To analyze O(VE) vs O(E log V) complexity and choose an appropriate approach for given constraints.
- →Minimum Spanning Tree (undirected) — Contrasting with MST clarifies why directed MSAs require different rules (in-edges and cycle contraction).
- →C++ vectors, structs, and I/O — To implement the algorithm safely with dynamic arrays and strong typing.
Detailed Explanation
Tap terms for definitions01Overview
A directed minimum spanning arborescence (MSA), also called a minimum spanning branching, generalizes the idea of an MST to directed graphs. Given a directed graph with edge weights and a chosen root r, we want a set of directed edges so that every vertex (except r) has exactly one incoming edge, and all vertices are reachable from r. Among all such structures, we choose one with minimum total weight. Edmonds/Chu–Liu’s algorithm is the classic solution: it repeatedly selects the minimum incoming edge for each vertex, detects cycles created by these selections, and contracts each cycle into a single super-vertex. Before recursing, it adjusts edge weights to preserve optimality by subtracting the cycle’s chosen in-edge costs from every edge entering the cycle. This process continues until there are no cycles, at which point the chosen edges form the optimal arborescence. The algorithm is robust to negative weights, allows multiple edges, and only requires that every vertex be reachable from the root for a solution to exist. The straightforward implementation runs in O(VE) time, sufficient for many competitive programming tasks; with more advanced data structures (priority/skew/Fibonacci heaps and careful lazy propagation), it can be brought closer to O(E log V).
02Intuition & Analogies
Think of building a company-wide broadcast plan from the CEO (root) to every employee. For each employee, you’d like to pick the cheapest manager who can send them a message (the minimum incoming edge). If everyone picks their cheapest manager and no circular reporting chain appears, you’re done: each employee has exactly one manager, and the CEO can reach everyone through these reporting lines. But if a circular chain appears (Alice reports to Bob, Bob to Carol, Carol back to Alice), we have a problem: cycles can’t exist in a tree. To resolve this fairly, we temporarily treat the cycle as a single “super-employee.” We also account for the fact that everyone in the cycle already got some benefit (their cheapest manager), so any outside offer to the cycle should be compared after subtracting those benefits—this is the weight adjustment. Intuitively, you’re saying: the cycle is internally satisfied at a certain base cost; any external link must beat that base, so we normalize incoming offers to the cycle by removing the internal credits. Now, with the cycle replaced by one super-employee and normalized incoming offers, we repeat the process. This shrinking-and-normalizing continues until no cycles remain. At that point, everyone has a unique manager chain leading to the CEO, and the sum of the chosen offers is minimal. When we expand the cycles back, we keep the internal cheapest links except at the one point where an external offer enters, breaking the cycle and maintaining a directed tree structure.
03Formal Definition
04When to Use
Use Edmonds/Chu–Liu when you need a minimum-cost directed tree rooted at a specific node. Typical scenarios: (1) networking/broadcast design where links are directional (e.g., asymmetric wireless links), and you want the cheapest way to reach all nodes from a central router; (2) dependency resolution where edges indicate allowable build or data-flow directions, and you want a minimal set of dependencies reaching all components from a root; (3) assigning supervisors or gateways in hierarchical systems with directional constraints; (4) robotics or control-flow graphs where transitions are directed and you require a minimal incoming control edge per node; (5) compiler backends and IR optimization contexts where directed structures with root reachability are desired. Prefer Edmonds over undirected MST algorithms (Kruskal/Prim) whenever edges have direction and each non-root must have in-degree exactly one. If the graph is a DAG, Edmonds still applies but may simplify: picking minimum incoming edges cannot create cycles, so you’re done immediately. If some nodes are unreachable from r, no directed MSA exists; consider adding edges, changing the root, or computing MSAs on reachable components only.
⚠️Common Mistakes
- Forgetting reachability: Edmonds assumes every vertex is reachable from the root. Always verify this first; otherwise, the algorithm will fail when some vertex has no incoming edge in the induced subproblem.\n- Mishandling self-loops and parallel edges: self-loops should never be chosen; keep only the minimum among multiple edges between the same pair when beneficial, but do not drop them blindly because direction matters.\n- Incorrect cycle detection: when walking predecessors to find cycles, it’s easy to mix visitation marks across iterations. Use per-iteration visitation IDs and ensure you stop at the root.\n- Wrong weight adjustments on contraction: the correct update is w'(x\to c)=w(x\to y)-w(e_{y}) for y in the cycle. Omitting the subtraction or subtracting the wrong quantity breaks optimality.\n- Losing original indices during contraction: if you need the actual arborescence edges, maintain mappings (edge IDs and component IDs) through contractions; otherwise, you may only get the cost.\n- Assuming MST properties: undirected MST cut/cycle rules don’t carry over directly. The directed case requires minimum incoming edges per vertex and cycle contraction.\n- Overflow with large weights: use 64-bit integers (long long) for sums and consider potential negative weights.\n- Using the wrong root: Edmonds is root-specific; changing the root generally changes the optimal arborescence and its cost.
Key Formulas
MSA Optimization Problem
Explanation: We minimize the total weight of selected edges subject to each non-root vertex having exactly one incoming edge and being reachable from the root. This defines the optimal directed spanning arborescence.
Minimum Incoming Edge Selection
Explanation: For each non-root vertex, choose the cheapest incoming edge. If these choices do not form a cycle, they already form the optimal arborescence.
Cycle Contraction Weight Adjustment
Explanation: When contracting a cycle C into super-vertex c, the weight of an edge entering the cycle is reduced by the chosen in-edge weight of its head. This normalization preserves optimality under contraction.
Cost via Selected In-Edges
Explanation: The total cost equals the sum of selected minimum incoming edges at each stage. After adjusting weights during contraction, further recursive costs add no extra terms because of the normalization.
Edge Count
Explanation: Any arborescence on n vertices contains exactly n−1 edges, since each non-root vertex has in-degree one.
Feasibility Condition
Explanation: A directed MSA exists only if every vertex is reachable from the root in the original graph. If not, no arborescence can span all vertices.
Naive Time Complexity
Explanation: The straightforward Edmonds implementation repeats minimum-incoming-edge scans and contractions up to V times, each pass scanning E edges, yielding O(VE) time.
Optimized Time Complexity
Explanation: Using meldable priority heaps with lazy propagation to manage incoming edges and merges during contractions reduces the overall work to roughly O(E log V) in practice/theory under standard heap bounds.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct Edge { 5 int u, v; // directed edge u -> v 6 long long w; // weight (can be negative) 7 }; 8 9 // Returns pair<exists, cost>. If no arborescence exists (unreachable node), returns {false, 0}. 10 pair<bool, long long> directed_mst_cost(int n, int root, const vector<Edge>& edges) { 11 const long long INF = (1LL<<62); 12 long long answer = 0; 13 14 int N = n; 15 vector<Edge> E = edges; // work on a copy; we'll contract and adjust weights 16 17 while (true) { 18 // 1) For each vertex, find minimum incoming edge 19 vector<long long> in(N, INF); 20 vector<int> pre(N, -1); // pre[v] = parent u of chosen in-edge (u -> v) 21 for (const auto &e : E) { 22 if (e.u != e.v && e.w < in[e.v]) { // ignore self-loops and keep smallest 23 in[e.v] = e.w; 24 pre[e.v] = e.u; 25 } 26 } 27 in[root] = 0; // root has no incoming edge in arborescence 28 // If some non-root has no incoming edge, no MSA exists 29 for (int v = 0; v < N; ++v) { 30 if (v == root) continue; 31 if (in[v] == INF) return {false, 0}; 32 } 33 34 // 2) Detect cycles among chosen edges 35 int cntCycle = 0; 36 vector<int> id(N, -1); // component id after contraction 37 vector<int> vis(N, -1); // visitation mark to find cycles 38 for (int v = 0; v < N; ++v) answer += in[v]; 39 for (int v = 0, curMark = 0; v < N; ++v) { 40 if (v == root) continue; 41 int u = v; 42 // Walk predecessors until we reach root, find an assigned id, or revisit in this walk 43 while (vis[u] != v && id[u] == -1 && u != root) { 44 vis[u] = v; 45 u = pre[u]; 46 } 47 if (u != root && id[u] == -1) { 48 // Found a cycle; assign an id to all nodes in the cycle 49 for (int x = pre[u]; x != u; x = pre[x]) id[x] = cntCycle; 50 id[u] = cntCycle++; 51 } 52 } 53 if (cntCycle == 0) break; // No cycle: chosen edges form optimal arborescence 54 55 // Assign new ids to non-cycle vertices 56 for (int v = 0; v < N; ++v) if (id[v] == -1) id[v] = cntCycle++; 57 58 // 3) Build contracted graph with adjusted weights 59 vector<Edge> newE; newE.reserve(E.size()); 60 for (const auto &e : E) { 61 int u = id[e.u]; 62 int v = id[e.v]; 63 if (u != v) { 64 long long w = e.w - in[e.v]; // subtract in-edge weight of original head 65 newE.push_back({u, v, w}); 66 } 67 } 68 root = id[root]; 69 N = cntCycle; 70 E.swap(newE); 71 } 72 73 return {true, answer}; 74 } 75 76 int main() { 77 ios::sync_with_stdio(false); 78 cin.tie(nullptr); 79 80 // Example usage: 81 // Input format: n m root, then m lines of (u v w) 82 int n, m, r; 83 if (!(cin >> n >> m >> r)) { 84 // Demo graph if no input is provided 85 n = 4; m = 5; r = 0; 86 vector<Edge> edges = { 87 {0,1,1}, {0,2,5}, {1,2,1}, {1,3,3}, {2,3,1} 88 }; 89 auto res = directed_mst_cost(n, r, edges); 90 if (!res.first) cout << "No arborescence exists\n"; 91 else cout << "MSA cost = " << res.second << "\n"; 92 return 0; 93 } 94 95 vector<Edge> edges; edges.reserve(m); 96 for (int i = 0; i < m; ++i) { 97 int u, v; long long w; cin >> u >> v >> w; 98 edges.push_back({u, v, w}); 99 } 100 101 // Optional: pre-check reachability from root on the original graph 102 vector<vector<int>> g(n); for (auto &e : edges) g[e.u].push_back(e.v); 103 vector<int> vis(n, 0); queue<int> q; q.push(r); vis[r] = 1; 104 while (!q.empty()) { int x = q.front(); q.pop(); for (int y : g[x]) if (!vis[y]) vis[y]=1,q.push(y); } 105 for (int v = 0; v < n; ++v) if (!vis[v]) { cout << "No arborescence exists\n"; return 0; } 106 107 auto res = directed_mst_cost(n, r, edges); 108 if (!res.first) cout << "No arborescence exists\n"; 109 else cout << res.second << "\n"; 110 return 0; 111 } 112
This is the classical O(VE) Chu–Liu/Edmonds algorithm that returns only the total cost. It repeatedly chooses minimum incoming edges for all vertices, detects cycles, contracts them, adjusts weights by subtracting the selected in-edge costs, and recurses on the contracted graph. If any non-root vertex has no incoming edge at some stage, no arborescence exists. The pre-check ensures all vertices are reachable from the root. The algorithm works with negative weights and parallel edges and ignores self-loops.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct Edge { int u, v; long long w; }; 5 6 pair<bool, long long> directed_mst_cost(int n, int root, const vector<Edge>& edges) { 7 const long long INF = (1LL<<62); 8 long long ans = 0; 9 int N = n; vector<Edge> E = edges; 10 while (true) { 11 vector<long long> in(N, INF); 12 vector<int> pre(N, -1), id(N, -1), vis(N, -1); 13 for (const auto &e : E) if (e.u != e.v && e.w < in[e.v]) in[e.v] = e.w, pre[e.v] = e.u; 14 in[root] = 0; 15 for (int i = 0; i < N; ++i) if (i != root && in[i] == INF) return {false, 0}; 16 for (int i = 0; i < N; ++i) ans += in[i]; 17 int cnt = 0; 18 for (int i = 0; i < N; ++i) if (i != root) { 19 int v = i; 20 while (vis[v] != i && id[v] == -1 && v != root) vis[v] = i, v = pre[v]; 21 if (v != root && id[v] == -1) { for (int u = pre[v]; u != v; u = pre[u]) id[u] = cnt; id[v] = cnt++; } 22 } 23 if (cnt == 0) break; 24 for (int i = 0; i < N; ++i) if (id[i] == -1) id[i] = cnt++; 25 vector<Edge> nE; nE.reserve(E.size()); 26 for (auto &e : E) { 27 int u = id[e.u], v = id[e.v]; 28 if (u != v) nE.push_back({u, v, e.w - in[e.v]}); 29 } 30 root = id[root]; 31 N = cnt; E.swap(nE); 32 } 33 return {true, ans}; 34 } 35 36 int main(){ 37 ios::sync_with_stdio(false); 38 cin.tie(nullptr); 39 40 int n, m, r; cin >> n >> m >> r; 41 vector<Edge> edges(m); 42 for (int i = 0; i < m; ++i) cin >> edges[i].u >> edges[i].v >> edges[i].w; 43 44 // Reachability pre-check on original graph 45 vector<vector<int>> g(n); for (auto &e : edges) g[e.u].push_back(e.v); 46 vector<int> seen(n); queue<int> q; q.push(r); seen[r] = 1; 47 while (!q.empty()) { int x = q.front(); q.pop(); for (int y : g[x]) if (!seen[y]) seen[y]=1, q.push(y); } 48 for (int v = 0; v < n; ++v) if (!seen[v]) { cout << -1 << "\n"; return 0; } 49 50 auto res = directed_mst_cost(n, r, edges); 51 cout << (res.first ? res.second : -1) << "\n"; 52 return 0; 53 } 54
This wrapper pairs the Edmonds core with input parsing and a reachability pre-check. It prints -1 when no arborescence exists or the minimum cost otherwise. Keeping a clean separation between reachability and the main algorithm helps avoid subtle bugs and makes the code contest-ready.