⚙️AlgorithmAdvanced

Directed MST (Edmonds/Chu-Liu Algorithm)

Key Points

  • A directed minimum spanning arborescence (MSA) is a minimum-cost set of edges that makes every vertex reachable from a chosen root with exactly one incoming edge per non-root vertex.
  • Edmonds/Chu–Liu’s algorithm solves the MSA by picking the cheapest incoming edge for each vertex, contracting any cycles that appear, adjusting edge weights, and recursing on the contracted graph.
  • If no cycle forms after picking minimum incoming edges, those edges already form the optimal arborescence.
  • Cycle contraction preserves optimality by subtracting the chosen in-edge costs from incoming edges to the cycle, ensuring weights remain nonnegative and enabling recursion.
  • The naive implementation runs in O(VE) time; with advanced data structures (e.g., skew/Fibonacci heaps and lazy propagation) it can be improved to roughly O(E log V).
  • You must check that every vertex is reachable from the root in the original graph; otherwise, a directed MSA does not exist.
  • Edmonds’ algorithm works with negative edge weights and parallel edges, but you must be careful with self-loops and weight adjustments during contraction.
  • Applications include optimal broadcast trees, dependency resolution under direction constraints, and network design with one-way links.

Prerequisites

  • Directed graphs and reachabilityUnderstanding directed edges, paths, and reachability is essential to define an arborescence and to pre-check feasibility.
  • Cycle detectionEdmonds requires detecting and contracting directed cycles formed by chosen in-edges.
  • Disjoint Set Union (DSU)DSU helps conceptualize/implement cycle contraction and component representation, especially in optimized versions.
  • Priority queues / heapsOptimized implementations keep per-vertex heaps of incoming edges for faster minimum selection and merges.
  • Asymptotic analysisTo analyze O(VE) vs O(E log V) complexity and choose an appropriate approach for given constraints.
  • Minimum Spanning Tree (undirected)Contrasting with MST clarifies why directed MSAs require different rules (in-edges and cycle contraction).
  • C++ vectors, structs, and I/OTo implement the algorithm safely with dynamic arrays and strong typing.

Detailed Explanation

Tap terms for definitions

01Overview

A directed minimum spanning arborescence (MSA), also called a minimum spanning branching, generalizes the idea of an MST to directed graphs. Given a directed graph with edge weights and a chosen root r, we want a set of directed edges so that every vertex (except r) has exactly one incoming edge, and all vertices are reachable from r. Among all such structures, we choose one with minimum total weight. Edmonds/Chu–Liu’s algorithm is the classic solution: it repeatedly selects the minimum incoming edge for each vertex, detects cycles created by these selections, and contracts each cycle into a single super-vertex. Before recursing, it adjusts edge weights to preserve optimality by subtracting the cycle’s chosen in-edge costs from every edge entering the cycle. This process continues until there are no cycles, at which point the chosen edges form the optimal arborescence. The algorithm is robust to negative weights, allows multiple edges, and only requires that every vertex be reachable from the root for a solution to exist. The straightforward implementation runs in O(VE) time, sufficient for many competitive programming tasks; with more advanced data structures (priority/skew/Fibonacci heaps and careful lazy propagation), it can be brought closer to O(E log V).

02Intuition & Analogies

Think of building a company-wide broadcast plan from the CEO (root) to every employee. For each employee, you’d like to pick the cheapest manager who can send them a message (the minimum incoming edge). If everyone picks their cheapest manager and no circular reporting chain appears, you’re done: each employee has exactly one manager, and the CEO can reach everyone through these reporting lines. But if a circular chain appears (Alice reports to Bob, Bob to Carol, Carol back to Alice), we have a problem: cycles can’t exist in a tree. To resolve this fairly, we temporarily treat the cycle as a single “super-employee.” We also account for the fact that everyone in the cycle already got some benefit (their cheapest manager), so any outside offer to the cycle should be compared after subtracting those benefits—this is the weight adjustment. Intuitively, you’re saying: the cycle is internally satisfied at a certain base cost; any external link must beat that base, so we normalize incoming offers to the cycle by removing the internal credits. Now, with the cycle replaced by one super-employee and normalized incoming offers, we repeat the process. This shrinking-and-normalizing continues until no cycles remain. At that point, everyone has a unique manager chain leading to the CEO, and the sum of the chosen offers is minimal. When we expand the cycles back, we keep the internal cheapest links except at the one point where an external offer enters, breaking the cycle and maintaining a directed tree structure.

03Formal Definition

Let G=(V,E) be a directed graph with weight function w: E and a chosen root r V. A spanning arborescence rooted at r is a set of edges A E such that: (1) for every v V \{r\}, the in-degree in A is exactly 1; (2) r has in-degree 0; and (3) every vertex is reachable from r via edges in A. The cost of A is w(u,v). The minimum spanning arborescence (MSA) is an arborescence that minimizes this sum. Edmonds/Chu–Liu’s algorithm proceeds as follows: (i) for each v r, choose an edge =(p(v),v) of minimum weight entering v; (ii) if the chosen edges contain no directed cycle, then A=\{ v r\} is optimal; (iii) otherwise, for each directed cycle C, contract C into a single super-vertex c, and for every edge (x,y) with y C and x C, replace it with (x,c) and set its new weight to w'(x,c)=w(x,y)-w(); (iv) recurse on the contracted graph; (v) expand cycles and reconstruct A by keeping all inside cycles except for the unique vertex where the external entering edge attaches, which breaks the cycle. A necessary condition for existence is that every vertex is reachable from r in G.

04When to Use

Use Edmonds/Chu–Liu when you need a minimum-cost directed tree rooted at a specific node. Typical scenarios: (1) networking/broadcast design where links are directional (e.g., asymmetric wireless links), and you want the cheapest way to reach all nodes from a central router; (2) dependency resolution where edges indicate allowable build or data-flow directions, and you want a minimal set of dependencies reaching all components from a root; (3) assigning supervisors or gateways in hierarchical systems with directional constraints; (4) robotics or control-flow graphs where transitions are directed and you require a minimal incoming control edge per node; (5) compiler backends and IR optimization contexts where directed structures with root reachability are desired. Prefer Edmonds over undirected MST algorithms (Kruskal/Prim) whenever edges have direction and each non-root must have in-degree exactly one. If the graph is a DAG, Edmonds still applies but may simplify: picking minimum incoming edges cannot create cycles, so you’re done immediately. If some nodes are unreachable from r, no directed MSA exists; consider adding edges, changing the root, or computing MSAs on reachable components only.

⚠️Common Mistakes

  • Forgetting reachability: Edmonds assumes every vertex is reachable from the root. Always verify this first; otherwise, the algorithm will fail when some vertex has no incoming edge in the induced subproblem.\n- Mishandling self-loops and parallel edges: self-loops should never be chosen; keep only the minimum among multiple edges between the same pair when beneficial, but do not drop them blindly because direction matters.\n- Incorrect cycle detection: when walking predecessors to find cycles, it’s easy to mix visitation marks across iterations. Use per-iteration visitation IDs and ensure you stop at the root.\n- Wrong weight adjustments on contraction: the correct update is w'(x\to c)=w(x\to y)-w(e_{y}) for y in the cycle. Omitting the subtraction or subtracting the wrong quantity breaks optimality.\n- Losing original indices during contraction: if you need the actual arborescence edges, maintain mappings (edge IDs and component IDs) through contractions; otherwise, you may only get the cost.\n- Assuming MST properties: undirected MST cut/cycle rules don’t carry over directly. The directed case requires minimum incoming edges per vertex and cycle contraction.\n- Overflow with large weights: use 64-bit integers (long long) for sums and consider potential negative weights.\n- Using the wrong root: Edmonds is root-specific; changing the root generally changes the optimal arborescence and its cost.

Key Formulas

MSA Optimization Problem

Explanation: We minimize the total weight of selected edges subject to each non-root vertex having exactly one incoming edge and being reachable from the root. This defines the optimal directed spanning arborescence.

Minimum Incoming Edge Selection

Explanation: For each non-root vertex, choose the cheapest incoming edge. If these choices do not form a cycle, they already form the optimal arborescence.

Cycle Contraction Weight Adjustment

Explanation: When contracting a cycle C into super-vertex c, the weight of an edge entering the cycle is reduced by the chosen in-edge weight of its head. This normalization preserves optimality under contraction.

Cost via Selected In-Edges

Explanation: The total cost equals the sum of selected minimum incoming edges at each stage. After adjusting weights during contraction, further recursive costs add no extra terms because of the normalization.

Edge Count

Explanation: Any arborescence on n vertices contains exactly n−1 edges, since each non-root vertex has in-degree one.

Feasibility Condition

Explanation: A directed MSA exists only if every vertex is reachable from the root in the original graph. If not, no arborescence can span all vertices.

Naive Time Complexity

Explanation: The straightforward Edmonds implementation repeats minimum-incoming-edge scans and contractions up to V times, each pass scanning E edges, yielding O(VE) time.

Optimized Time Complexity

Explanation: Using meldable priority heaps with lazy propagation to manage incoming edges and merges during contractions reduces the overall work to roughly O(E log V) in practice/theory under standard heap bounds.

Complexity Analysis

Let V= and E=. The naive Chu–Liu/Edmonds implementation executes in phases. In each phase, it scans all edges to compute the minimum incoming edge for each vertex in O(E) time, then checks for cycles and possibly performs a contraction. In the worst case, it may perform up to O(V) contractions (or iterations), leading to O(VE) time overall. Space usage is O(V+E) to store edges and per-vertex arrays (predecessors, visitation marks, component IDs). This version is simple and robust, and it handles negative weights and parallel edges. For dense graphs (E≈), O(VE)=O() may be heavy; for sparse graphs it is often acceptable in competitive programming. Optimized implementations maintain, for each (super-)vertex, a meldable heap (e.g., skew heap, leftist heap, or Fibonacci heap) of its incoming edges. Selecting the minimum incoming edge becomes a heap top in O(1) amortized and removals/merges take O(log V) amortized. When contracting a cycle, we merge the heaps of member vertices and apply lazy tags to subtract in-edge costs from all incoming edges in O(1) amortized per merge. Each edge enters and leaves heaps a constant number of times across the entire run, so the total time is roughly O(E log V). Space remains O(V+E), plus heap node overhead. Despite the improved asymptotics, the code is significantly more complex and error-prone; unless V and E are large, the O(VE) version is often preferable under contest time constraints. Always pre-check reachability from the root to avoid futile work and early detect infeasibility.

Code Examples

Naive Edmonds/Chu–Liu Algorithm: Compute Total Cost of Directed MSA
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Edge {
5 int u, v; // directed edge u -> v
6 long long w; // weight (can be negative)
7};
8
9// Returns pair<exists, cost>. If no arborescence exists (unreachable node), returns {false, 0}.
10pair<bool, long long> directed_mst_cost(int n, int root, const vector<Edge>& edges) {
11 const long long INF = (1LL<<62);
12 long long answer = 0;
13
14 int N = n;
15 vector<Edge> E = edges; // work on a copy; we'll contract and adjust weights
16
17 while (true) {
18 // 1) For each vertex, find minimum incoming edge
19 vector<long long> in(N, INF);
20 vector<int> pre(N, -1); // pre[v] = parent u of chosen in-edge (u -> v)
21 for (const auto &e : E) {
22 if (e.u != e.v && e.w < in[e.v]) { // ignore self-loops and keep smallest
23 in[e.v] = e.w;
24 pre[e.v] = e.u;
25 }
26 }
27 in[root] = 0; // root has no incoming edge in arborescence
28 // If some non-root has no incoming edge, no MSA exists
29 for (int v = 0; v < N; ++v) {
30 if (v == root) continue;
31 if (in[v] == INF) return {false, 0};
32 }
33
34 // 2) Detect cycles among chosen edges
35 int cntCycle = 0;
36 vector<int> id(N, -1); // component id after contraction
37 vector<int> vis(N, -1); // visitation mark to find cycles
38 for (int v = 0; v < N; ++v) answer += in[v];
39 for (int v = 0, curMark = 0; v < N; ++v) {
40 if (v == root) continue;
41 int u = v;
42 // Walk predecessors until we reach root, find an assigned id, or revisit in this walk
43 while (vis[u] != v && id[u] == -1 && u != root) {
44 vis[u] = v;
45 u = pre[u];
46 }
47 if (u != root && id[u] == -1) {
48 // Found a cycle; assign an id to all nodes in the cycle
49 for (int x = pre[u]; x != u; x = pre[x]) id[x] = cntCycle;
50 id[u] = cntCycle++;
51 }
52 }
53 if (cntCycle == 0) break; // No cycle: chosen edges form optimal arborescence
54
55 // Assign new ids to non-cycle vertices
56 for (int v = 0; v < N; ++v) if (id[v] == -1) id[v] = cntCycle++;
57
58 // 3) Build contracted graph with adjusted weights
59 vector<Edge> newE; newE.reserve(E.size());
60 for (const auto &e : E) {
61 int u = id[e.u];
62 int v = id[e.v];
63 if (u != v) {
64 long long w = e.w - in[e.v]; // subtract in-edge weight of original head
65 newE.push_back({u, v, w});
66 }
67 }
68 root = id[root];
69 N = cntCycle;
70 E.swap(newE);
71 }
72
73 return {true, answer};
74}
75
76int main() {
77 ios::sync_with_stdio(false);
78 cin.tie(nullptr);
79
80 // Example usage:
81 // Input format: n m root, then m lines of (u v w)
82 int n, m, r;
83 if (!(cin >> n >> m >> r)) {
84 // Demo graph if no input is provided
85 n = 4; m = 5; r = 0;
86 vector<Edge> edges = {
87 {0,1,1}, {0,2,5}, {1,2,1}, {1,3,3}, {2,3,1}
88 };
89 auto res = directed_mst_cost(n, r, edges);
90 if (!res.first) cout << "No arborescence exists\n";
91 else cout << "MSA cost = " << res.second << "\n";
92 return 0;
93 }
94
95 vector<Edge> edges; edges.reserve(m);
96 for (int i = 0; i < m; ++i) {
97 int u, v; long long w; cin >> u >> v >> w;
98 edges.push_back({u, v, w});
99 }
100
101 // Optional: pre-check reachability from root on the original graph
102 vector<vector<int>> g(n); for (auto &e : edges) g[e.u].push_back(e.v);
103 vector<int> vis(n, 0); queue<int> q; q.push(r); vis[r] = 1;
104 while (!q.empty()) { int x = q.front(); q.pop(); for (int y : g[x]) if (!vis[y]) vis[y]=1,q.push(y); }
105 for (int v = 0; v < n; ++v) if (!vis[v]) { cout << "No arborescence exists\n"; return 0; }
106
107 auto res = directed_mst_cost(n, r, edges);
108 if (!res.first) cout << "No arborescence exists\n";
109 else cout << res.second << "\n";
110 return 0;
111}
112

This is the classical O(VE) Chu–Liu/Edmonds algorithm that returns only the total cost. It repeatedly chooses minimum incoming edges for all vertices, detects cycles, contracts them, adjusts weights by subtracting the selected in-edge costs, and recurses on the contracted graph. If any non-root vertex has no incoming edge at some stage, no arborescence exists. The pre-check ensures all vertices are reachable from the root. The algorithm works with negative weights and parallel edges and ignores self-loops.

Time: O(VE) in the worst case.Space: O(V+E).
Reusable Wrapper with Reachability Check and Robust I/O
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Edge { int u, v; long long w; };
5
6pair<bool, long long> directed_mst_cost(int n, int root, const vector<Edge>& edges) {
7 const long long INF = (1LL<<62);
8 long long ans = 0;
9 int N = n; vector<Edge> E = edges;
10 while (true) {
11 vector<long long> in(N, INF);
12 vector<int> pre(N, -1), id(N, -1), vis(N, -1);
13 for (const auto &e : E) if (e.u != e.v && e.w < in[e.v]) in[e.v] = e.w, pre[e.v] = e.u;
14 in[root] = 0;
15 for (int i = 0; i < N; ++i) if (i != root && in[i] == INF) return {false, 0};
16 for (int i = 0; i < N; ++i) ans += in[i];
17 int cnt = 0;
18 for (int i = 0; i < N; ++i) if (i != root) {
19 int v = i;
20 while (vis[v] != i && id[v] == -1 && v != root) vis[v] = i, v = pre[v];
21 if (v != root && id[v] == -1) { for (int u = pre[v]; u != v; u = pre[u]) id[u] = cnt; id[v] = cnt++; }
22 }
23 if (cnt == 0) break;
24 for (int i = 0; i < N; ++i) if (id[i] == -1) id[i] = cnt++;
25 vector<Edge> nE; nE.reserve(E.size());
26 for (auto &e : E) {
27 int u = id[e.u], v = id[e.v];
28 if (u != v) nE.push_back({u, v, e.w - in[e.v]});
29 }
30 root = id[root];
31 N = cnt; E.swap(nE);
32 }
33 return {true, ans};
34}
35
36int main(){
37 ios::sync_with_stdio(false);
38 cin.tie(nullptr);
39
40 int n, m, r; cin >> n >> m >> r;
41 vector<Edge> edges(m);
42 for (int i = 0; i < m; ++i) cin >> edges[i].u >> edges[i].v >> edges[i].w;
43
44 // Reachability pre-check on original graph
45 vector<vector<int>> g(n); for (auto &e : edges) g[e.u].push_back(e.v);
46 vector<int> seen(n); queue<int> q; q.push(r); seen[r] = 1;
47 while (!q.empty()) { int x = q.front(); q.pop(); for (int y : g[x]) if (!seen[y]) seen[y]=1, q.push(y); }
48 for (int v = 0; v < n; ++v) if (!seen[v]) { cout << -1 << "\n"; return 0; }
49
50 auto res = directed_mst_cost(n, r, edges);
51 cout << (res.first ? res.second : -1) << "\n";
52 return 0;
53}
54

This wrapper pairs the Edmonds core with input parsing and a reachability pre-check. It prints -1 when no arborescence exists or the minimum cost otherwise. Keeping a clean separation between reachability and the main algorithm helps avoid subtle bugs and makes the code contest-ready.

Time: O(VE) for the MSA; O(V+E) for the reachability BFS.Space: O(V+E).