MST Properties and Applications
Key Points
- •An MST minimizes total edge weight over all spanning trees and has powerful properties such as the cut and cycle properties that guide correct, greedy construction.
- •If all edge weights are distinct, the MST is unique; otherwise, ties can yield multiple MSTs of equal total weight.
- •The minimax (bottleneck) path between two nodes equals the unique path between them in the MST, minimizing the maximum edge used.
- •The cycle property says the heaviest edge in any cycle cannot belong to an MST, while the cut property says the lightest edge crossing any cut is safe to include.
- •You can find the second-best MST by trying to replace one edge on the MST path with a non-tree edge and picking the minimum strictly larger total weight.
- •Kruskal’s algorithm with DSU is a standard way to build an MST in O(m m) time; augmenting it with LCA enables fast bottleneck path queries.
- •Uniqueness of MST can be checked by seeing if any non-tree edge ties the maximum edge on the MST path between its endpoints.
- •In directed graphs, the analogous structure is a minimum spanning arborescence rooted at r, which is harder and solved by Chu–Liu/Edmonds’ algorithm.
Prerequisites
- →Graphs and basic terminology — Understanding vertices, edges, connectivity, and cycles is essential for MST definitions.
- →Sorting and comparison-based complexity — Kruskal’s algorithm relies on sorting edges by weight.
- →Disjoint Set Union (Union-Find) — Efficiently detects cycles and maintains components in Kruskal’s algorithm.
- →Trees and tree traversals — MST is a tree; traversal is needed for preprocessing and queries.
- →Binary lifting and LCA — Enables O(log n) bottleneck path queries and supports second-best MST computations.
- →Greedy algorithms and exchange arguments — MST correctness relies on cut and cycle properties proven via greedy-exchange arguments.
- →Asymptotic notation — To analyze and compare the running times like O(m \log m) and O(n \log n).
- →Directed graphs and cycle contraction — Required to grasp Chu–Liu/Edmonds for minimum spanning arborescence.
Detailed Explanation
Tap terms for definitions01Overview
A Minimum Spanning Tree (MST) of a connected, undirected, weighted graph is a subset of edges that connects all vertices with no cycles and minimum possible total edge weight. MSTs are foundational in network design, clustering, and approximation algorithms. Two core structural facts make MSTs highly tractable: the cut property (the lightest edge crossing any cut is safe to take) and the cycle property (the heaviest edge on any cycle is not part of any MST). These properties enable efficient greedy algorithms like Kruskal’s and Prim’s to find MSTs in polynomial time. MSTs also encode bottleneck-optimal paths: the path between any two vertices in the MST minimizes the maximum edge weight used among all possible paths in the original graph. This minimax property supports applications in robust routing and single-linkage clustering. When weights tie, multiple distinct MSTs can exist with the same total weight; checking uniqueness requires additional reasoning. Closely related is the second-best MST problem: find the lightest spanning tree with strictly larger weight than the MST, useful in redundancy analysis. In directed graphs, the analogous structure is a minimum spanning arborescence (rooted in-branching), a harder problem typically solved via the Chu–Liu/Edmonds algorithm.
02Intuition & Analogies
Imagine building a road network to connect cities as cheaply as possible without creating any unnecessary loops. If you stand at the border between already-connected cities and not-yet-connected cities (a cut), it’s always sensible to pick the cheapest road that crosses that border; paying more for that crossing never helps. That’s the cut property, which justifies greedy choices. Now picture a closed loop of roads (a cycle). If one road in that loop is clearly the most expensive, then keeping it is wasteful—there’s always another way around the loop without paying for that costly road. That’s the cycle property. Together, they guide us to a structure that connects everything with no waste. For paths between two cities, suppose you hate getting stuck on a terrible road segment. Among all possible routes, the one that keeps the worst segment as mild as possible is your goal. Surprisingly, if you build the cheap, loop-free road network (the MST), the unique route between any two cities in that network already achieves that goal: it minimizes the worst segment along the way. That’s the minimax (bottleneck) path property. If prices are all distinct, every decision is forced: there’s only one MST. But if some roads tie in price, different combinations might tie for best total cost. To find an alternative just slightly more expensive (for backup planning), try swapping one MST edge with a non-MST edge; the best such swap gives the second-best MST. For one-way streets (directed graphs), connecting everything to a specific root with minimal incoming costs (an arborescence) is trickier and needs a specialized algorithm (Chu–Liu/Edmonds).
03Formal Definition
04When to Use
Use MSTs to design least-cost infrastructures (networks of roads, fiber, pipelines) where cycles add no value and costs must be minimized. They are ideal in clustering (single-linkage): building the MST then cutting the k−1 largest edges yields k clusters. Apply the minimax (bottleneck) path property when the objective is to minimize the worst edge along a path, such as ensuring high reliability or bandwidth guarantees; preprocess the MST for fast bottleneck queries via LCA. In competitive programming, MSTs appear in problems with phrases like “connect all nodes with minimum total cost,” “minimize the maximum edge on the path,” or “find the second-best spanning tree.” Use Kruskal’s algorithm when the graph is sparse-to-medium and sorting edges is straightforward; use Prim’s with a heap for dense graphs. For checking MST uniqueness or building alternatives, analyze non-tree edges against the maximum edge on the MST path. When the graph is directed and you need a rooted structure with minimal incoming costs, switch to the minimum spanning arborescence problem (Chu–Liu/Edmonds), common in tasks like dependency parsing or directed network design.
⚠️Common Mistakes
- Assuming MSTs solve shortest path problems. MSTs minimize total (or bottleneck) over a tree, not the individual shortest path distances; use Dijkstra/Floyd–Warshall for shortest paths.
- Ignoring connectivity. If the graph is disconnected, Kruskal/Prim returns a forest; always check that you have n−1 edges in the result for an MST.
- Using 32-bit integers for weights. Edge sums can overflow; prefer 64-bit (long long) for total weight and edge weights when needed.
- Forgetting to handle ties. With equal weights, multiple MSTs may exist. To test uniqueness, check if any non-tree edge ties the maximum edge on the MST path.
- Mis-implementing LCA for bottleneck queries. You must lift both nodes to the same depth and maintain the maximum edge seen along the ascent; off-by-one in binary lifting tables is common.
- Replacing the wrong edge for second-best MST. Always remove the maximum-weight edge on the MST path between the endpoints of the inserted non-tree edge.
- Mixing 0-index and 1-index nodes. Be consistent throughout sorting, DSU, and adjacency building.
- Applying undirected MST algorithms to directed graphs. For directed cases, you need Chu–Liu/Edmonds’ algorithm; Kruskal/Prim do not apply.
Key Formulas
Tree Weight
Explanation: The total weight of a spanning tree T is the sum of its edge weights. MSTs minimize this sum over all spanning trees.
Cut Property
Explanation: If an edge is a lightest edge crossing some cut, there exists an MST that includes it. This justifies greedy choices in Kruskal/Prim.
Cycle Property
Explanation: The heaviest edge on a cycle cannot be in any MST, since removing it reduces the cycle’s cost without disconnecting the graph.
Minimax (Bottleneck) Path Property
Explanation: Among all u–v paths, the one that minimizes the worst edge weight is exactly the path between u and v in any MST.
Uniqueness Criterion
Explanation: With distinct weights, every greedy choice is forced by the cut and cycle properties, yielding a unique MST.
Second-best MST Weight
Explanation: Try adding each non-tree edge and removing the heaviest edge on the created cycle; the smallest strictly larger total weight is the second-best MST.
Kruskal Complexity
Explanation: Sorting edges dominates the time; DSU operations are near-constant. This is typical for sparse-to-medium graphs.
Prim (Binary Heap) Complexity
Explanation: Using a binary heap for the frontier edges yields this bound, often preferable on denser graphs.
Binary Lifting Height
Explanation: Binary lifting tables have O(log n) levels, enabling O(log n) query time for LCA and path-aggregate computations.
Chu–Liu/Edmonds Sketch
Explanation: The algorithm picks minimum incoming edges to each node, then contracts cycles and repeats, accumulating cost with corrections.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct DSU { 5 int n; vector<int> p, r; 6 DSU(int n=0): n(n), p(n), r(n,0) { iota(p.begin(), p.end(), 0); } 7 int find(int a){ return p[a]==a? a : p[a]=find(p[a]); } 8 bool unite(int a,int b){ a=find(a); b=find(b); if(a==b) return false; if(r[a]<r[b]) swap(a,b); p[b]=a; if(r[a]==r[b]) r[a]++; return true; } 9 }; 10 11 struct Edge { int u,v; long long w; }; 12 13 int main(){ 14 ios::sync_with_stdio(false); cin.tie(nullptr); 15 int n, m, q; 16 // Input format: n m q, then m edges (u v w) 1-indexed, then q queries (u v) 17 if(!(cin >> n >> m >> q)) return 0; 18 vector<Edge> edges(m); 19 for(int i=0;i<m;++i){ 20 int u,v; long long w; cin >> u >> v >> w; --u; --v; edges[i]={u,v,w}; 21 } 22 23 // 1) Build MST using Kruskal (may produce a forest if graph is disconnected) 24 vector<Edge> sorted = edges; 25 sort(sorted.begin(), sorted.end(), [](const Edge& a, const Edge& b){ return a.w < b.w; }); 26 27 DSU dsu(n); 28 long long mstWeight = 0; 29 vector<vector<pair<int,long long>>> adj(n); // MST adjacency: (to, weight) 30 int usedEdges = 0; 31 for(const auto &e : sorted){ 32 if(dsu.unite(e.u, e.v)){ 33 mstWeight += e.w; 34 adj[e.u].push_back({e.v, e.w}); 35 adj[e.v].push_back({e.u, e.w}); 36 usedEdges++; 37 } 38 } 39 40 // 2) Preprocess LCA with max-edge along up-jumps for minimax queries 41 int LOG = 1; while((1<<LOG) <= n) LOG++; 42 vector<int> depth(n, -1); 43 vector<vector<int>> up(LOG, vector<int>(n, -1)); 44 vector<vector<long long>> mx(LOG, vector<long long>(n, 0)); // mx[k][v] = max edge on 2^k jump from v upward 45 vector<int> comp(n, -1); 46 47 // DFS/BFS from each component root to set depth, up[0], mx[0], and component id 48 for(int s=0, cid=0; s<n; ++s){ 49 if(depth[s]!=-1) continue; 50 queue<int> qu; qu.push(s); depth[s]=0; up[0][s] = -1; mx[0][s]=0; comp[s]=cid; 51 while(!qu.empty()){ 52 int u = qu.front(); qu.pop(); 53 for(auto [v,w] : adj[u]){ 54 if(depth[v]==-1){ 55 depth[v] = depth[u]+1; 56 up[0][v] = u; 57 mx[0][v] = w; 58 comp[v] = cid; 59 qu.push(v); 60 } 61 } 62 } 63 cid++; 64 } 65 66 for(int k=1;k<LOG;++k){ 67 for(int v=0; v<n; ++v){ 68 int mid = up[k-1][v]; 69 if(mid==-1){ up[k][v] = -1; mx[k][v] = mx[k-1][v]; } 70 else{ 71 up[k][v] = up[k-1][mid]; 72 mx[k][v] = max(mx[k-1][v], mx[k-1][mid]); 73 } 74 } 75 } 76 77 auto maxOnPath = [&](int a, int b){ 78 if(comp[a] != comp[b]) return -1LL; // no path in MST (disconnected components) 79 long long ans = 0; 80 if(depth[a] < depth[b]) swap(a,b); 81 int diff = depth[a]-depth[b]; 82 for(int k=LOG-1;k>=0;--k){ 83 if(diff & (1<<k)){ 84 ans = max(ans, mx[k][a]); 85 a = up[k][a]; 86 } 87 } 88 if(a==b) return ans; 89 for(int k=LOG-1;k>=0;--k){ 90 if(up[k][a] != up[k][b]){ 91 ans = max(ans, mx[k][a]); 92 ans = max(ans, mx[k][b]); 93 a = up[k][a]; 94 b = up[k][b]; 95 } 96 } 97 // last step to LCA 98 ans = max(ans, mx[0][a]); 99 ans = max(ans, mx[0][b]); 100 return ans; 101 }; 102 103 // Output MST total (for connected graphs only) and answer bottleneck queries 104 if(usedEdges == n-1) { 105 cout << "MST total weight: " << mstWeight << "\n"; 106 } else { 107 cout << "Graph is disconnected; built a minimum spanning forest.\n"; 108 } 109 110 while(q--){ 111 int u,v; cin >> u >> v; --u; --v; 112 long long res = maxOnPath(u,v); 113 if(res<0) cout << -1 << "\n"; // no path 114 else cout << res << "\n"; // minimax (bottleneck) value between u and v 115 } 116 return 0; 117 } 118
We build an MST (or spanning forest) using Kruskal and DSU. Then we preprocess binary lifting tables up[k][v] and mx[k][v], where mx stores the maximum edge on the 2^k jump upwards in the MST. For any query (u, v), we lift the deeper node up to the same depth while tracking the maximum weights, then lift both synchronously to their LCA, aggregating maxima. The resulting maximum is the bottleneck value on the MST path, which equals the minimax path value in the original graph by the MST minimax property.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct DSU{ int n; vector<int> p,r; DSU(int n=0):n(n),p(n),r(n,0){ iota(p.begin(),p.end(),0);} int find(int a){return p[a]==a?a:p[a]=find(p[a]);} bool unite(int a,int b){a=find(a);b=find(b); if(a==b) return false; if(r[a]<r[b]) swap(a,b); p[b]=a; if(r[a]==r[b]) r[a]++; return true;} }; 5 struct Edge{int u,v; long long w;}; 6 7 int main(){ 8 ios::sync_with_stdio(false); cin.tie(nullptr); 9 int n,m; if(!(cin>>n>>m)) return 0; 10 vector<Edge> edges(m); 11 for(int i=0;i<m;++i){int u,v; long long w; cin>>u>>v>>w; --u;--v; edges[i]={u,v,w};} 12 13 // 1) Build MST with Kruskal 14 vector<Edge> sorted=edges; sort(sorted.begin(), sorted.end(), [](auto &a, auto &b){return a.w<b.w;}); 15 DSU dsu(n); 16 long long mstW=0; int used=0; 17 vector<vector<pair<int,long long>>> adj(n); 18 vector<char> inMST(m,0); // mark edges by index? we'll reconstruct by set later if needed 19 // To know which edges are in MST, we can store a map from (u,v,w,idx) if needed; simpler: rebuild by checking connectivity, but we'll not need per-edge marks for this approach 20 21 for(const auto &e: sorted){ 22 if(dsu.unite(e.u,e.v)){ 23 mstW += e.w; used++; 24 adj[e.u].push_back({e.v,e.w}); 25 adj[e.v].push_back({e.u,e.w}); 26 } 27 } 28 29 if(used != n-1){ 30 cout << "Graph disconnected; no MST.\n"; 31 return 0; 32 } 33 cout << "MST total weight: " << mstW << "\n"; 34 35 // 2) LCA preprocess for path max query 36 int LOG=1; while((1<<LOG)<=n) LOG++; 37 vector<int> depth(n,-1), comp(n,-1); 38 vector<vector<int>> up(LOG, vector<int>(n,-1)); 39 vector<vector<long long>> mx(LOG, vector<long long>(n,0)); 40 41 // Single component, but do BFS from node 0 42 queue<int> q; q.push(0); depth[0]=0; up[0][0]=-1; mx[0][0]=0; comp[0]=0; 43 while(!q.empty()){ 44 int u=q.front(); q.pop(); 45 for(auto [v,w]: adj[u]) if(depth[v]==-1){ depth[v]=depth[u]+1; up[0][v]=u; mx[0][v]=w; comp[v]=0; q.push(v);} } 46 47 for(int k=1;k<LOG;++k){ 48 for(int v=0; v<n; ++v){ 49 int mid=up[k-1][v]; 50 if(mid==-1){ up[k][v]=-1; mx[k][v]=mx[k-1][v]; } 51 else{ up[k][v]=up[k-1][mid]; mx[k][v]=max(mx[k-1][v], mx[k-1][mid]); } 52 } 53 } 54 55 auto maxOnPath = [&](int a,int b){ 56 long long ans=0; 57 if(depth[a]<depth[b]) swap(a,b); 58 int diff=depth[a]-depth[b]; 59 for(int k=LOG-1;k>=0;--k){ if(diff&(1<<k)){ ans=max(ans,mx[k][a]); a=up[k][a]; } } 60 if(a==b) return ans; 61 for(int k=LOG-1;k>=0;--k){ if(up[k][a]!=up[k][b]){ ans=max(ans,mx[k][a]); ans=max(ans,mx[k][b]); a=up[k][a]; b=up[k][b]; } } 62 ans=max(ans,mx[0][a]); ans=max(ans,mx[0][b]); 63 return ans; 64 }; 65 66 // 3) Compute second-best MST and test uniqueness 67 long long second = LLONG_MAX; 68 bool uniqueMST = true; // will become false if tie replacement exists 69 70 // For efficiency, iterate all edges; for each non-tree edge (u,v,w), replacement weight is mstW - maxOnPath(u,v) + w 71 // But we must skip tree edges. We can detect tree edges by degree-based matching; simpler: use an unordered_set of encoded MST edges. 72 // Build a set of MST edges as undirected pairs with weight to distinguish duplicates. 73 struct Key{int a,b; long long w;}; 74 struct KeyHash{ size_t operator()(Key const&k) const noexcept{ return ((uint64_t)k.a<<32) ^ ((uint64_t)k.b<<1) ^ std::hash<long long>()(k.w); } }; 75 struct KeyEq{ bool operator()(Key const&x, Key const&y) const noexcept{ return x.a==y.a && x.b==y.b && x.w==y.w; } }; 76 77 unordered_set<Key,KeyHash,KeyEq> mstSet; mstSet.reserve(n*2); 78 for(int u=0;u<n;++u){ 79 for(auto [v,w]: adj[u]) if(u<v){ mstSet.insert({u,v,w}); } 80 } 81 82 for(const auto &e: edges){ 83 int a=min(e.u,e.v), b=max(e.u,e.v); 84 if(mstSet.find({a,b,e.w}) != mstSet.end()) continue; // skip MST edge 85 long long mxp = maxOnPath(e.u, e.v); 86 if(mxp == e.w) uniqueMST = false; // tie implies an alternative MST with equal total 87 long long cand = mstW - mxp + e.w; 88 if(cand > mstW) second = min(second, cand); 89 } 90 91 if(uniqueMST) cout << "MST is unique.\n"; else cout << "MST is not unique.\n"; 92 if(second==LLONG_MAX) cout << "No strictly larger spanning tree via single swap (second-best MST not found).\n"; 93 else cout << "Second-best MST weight: " << second << "\n"; 94 return 0; 95 } 96
We first build an MST and preprocess LCA with maximum-edge tracking. For each non-tree edge (u, v, w), adding it forms a cycle in the MST; removing the maximum-weight edge on the MST path between u and v yields a new spanning tree with total weight mstW − maxOnPath(u, v) + w. The best strictly larger candidate is the second-best MST. Uniqueness is determined by whether any non-tree edge ties the maximum on the path (mxp == w), which implies an alternative MST with equal total weight.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct DEdge{ int u,v; long long w; }; 5 6 // Returns {exists, cost}. If some node (except root) has no incoming edge reachable from root, returns {false, 0}. 7 pair<bool,long long> edmonds(int n, int r, const vector<DEdge>& E){ 8 const long long INF = (1LL<<62); 9 long long res = 0; 10 int N = n; 11 vector<DEdge> edges = E; 12 while(true){ 13 // 1) For each node, find minimum incoming edge 14 vector<long long> in(N, INF); 15 vector<int> pre(N, -1); 16 for(const auto &e: edges){ if(e.u != e.v && e.w < in[e.v]){ in[e.v] = e.w; pre[e.v] = e.u; } } 17 in[r] = 0; pre[r] = r; 18 for(int v=0; v<N; ++v){ if(v==r) continue; if(in[v] == INF) return {false, 0}; } 19 // 2) Find cycles 20 int cnt = 0; 21 vector<int> id(N, -1), vis(N, -1); 22 for(int v=0; v<N; ++v){ res += in[v]; int u=v; while(vis[u]!=v && id[u]==-1 && u!=r){ vis[u]=v; u=pre[u]; } 23 if(u!=r && id[u]==-1){ // found a cycle, assign an id 24 for(int x=pre[u]; x!=u; x=pre[x]) id[x]=cnt; id[u]=cnt; cnt++; 25 } 26 } 27 if(cnt==0) break; // no cycles 28 for(int v=0; v<N; ++v) if(id[v]==-1) id[v]=cnt++; 29 // 3) Contract cycles and reweight edges 30 vector<DEdge> nE; nE.reserve(edges.size()); 31 for(const auto &e: edges){ 32 int uu = id[e.u], vv = id[e.v]; long long ww = e.w; 33 if(uu != vv){ ww -= in[e.v]; nE.push_back({uu, vv, ww}); } 34 } 35 edges.swap(nE); 36 r = id[r]; 37 N = cnt; 38 } 39 return {true, res}; 40 } 41 42 int main(){ 43 ios::sync_with_stdio(false); cin.tie(nullptr); 44 int n, m, r; // nodes 0..n-1, m directed edges, root r 45 if(!(cin>>n>>m>>r)) return 0; 46 vector<DEdge> edges(m); 47 for(int i=0;i<m;++i){ int u,v; long long w; cin>>u>>v>>w; edges[i]={u,v,w}; } 48 auto [ok, cost] = edmonds(n, r, edges); 49 if(!ok) cout << "No arborescence reachable from root.\n"; 50 else cout << "Minimum spanning arborescence cost: " << cost << "\n"; 51 return 0; 52 } 53
This is a standard Chu–Liu/Edmonds implementation that computes the minimum spanning arborescence (MSA) cost rooted at r. It repeatedly selects the minimum incoming edge for every node, detects any formed cycles, contracts each cycle into a supernode, and reweights edges accordingly. When no cycles remain, the sum of chosen incoming edges (with prior reweightings) yields the MSA cost. This solves the directed analogue of MST, which is harder than the undirected case.