⚙ Algorithm · Intermediate

Minimum Spanning Tree - Kruskal

Key Points

  • Kruskal’s algorithm builds a minimum spanning tree (MST) by sorting all edges by weight and greedily picking the next lightest edge that does not form a cycle.
  • A Disjoint Set Union (DSU/Union-Find) data structure is used to detect cycles efficiently while adding edges.
  • The algorithm is correct by the cut property: the minimum-weight edge crossing any cut is always part of some MST.
  • Its time complexity is dominated by sorting edges, which is O(E log E), while DSU operations are almost linear with an inverse-Ackermann factor.
  • Kruskal naturally generalizes to form a Minimum Spanning Forest (MSF) when the graph is disconnected.
  • You can stop early to create k clusters: run Kruskal until exactly k components remain, and the next crossing edge weight gives the spacing.
  • After building an MST, you can preprocess it for max-edge-on-path queries (via LCA/binary lifting) to analyze alternative edges or check MST uniqueness hints.
  • Handling equal-weight edges is safe; the chosen MST may not be unique, but Kruskal still returns a valid one.

Prerequisites

  • →Graphs (undirected, weighted) — You must understand vertices, edges, and weights to model the MST problem.
  • →Greedy algorithms — Kruskal is a greedy method; understanding why local choices can lead to global optimality is key.
  • →Disjoint Set Union (Union-Find) — Efficient cycle detection in Kruskal relies on DSU with path compression and union by size/rank.
  • →Sorting algorithms and complexity — Kruskal’s runtime is dominated by sorting edges in non-decreasing order.
  • →Trees and basic properties — Recognizing that a tree on n nodes has n−1 edges and no cycles helps validate correctness and stopping conditions.
  • →Binary lifting / LCA (optional) — Useful for advanced post-MST queries like maximum edge on a tree path or reasoning about second-best MSTs.

Detailed Explanation

01 Overview

Kruskal’s algorithm is a classic greedy method to find a Minimum Spanning Tree (MST) of a weighted, undirected graph. An MST connects all vertices with the minimum possible total edge weight without forming any cycles. Kruskal proceeds by sorting edges in non-decreasing order and then scanning them from lightest to heaviest. It adds an edge if and only if it connects two currently disconnected components, which avoids cycles by construction. To check whether two vertices are already connected, the algorithm uses a Disjoint Set Union (DSU), also known as Union-Find, which supports fast union and find operations with path compression and union by size/rank.

The algorithm’s strength is its simplicity and efficiency on sparse graphs: sorting edges costs O(E log E), and the DSU adds only a near-linear overhead. Kruskal is also easy to adapt to related tasks. For example, if the graph is disconnected, the same procedure yields a Minimum Spanning Forest (MSF) — a collection of MSTs, one per connected component. Another common application is clustering: by stopping early when k components remain, you obtain k clusters with maximum spacing.

Kruskal’s correctness relies on the cut property: at any step, choosing the lightest edge that crosses some cut separating the current components ensures optimality. This property guarantees that greedy choices never block building a global optimum later, making Kruskal a prototypical example of a correct greedy algorithm.

02 Intuition & Analogies

Imagine you are building the cheapest road network to connect all cities. At the start, every city is isolated. If you always consider the cheapest available road and add it unless it closes a loop, you intuitively keep costs down while steadily knitting the cities together. You never pay for a redundant road that just loops around, because a loop doesn’t help connect any new city; it only adds cost.

To check whether a new road would create a loop, picture placing each already-connected set of cities into a labeled bin. When you pick a road connecting two cities from different bins, it’s helpful: it merges two bins (more cities get connected). If the road’s cities are already in the same bin, it would just circle within, making a loop — not helpful, so you skip it. The DSU is a smart labeling system that tells you very quickly whether two cities are in the same bin and can merge bins lightning-fast.

Why is taking the cheapest crossing road always safe? Think of separating the already-connected cities from the rest with a line (a cut). The cheapest road that crosses that line is the best way to reach new territory. If you rejected this cheapest bridge and instead chose a more expensive one later, you would obviously pay extra for no benefit. This is the essence of the cut property and why the greedy step cannot hurt. So by repeatedly doing the safest, cheapest connection that doesn’t make a loop, you end up with the cheapest network that connects everyone.

03 Formal Definition

Let G = (V, E) be a connected, undirected graph with weight function w: E → ℝ. A spanning tree T ⊆ E is a subset of edges connecting all vertices without cycles; it has exactly |V| − 1 edges. A Minimum Spanning Tree (MST) is a spanning tree T minimizing the total weight \( \sum_{e \in T} w(e) \).

Kruskal’s algorithm sorts E in non-decreasing order by weight and initializes a forest F with no edges and |V| singleton components. It scans edges e = (u, v) in sorted order and adds e to F if find(u) ≠ find(v) with respect to a DSU over V; if added, union(u, v) merges the two components. The algorithm stops when |V| − 1 edges have been added (or when no more edges remain, which yields a minimum spanning forest if G is disconnected).

Correctness follows from the cut property: for any cut (S, V \ S), if e is a minimum-weight edge crossing the cut, then there exists an MST containing e. Each time Kruskal adds an edge between distinct components, that edge is minimum among those crossing the cut defined by these components, hence safe. The cycle property complements this: in any cycle, the heaviest edge does not belong to at least one MST, justifying skipping edges that would create cycles. Time complexity is O(E log E) for sorting plus near-linear DSU overhead O(E α(V)), where α is the inverse Ackermann function.
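
In code, the procedure is just a sorted scan with DSU checks. The following is a minimal sketch, assuming the DSU type with find/unite shown in the Code Examples section below and that edges is a vector of (weight, u, v) tuples:

sort(edges.begin(), edges.end());            // non-decreasing by weight (first tuple field)
DSU dsu(n);
long long total = 0;
int taken = 0;
for (auto &[w, u, v] : edges) {
    if (dsu.unite(u, v)) {                   // true only if u and v were in different components
        total += w;
        if (++taken == n - 1) break;         // a spanning tree on n vertices has n - 1 edges
    }
}
// taken < n - 1 here means the graph is disconnected (the result is a spanning forest)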

04 When to Use

  • Sparse graphs: When E is close to V (e.g., E = O(V)), Kruskal’s sort-and-scan is very fast and often simpler than Prim’s with a binary heap.
  • Edge list input: If edges already come as a list (common in contests), Kruskal is a natural fit. If the edges are pre-sorted or nearly sorted, it can be even faster in practice.
  • Disconnected graphs: Kruskal automatically yields a Minimum Spanning Forest, which is useful if you only need to connect within components.
  • Clustering: For k-clustering with maximum spacing, run Kruskal but halt when exactly k components remain; the next crossing edge weight is the spacing between clusters.
  • Post-MST analysis: Build an MST, then preprocess it (e.g., with LCA/binary lifting) to answer queries like maximum edge on the path between two nodes. This helps test if swapping in an extra edge could lower weight (second-best MST ideas) or to reason about uniqueness.
  • Parallelization prospects: Sorting and DSU can be partially parallelized or batched (advanced), making Kruskal suitable for large-scale settings with careful engineering.

⚠ Common Mistakes

  • Forgetting to check connectivity: If the graph is disconnected and you expect an MST, you must detect that you picked fewer than |V| − 1 edges. The result is an MSF, not an MST.
  • Mishandling 1-based vs 0-based indices: Many inputs use 1-based node labels; forgetting to convert causes out-of-bounds errors or wrong unions.
  • DSU without path compression or union by rank/size: This makes DSU operations degrade toward O(n) per operation, potentially TLE for large graphs. Always implement both optimizations.
  • Not skipping self-loops or parallel edges properly: Self-loops should never be added; among parallel edges, only the lightest that doesn’t create a cycle may be chosen at its turn.
  • Assuming uniqueness: Equal-weight edges can produce multiple valid MSTs. Do not rely on a specific structure unless you’ve proven uniqueness (e.g., by verifying strict inequalities on all cut minima or via max-edge-on-path checks).
  • Integer overflow: Edge weights or MST sums can exceed 32-bit int. Use 64-bit types (long long) for weights and totals.
  • Early stopping errors: If you stop as soon as |V| − 1 edges have been added, ensure the graph is indeed connected; otherwise, verify that the edge list was fully processed and handle the MSF case gracefully.

Key Formulas

MST Optimization Problem

Explanation: We want a subset of edges forming a tree that connects all vertices and has the smallest possible total weight. The constraint ensures no cycles and exactly |V|−1 edges.
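
In symbols: find \( T \subseteq E \) minimizing \( \sum_{e \in T} w(e) \) subject to \( (V, T) \) being connected and acyclic, i.e. a spanning tree with \( |T| = |V| - 1 \).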

Tree Edge Count

Explanation: Any tree on |V| vertices has exactly |V|−1 edges. This helps detect completion of Kruskal when we have chosen |V|−1 edges.
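
In symbols: \( |E_T| = |V| - 1 \) for any tree \( T = (V, E_T) \); Kruskal can stop as soon as \( |V| - 1 \) edges have been accepted.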

Cut Property

Explanation: For any partition of vertices into two non-empty sets, the lightest edge crossing it is safe to include in an MST. This justifies Kruskal’s greedy choice.
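
In symbols: for any cut \( (S, V \setminus S) \) with \( \emptyset \ne S \subsetneq V \), if \( e \) is a minimum-weight edge with exactly one endpoint in \( S \), then some MST of \( G \) contains \( e \).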

Cycle Property

Explanation: In any cycle, at least one of the heaviest edges can be removed to reduce or preserve total weight, so skipping such edges does not harm optimality.
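
In symbols: for any cycle \( C \) and any edge \( e \in C \) of maximum weight in \( C \), there exists an MST that excludes \( e \); if \( e \) is strictly heavier than every other edge of \( C \), no MST contains it.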

Kruskal Time Complexity

Explanation: Sorting edges dominates the runtime with O(E log E); DSU operations add almost linear overhead involving the inverse Ackermann function, which is tiny in practice.
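
In symbols: \( O(E \log E + E\,\alpha(V)) = O(E \log V) \), since \( E \le V^2 \) for simple graphs and hence \( \log E \le 2 \log V \).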

Inverse Ackermann Function (Informal)

Explanation: This defines the extremely slowly growing inverse Ackermann function that appears in DSU bounds. For all practical input sizes, α(n) ≤ 4.
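
One common informal definition: \( \alpha(n) = \min \{ k \ge 1 : A(k, k) \ge n \} \), where \( A \) is the Ackermann function; \( \alpha(n) \le 4 \) for every input size that could ever occur in practice.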

Clustering Spacing

Explanation: When Kruskal halts with k components, the next lightest edge between different components reflects how far apart clusters are.
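
In symbols: \( \text{spacing}(k) = \min \{ w(u, v) : (u, v) \in E,\ u \text{ and } v \text{ lie in different clusters} \} \), evaluated at the moment Kruskal stops with exactly k components.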

Max Edge on Path in MST

Explanation: Given an MST T, the maximum edge weight along the unique path between u and v is useful to evaluate swapping in a non-tree edge for second-best MST analyses.
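
In symbols: \( \text{maxEdge}_T(u, v) = \max_{e \in \text{path}_T(u, v)} w(e) \); for a non-tree edge \( (u, v) \) of weight \( w \), the lightest spanning tree forced to contain it has weight \( w(T) - \text{maxEdge}_T(u, v) + w \).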

Complexity Analysis

Kruskal’s algorithm begins by sorting all E edges by non-decreasing weight, which costs O(E log E). It then scans edges once, performing at most E DSU operations (find and union). With both path compression and union by rank/size, DSU operations take amortized O(α(V)) time, where α is the inverse Ackermann function. Since α(V) ≤ 4 for any realistic input, this term behaves as a very small constant, making the overall time complexity effectively O(E log E), dominated by sorting.

Space usage includes storing the edge list, O(E), and the DSU arrays, O(V). If you also store the selected MST edges, that adds O(V) space. Therefore, the total auxiliary space is O(V + E). If you construct an adjacency list of the MST for further processing (like LCA/binary lifting), you use O(V) additional space for tree edges and O(V log V) if you store binary lifting tables.

For disconnected graphs, the same runtime and space bounds apply, but the result is a Minimum Spanning Forest (MSF). Early stopping variants, such as k-clustering, may process fewer edges in practice once the desired number of components is reached, but asymptotically they remain O(E log E) due to the initial sort. If edges are pre-sorted or if a non-comparison sort is viable for small integer weights, sorting may be optimized, potentially lowering constants or even achieving O(E) expected time in specialized settings.

Code Examples

Basic Kruskal MST (connected graph expected)
#include <bits/stdc++.h>
using namespace std;

struct DSU {
    vector<int> parent, sz;
    DSU(int n = 0) { init(n); }
    void init(int n) {
        parent.resize(n);
        sz.assign(n, 1);
        iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) {
        if (parent[x] == x) return x;
        return parent[x] = find(parent[x]); // path compression
    }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        if (sz[a] < sz[b]) swap(a, b); // union by size
        parent[b] = a;
        sz[a] += sz[b];
        return true;
    }
};

struct Edge {
    int u, v;
    long long w;
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    if (!(cin >> n >> m)) return 0;
    vector<Edge> edges;
    edges.reserve(m);
    for (int i = 0; i < m; ++i) {
        int u, v; long long w;
        cin >> u >> v >> w;
        // convert from 1-based to 0-based if needed
        --u; --v;
        if (u == v) continue; // skip self-loops (never useful for MST)
        edges.push_back({u, v, w});
    }

    sort(edges.begin(), edges.end(), [](const Edge& a, const Edge& b){
        if (a.w != b.w) return a.w < b.w;
        // tie-breaker optional but stable: doesn't affect correctness
        if (a.u != b.u) return a.u < b.u;
        return a.v < b.v;
    });

    DSU dsu(n);
    long long mst_cost = 0;
    vector<Edge> mst_edges; mst_edges.reserve(n - 1);

    for (const auto &e : edges) {
        if ((int)mst_edges.size() == n - 1) break; // early stop
        if (dsu.unite(e.u, e.v)) {
            mst_cost += e.w;
            mst_edges.push_back(e);
        }
    }

    if ((int)mst_edges.size() != n - 1) {
        cout << "No MST exists (graph is disconnected).\n";
        // Optionally: still print the minimum spanning forest
        cout << "MSF total weight: " << mst_cost << "\n";
        cout << "Chosen edges: " << mst_edges.size() << "\n";
        for (auto &e : mst_edges) cout << e.u+1 << " " << e.v+1 << " " << e.w << "\n";
        return 0;
    }

    cout << "MST total weight: " << mst_cost << "\n";
    cout << "Edges in MST (u v w):\n";
    for (auto &e : mst_edges) cout << e.u+1 << " " << e.v+1 << " " << e.w << "\n";
    return 0;
}

Reads a weighted undirected graph, sorts all edges, and greedily adds the next lightest edge that connects two different components, tracked by DSU. If fewer than n−1 edges are chosen, the graph is disconnected; otherwise, we obtain a valid MST and its total cost.

Time: O(E log E) overall; DSU operations add O(E α(V)).
Space: O(V + E) for DSU and edge storage; O(V) for storing selected MST edges.
Minimum Spanning Forest (handles disconnected graphs)
#include <bits/stdc++.h>
using namespace std;

struct DSU {
    vector<int> parent, sz;
    DSU(int n = 0) { init(n); }
    void init(int n) { parent.resize(n); sz.assign(n, 1); iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        if (sz[a] < sz[b]) swap(a, b);
        parent[b] = a; sz[a] += sz[b];
        return true;
    }
};

struct Edge { int u, v; long long w; };

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m; cin >> n >> m;
    vector<Edge> edges; edges.reserve(m);
    for (int i = 0; i < m; ++i) {
        int u, v; long long w; cin >> u >> v >> w;
        --u; --v;
        if (u == v) continue; // self-loops never help
        edges.push_back({u, v, w});
    }

    sort(edges.begin(), edges.end(), [](const Edge& a, const Edge& b){ return a.w < b.w; });
    DSU dsu(n);
    long long forest_cost = 0;
    vector<Edge> forest_edges; forest_edges.reserve(n - 1);

    for (const auto &e : edges) {
        if (dsu.unite(e.u, e.v)) {
            forest_cost += e.w;
            forest_edges.push_back(e);
        }
    }

    // Count components by counting distinct roots
    vector<int> root(n);
    for (int i = 0; i < n; ++i) root[i] = dsu.find(i);
    sort(root.begin(), root.end());
    int components = unique(root.begin(), root.end()) - root.begin();

    cout << "Components: " << components << "\n";
    cout << "MSF total weight: " << forest_cost << "\n";
    cout << "Forest edges (u v w):\n";
    for (auto &e : forest_edges) cout << e.u+1 << " " << e.v+1 << " " << e.w << "\n";
    return 0;
}

This version does not assume connectivity. It returns a Minimum Spanning Forest, reporting how many components exist, the total forest weight, and the edges used across all components.

Time: O(E log E) for sorting plus O(E α(V)) for DSU.
Space: O(V + E).
Kruskal + Binary Lifting for Max-Edge-On-Path Queries
#include <bits/stdc++.h>
using namespace std;

struct DSU {
    vector<int> p, sz;
    DSU(int n = 0) { init(n); }
    void init(int n) { p.resize(n); sz.assign(n, 1); iota(p.begin(), p.end(), 0); }
    int find(int x) { return p[x] == x ? x : p[x] = find(p[x]); }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        if (sz[a] < sz[b]) swap(a, b);
        p[b] = a; sz[a] += sz[b];
        return true;
    }
};

struct Edge { int u, v; int w; };

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m; cin >> n >> m;
    vector<Edge> edges; edges.reserve(m);
    for (int i = 0; i < m; ++i) {
        int u, v, w; cin >> u >> v >> w;
        --u; --v;
        if (u == v) continue;
        edges.push_back({u, v, w});
    }
    sort(edges.begin(), edges.end(), [](const Edge& a, const Edge& b){ return a.w < b.w; });

    DSU dsu(n);
    long long mst_cost = 0;
    vector<vector<pair<int,int>>> g(n); // MST adjacency: (to, weight)
    int picked = 0;
    for (auto &e : edges) {
        if (picked == n - 1) break;
        if (dsu.unite(e.u, e.v)) {
            mst_cost += e.w; picked++;
            g[e.u].push_back({e.v, e.w});
            g[e.v].push_back({e.u, e.w});
        }
    }

    // If disconnected, the LCA preprocessing should be done per-tree; here we assume connected for queries.
    cout << "MST total weight: " << mst_cost << "\n";

    // Binary lifting preprocessing for max edge on path
    int LOG = 1;
    while ((1 << LOG) <= n) ++LOG;
    vector<int> depth(n, -1);
    vector<vector<int>> up(LOG, vector<int>(n, -1));
    vector<vector<int>> mx(LOG, vector<int>(n, 0)); // mx[k][v]: max edge to 2^k ancestor

    // BFS to set depth and up[0], mx[0]
    queue<int> q;
    for (int s = 0; s < n; ++s) {
        if (depth[s] != -1) continue; // handle forest if needed
        depth[s] = 0; up[0][s] = -1; mx[0][s] = 0; q.push(s);
        while (!q.empty()) {
            int u = q.front(); q.pop();
            for (auto [v, w] : g[u]) {
                if (v == up[0][u]) continue;
                if (depth[v] == -1) {
                    depth[v] = depth[u] + 1; up[0][v] = u; mx[0][v] = w; q.push(v);
                }
            }
        }
    }

    for (int k = 1; k < LOG; ++k) {
        for (int v = 0; v < n; ++v) {
            if (up[k-1][v] != -1) {
                up[k][v] = up[k-1][ up[k-1][v] ];
                if (up[k][v] != -1) mx[k][v] = max(mx[k-1][v], mx[k-1][ up[k-1][v] ]);
                else { up[k][v] = -1; mx[k][v] = mx[k-1][v]; }
            }
        }
    }

    auto maxOnPath = [&](int a, int b) {
        if (a == b) return 0;
        int res = 0;
        if (depth[a] < depth[b]) swap(a, b);
        int diff = depth[a] - depth[b];
        for (int k = LOG - 1; k >= 0; --k) {
            if (diff & (1 << k)) {
                res = max(res, mx[k][a]);
                a = up[k][a];
            }
        }
        if (a == b) return res;
        for (int k = LOG - 1; k >= 0; --k) {
            if (up[k][a] != up[k][b]) {
                res = max(res, mx[k][a]);
                res = max(res, mx[k][b]);
                a = up[k][a];
                b = up[k][b];
            }
        }
        // now a and b are children of the LCA
        res = max(res, mx[0][a]);
        res = max(res, mx[0][b]);
        return res;
    };

    int qn;
    if (!(cin >> qn)) return 0; // number of queries
    // Each query: u v -> maximum edge on path in the MST
    while (qn--) {
        int u, v; cin >> u >> v; --u; --v;
        cout << maxOnPath(u, v) << '\n';
    }
    return 0;
}

Builds an MST with Kruskal, then preprocesses the tree using binary lifting to answer max-edge-on-path queries in O(log n). This is useful to evaluate the effect of adding a non-MST edge (u,v,w): the candidate MST weight would be mst_cost − maxOnPath(u,v) + w.

Time: Building MST: O(E log E). Preprocessing: O(V log V). Each query: O(log V).
Space: O(V + E) for graph and MST; O(V log V) for lifting tables.
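
As a follow-up, these queries can feed a second-best-MST style scan. The snippet below is only a sketch under stated assumptions: it supposes you additionally recorded a vector<bool> in_mst marking which edge indices were accepted while building the MST above (not done in the program as written), that the graph is connected, and that edges, mst_cost, and maxOnPath from that program are in scope.

// Hypothetical post-processing sketch: smallest total weight of a spanning tree
// obtained by swapping one non-MST edge in.
long long second_best = LLONG_MAX;
for (size_t i = 0; i < edges.size(); ++i) {
    if (in_mst[i]) continue;                                  // only non-tree edges are candidates
    const Edge &e = edges[i];
    long long cand = mst_cost - maxOnPath(e.u, e.v) + e.w;    // replace heaviest edge on the u-v path
    second_best = min(second_best, cand);
}
// second_best == LLONG_MAX means there was no non-tree edge to swap in
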
K-clustering via Kruskal (maximum spacing)
#include <bits/stdc++.h>
using namespace std;

struct DSU {
    vector<int> p, sz;
    DSU(int n = 0) { init(n); }
    void init(int n) { p.resize(n); sz.assign(n, 1); iota(p.begin(), p.end(), 0); }
    int find(int x) { return p[x] == x ? x : p[x] = find(p[x]); }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        if (sz[a] < sz[b]) swap(a, b);
        p[b] = a; sz[a] += sz[b];
        return true;
    }
};

struct Edge { int u, v; long long w; };

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m, k; cin >> n >> m >> k;
    vector<Edge> edges; edges.reserve(m);
    for (int i = 0; i < m; ++i) {
        int u, v; long long w; cin >> u >> v >> w;
        --u; --v;
        if (u == v) continue;
        edges.push_back({u, v, w});
    }

    sort(edges.begin(), edges.end(), [](const Edge& a, const Edge& b){ return a.w < b.w; });

    DSU dsu(n);
    int components = n;
    long long spacing = -1; // minimum inter-cluster edge after stopping

    if (k <= 0 || k > n) {
        cout << "Invalid k.\n";
        return 0;
    }

    for (const auto &e : edges) {
        if (components == k) {
            // First edge that connects two different clusters defines the spacing
            if (dsu.find(e.u) != dsu.find(e.v)) { spacing = e.w; break; }
            else continue;
        }
        if (dsu.unite(e.u, e.v)) {
            components--;
        }
    }

    if (components > k) {
        // Ran out of edges before reaching k components; graph is too disconnected
        spacing = -1; // undefined
    }

    // Build cluster labels
    vector<int> label(n);
    unordered_map<int,int> root2id; int id = 0;
    for (int i = 0; i < n; ++i) {
        int r = dsu.find(i);
        if (!root2id.count(r)) root2id[r] = id++;
        label[i] = root2id[r];
    }

    cout << "Clusters formed: " << components << "\n";
    cout << "Desired k: " << k << "\n";
    cout << "Spacing (min inter-cluster edge after stopping): " << spacing << "\n";
    cout << "Cluster labels (1-indexed nodes -> cluster id):\n";
    for (int i = 0; i < n; ++i) {
        cout << (i + 1) << ": " << label[i] << '\n';
    }
    return 0;
}

Runs Kruskal while tracking the number of connected components. When exactly k clusters remain, the next lightest edge that would connect two different clusters gives the maximum spacing among all k-clusterings. The program also outputs cluster labels.

Time: O(E log E) due to sorting; DSU adds O(E α(V)).
Space: O(V + E).