Minimum Spanning Tree - Prim
Key Points
- •Prim's algorithm builds a Minimum Spanning Tree (MST) by growing a tree from an arbitrary start vertex, always adding the lightest edge that connects the tree to a new vertex.
- •Using a binary heap or ordered set, Prim runs in O(E V) time on sparse graphs; using a simple array (adjacency matrix), it runs in O(), which is good for dense graphs.
- •It maintains for each non-tree vertex a key value: the minimum edge weight that can connect it to the current tree.
- •Prim is similar to Dijkstra but optimizes edge weights to enter the tree, not shortest path distances from the source.
- •The algorithm is correct by the cut property: the minimum-weight edge crossing any cut that separates the tree from the rest is always safe to add.
- •Prim works for undirected graphs with any real edge weights (including negatives) as long as the graph is connected; otherwise you obtain a minimum spanning forest.
- •Priority queues in C++ do not support decrease-key directly, so use lazy updates with visited[] or use std::set to simulate decrease-key.
- •Store weights in long long to avoid overflow, and beware of disconnected graphs, self-loops, and parallel edges.
Prerequisites
- →Graphs: vertices, edges, undirected vs. directed — Prim operates on undirected weighted graphs; understanding basic graph terminology is essential.
- →Priority queues and heaps — Efficient Prim uses a min-priority queue to select the next cheapest connection.
- →Asymptotic complexity (Big-O) — Choosing between O(E log V) and O(V^2) implementations depends on graph size and density.
- →Trees and properties — Knowing that a tree has V−1 edges and no cycles clarifies why Prim builds an MST correctly.
- →Balanced BST containers (std::set) — To emulate decrease-key for Prim, you need ordered set operations like erase/insert.
- →Edge weights and sums — Understanding weight aggregation and potential overflow dictates using 64-bit integers.
Detailed Explanation
Tap terms for definitions01Overview
Prim's algorithm is a greedy method for finding a Minimum Spanning Tree (MST) in a weighted, undirected graph. Think of a spanning tree as a way to connect all vertices with exactly V−1 edges and no cycles, and the MST is the one whose total weight is as small as possible. Prim grows this tree one vertex at a time. Starting from any vertex, it repeatedly chooses the lightest edge that connects the current tree to a new vertex outside the tree. This decision is always safe due to the cut property: at each step, consider the cut that separates the already-built tree from the rest; the minimum crossing edge will never lead to a suboptimal MST. Hook → Concept → Example: Hook—imagine wiring a campus with network cables, always attaching the next building using the cheapest available cable that reaches it from the existing network. Concept—this is precisely Prim's approach: track, for each unconnected building, the cheapest cable leading to it from the current network and pick the smallest. Example—on a small graph with five nodes, start at node 1, connect the nearest neighbor by the smallest edge, then update the best connection for every remaining node, and repeat until all nodes are connected. In practice, Prim is implemented with either an adjacency list plus a heap/ordered set (O(E log V) time, good for sparse graphs) or with an adjacency matrix and a simple array (O(V^2), good for dense graphs). The algorithm is simple, robust, and widely applicable to infrastructure planning, clustering, and baseline network design.
02Intuition & Analogies
Imagine building a road network between towns with a limited budget. You already have some towns connected. Which town should you connect next? Naturally, you pick the town that can be reached most cheaply from your current network. After connecting it, the list of cheapest options for the remaining unconnected towns may change, because the new town might offer cheaper routes to them. Repeat this process until every town is connected. This incremental, always-choose-the-cheapest-next-step mindset is the heart of Prim's greedy strategy. Hook → Concept → Example: Hook—consider how you might string holiday lights across a set of hooks on a wall. You start at one hook and stretch a wire to the nearest unlit hook to save wire. Then, from your now larger lit region, you again pick the closest unlit hook, and so on. Concept—in Prim, the 'closest' measure is the minimum edge weight that reaches each unconnected vertex from the current tree; we keep a key for each vertex equal to that minimal cost. Example—suppose keys are {A:0 (start), B:3, C:7, D:2}. You pick D (2), then update neighbors of D; maybe C drops to 4 because D→C is 4, and B stays 3. Next you pick B (3), and so on until all are included. Why is this safe? Consider a dividing line (a cut) between the connected set and the rest. The cheapest bridge across this line cannot be part of a cheaper solution that excludes it; if we avoided it, any alternative would cost at least as much or more to cross the same boundary. Thus the greedy choice never paints us into a corner. This property repeats at every step, so the final result is globally optimal, not just locally reasonable.
03Formal Definition
04When to Use
Use Prim when you need to connect all vertices of a weighted, undirected graph with minimum total cost and you prefer growing from a single seed. It is ideal for network design problems (laying cables, road networks, pipelines), baseline clustering (e.g., single-link hierarchical clustering via MST), approximating some NP-hard problems (like Euclidean TSP via MST-based heuristics), and as a foundation for more advanced structures (e.g., minimum bottleneck spanning trees). Choose the data structure based on graph density: with adjacency lists plus a heap or std::set, Prim runs in O(E \log V), which excels on sparse graphs common in competitive programming. With an adjacency matrix and simple arrays, Prim runs in O(V^{2}), which is competitive or better when E is near V^{2} (dense graphs), or when constraints are small enough (e.g., V up to a few thousand). Prim is especially convenient when you only need one MST and don’t need to sort all edges (contrast with Kruskal). It naturally maintains parent pointers that directly output the tree. If the graph may be disconnected, Prim still yields a minimum spanning forest by restarting from each unvisited vertex—useful when each component should be minimized separately.
⚠️Common Mistakes
- Assuming the graph is connected: Prim builds a single MST only if the graph is connected. If not, your algorithm should either report disconnection or produce a minimum spanning forest by restarting on remaining vertices.
- Using int for weights: Edge weights or their sum may exceed 32-bit limits. Use long long (64-bit) to store weights and totals to avoid overflow, especially in Codeforces-style problems.
- Misusing C++ priority_queue: It lacks decrease-key; pushing updated keys without a visited[] check can cause incorrect totals if you don't skip obsolete entries. Use a visited[] array with lazy deletion, or use std::set to simulate decrease-key by erase/insert.
- Forgetting undirected insertion: For undirected graphs, add edges in both directions in the adjacency list, or maintain symmetric weights in the matrix. Missing one direction can disconnect the structure.
- Mishandling parallel edges and self-loops: Keep the smallest edge between two vertices and ignore self-loops; otherwise you might pick a non-minimal or invalid connection.
- Confusing Dijkstra and Prim updates: Dijkstra uses d[v] = min(d[v], d[u] + w(u,v)) (path sums), while Prim uses key[v] = min(key[v], w(u,v)) (single edge). Mixing them produces wrong results.
- Wrong comparator for min-heap: In C++ priority_queue, use greater<> for a min-heap on pairs/tuples; or store negative weights if you must use the default max-heap, but this is error-prone.
- Off-by-one indexing: Competitive problems mix 0-based and 1-based vertices. Normalize indices consistently when building adjacency structures and printing answers.
Key Formulas
MST Weight
Explanation: The total weight of a spanning tree T is the sum of its edge weights. Prim minimizes this quantity over all spanning trees.
Tree Edge Count
Explanation: Any spanning tree on V vertices has exactly V−1 edges. Prim adds one new vertex (and one edge) per iteration after the start.
Key Definition
Explanation: At any step, the key of vertex v is the lightest edge connecting it to the already-built set S. Prim always selects the vertex with the smallest key.
Greedy Choice
Explanation: Prim adds the vertex outside S that has the minimum key value, ensuring the cheapest safe connection across the current cut.
Relaxation in Prim
Explanation: When a new vertex u enters S, we try to improve the best-known connection to each neighbor v by considering the edge (u,v) only.
Dijkstra Contrast
Explanation: Dijkstra updates full path distances, unlike Prim, which updates only a single-edge key. Mixing these leads to incorrect MST computation.
Heap-Based Complexity
Explanation: With a binary heap or balanced BST, each extract-min and key update costs O(log V), giving total O(E log V) for sparse graphs.
Array/Matrix Complexity
Explanation: Using an adjacency matrix and scanning arrays to find the next minimum key yields quadratic time, which is fine for dense graphs.
Complete Graph Edges
Explanation: A complete undirected graph has V(V−1)/2 edges. For such dense graphs, the O() implementation of Prim is usually preferable.
Cut Property
Explanation: For any cut, the minimum-weight crossing edge e* can be added to some MST without losing optimality. Prim uses this at every step.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Prim's algorithm using adjacency list and a min-heap (priority_queue with greater<>) 5 // This version uses lazy deletion: we may push multiple entries for a vertex, 6 // and we skip outdated ones using a visited[] array when they pop. 7 // It computes an MST if the graph is connected; otherwise it reports disconnection. 8 9 struct Edge { int to; long long w; }; 10 11 int main() { 12 ios::sync_with_stdio(false); 13 cin.tie(nullptr); 14 15 int n, m; // n = number of vertices (0..n-1), m = number of edges 16 if (!(cin >> n >> m)) return 0; 17 18 vector<vector<Edge>> adj(n); 19 for (int i = 0; i < m; ++i) { 20 int u, v; long long w; 21 cin >> u >> v >> w; // assume 0-based input; convert if needed 22 if (u == v) continue; // ignore self-loops 23 // For undirected graph, add both directions 24 adj[u].push_back({v, w}); 25 adj[v].push_back({u, w}); 26 } 27 28 const long long INF = (1LL<<62); 29 vector<int> parent(n, -1); 30 vector<char> visited(n, false); 31 32 // min-heap of (weight, vertex, parent_of_vertex) 33 using T = tuple<long long,int,int>; 34 priority_queue<T, vector<T>, greater<T>> pq; 35 36 // Start from vertex 0 by convention (can be any) 37 pq.push({0LL, 0, -1}); 38 39 long long total = 0; 40 vector<pair<int,int>> mst_edges; // store (u,v) 41 42 while (!pq.empty() && (int)mst_edges.size() < n - 1) { 43 auto [w, v, p] = pq.top(); pq.pop(); 44 if (visited[v]) continue; // lazy skip 45 visited[v] = true; 46 if (p != -1) { 47 total += w; // add the chosen edge weight 48 mst_edges.push_back({p, v}); 49 } 50 // push candidate edges to neighbors 51 for (const auto &e : adj[v]) { 52 if (!visited[e.to]) { 53 pq.push({e.w, e.to, v}); 54 } 55 } 56 } 57 58 // Check connectivity: all vertices must be visited to have an MST 59 bool connected = true; 60 for (int v = 0; v < n; ++v) connected &= visited[v]; 61 62 if (!connected) { 63 cout << "Graph is disconnected: no single MST exists.\n"; 64 return 0; 65 } 66 67 cout << "MST total weight = " << total << "\n"; 68 cout << "Edges (u v):\n"; 69 for (auto &e : mst_edges) cout << e.first << ' ' << e.second << '\n'; 70 return 0; 71 } 72
We build an undirected adjacency list and use a min-heap of candidate edges that cross the current cut. Each time we pop the smallest-weight crossing edge that reaches a new vertex, we add it to the MST and push that vertex’s outgoing edges. Lazy deletion means outdated entries remain in the heap but get ignored once the vertex is already visited. If not all vertices are visited by the time we gather V−1 edges, the graph is disconnected and no single MST exists.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Prim's algorithm using an adjacency matrix and arrays. 5 // Suitable for dense graphs where O(V^2) is acceptable or optimal. 6 7 int main() { 8 ios::sync_with_stdio(false); 9 cin.tie(nullptr); 10 11 int n, m; // vertices are 0..n-1 12 if (!(cin >> n >> m)) return 0; 13 14 const long long INF = (1LL<<62); 15 vector<vector<long long>> G(n, vector<long long>(n, INF)); 16 for (int i = 0; i < n; ++i) G[i][i] = INF; // ignore self-loops 17 18 for (int i = 0; i < m; ++i) { 19 int u, v; long long w; 20 cin >> u >> v >> w; 21 if (u == v) continue; 22 // Keep the smallest parallel edge 23 G[u][v] = min(G[u][v], w); 24 G[v][u] = min(G[v][u], w); 25 } 26 27 vector<long long> key(n, INF); 28 vector<int> parent(n, -1); 29 vector<char> inMST(n, false); 30 31 // Start from vertex 0 32 key[0] = 0; 33 34 for (int iter = 0; iter < n; ++iter) { 35 // pick u with minimal key not yet in MST 36 int u = -1; 37 for (int v = 0; v < n; ++v) { 38 if (!inMST[v] && (u == -1 || key[v] < key[u])) u = v; 39 } 40 if (u == -1 || key[u] == INF) { 41 cout << "Graph is disconnected: no single MST exists.\n"; 42 return 0; 43 } 44 inMST[u] = true; 45 // relax neighbors via matrix row 46 for (int v = 0; v < n; ++v) { 47 if (!inMST[v] && G[u][v] < key[v]) { 48 key[v] = G[u][v]; 49 parent[v] = u; 50 } 51 } 52 } 53 54 long long total = 0; 55 for (int v = 0; v < n; ++v) total += (parent[v] == -1 ? 0 : key[v]); 56 57 cout << "MST total weight = " << total << "\n"; 58 cout << "Edges (u v):\n"; 59 for (int v = 0; v < n; ++v) if (parent[v] != -1) cout << parent[v] << ' ' << v << '\n'; 60 return 0; 61 } 62
We store the graph in a V×V matrix and maintain arrays key[] and parent[]. Each iteration selects the vertex with the smallest key value among those not yet in the MST, then updates the key of every other vertex using the corresponding matrix row. This achieves O(V^2) time and works well on dense graphs.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Prim's algorithm using std::set to emulate decrease-key. 5 // This version restarts from each unvisited vertex to build a minimum spanning forest (MSF) 6 // if the graph is disconnected. 7 8 struct Edge { int to; long long w; }; 9 10 int main() { 11 ios::sync_with_stdio(false); 12 cin.tie(nullptr); 13 14 int n, m; // 0..n-1 vertices 15 if (!(cin >> n >> m)) return 0; 16 17 vector<vector<Edge>> adj(n); 18 for (int i = 0; i < m; ++i) { 19 int u, v; long long w; cin >> u >> v >> w; 20 if (u == v) continue; // ignore self-loops 21 adj[u].push_back({v, w}); 22 adj[v].push_back({u, w}); 23 } 24 25 const long long INF = (1LL<<62); 26 vector<long long> key(n, INF); 27 vector<int> parent(n, -1); 28 vector<char> inMST(n, false); 29 30 long long total = 0; 31 vector<pair<int,int>> forest_edges; 32 33 for (int start = 0; start < n; ++start) { 34 if (inMST[start]) continue; 35 // Initialize a set of (key, vertex) for this component 36 set<pair<long long,int>> pq; 37 key[start] = 0; 38 parent[start] = -1; 39 pq.insert({0, start}); 40 while (!pq.empty()) { 41 auto [kv, u] = *pq.begin(); 42 pq.erase(pq.begin()); 43 if (inMST[u]) continue; // may occur if key improved and old pair remained 44 inMST[u] = true; 45 if (parent[u] != -1) { 46 total += key[u]; 47 forest_edges.push_back({parent[u], u}); 48 } 49 for (auto &e : adj[u]) { 50 int v = e.to; long long w = e.w; 51 if (!inMST[v] && w < key[v]) { 52 // decrease-key: remove old pair if present, insert new one 53 if (key[v] != INF) pq.erase({key[v], v}); 54 key[v] = w; 55 parent[v] = u; 56 pq.insert({key[v], v}); 57 } 58 } 59 } 60 } 61 62 cout << "Minimum Spanning Forest total weight = " << total << "\n"; 63 cout << "Edges (u v):\n"; 64 for (auto &e : forest_edges) cout << e.first << ' ' << e.second << '\n'; 65 return 0; 66 } 67
We simulate decrease-key with std::set by erasing the old (key[v], v) and inserting the improved one. Restarting from every unvisited vertex yields a minimum spanning forest when the graph is disconnected. This approach provides clean O(E log V) behavior and directly outputs the edges and total weight across all components.