Bipartite Matching - Kuhn's Algorithm
Key Points
- â˘Kuhnâs algorithm finds a maximum matching in a bipartite graph by repeatedly searching for augmenting paths using DFS.
- â˘Each successful DFS increases the matching size by one by flipping matched/unmatched edges along the found path.
- â˘The algorithm runs in O(VE) time and O(V + E) space, which is often fast enough for competitive programming with careful implementation.
- â˘It is equivalent to max-flow with unit capacities but is much simpler to code and debug.
- â˘Correctness follows from Bergeâs theorem: a matching is maximum if and only if there is no augmenting path.
- â˘Practical heuristics like visiting low-degree vertices first and shuffling adjacency lists can significantly speed up the search.
- â˘From a maximum matching, you can recover a minimum vertex cover in O(V + E) time via KĹnigâs theorem.
- â˘If inputs are not pre-partitioned, you can first 2-color the graph to form the bipartition, then run Kuhnâs algorithm.
Prerequisites
- âGraph representations (adjacency lists) â Kuhnâs algorithm iterates over neighbors of vertices; efficient adjacency storage is essential.
- âDepth-first search (DFS) â Augmenting paths are discovered via DFS and require careful visitation logic.
- âBipartite graphs and 2-coloring â The algorithm only applies to bipartite graphs; you must split vertices into left/right parts or verify bipartiteness.
- âMatchings and alternating paths â Understanding what makes a path augmenting explains correctness and guides implementation.
- âBig-O notation â Needed to analyze Kuhnâs O(VE) time and compare with alternatives like HopcroftâKarp.
- âQueue/BFS basics â Useful for preprocessing (bipartite check) and for extracting a minimum vertex cover after matching.
- âSet operations (symmetric difference) â Theoretical correctness uses M' = M \oplus P to explain augmentation.
- âInput normalization and indexing â Avoid off-by-one and partitioning errors by standardizing indices and sides.
Detailed Explanation
Tap terms for definitions01Overview
Bipartite matching is about pairing items from two distinct sets (left and right) so that no item is used more than once, and the number of pairs is as large as possible. Kuhnâs algorithm is a classic, lightweight method to compute a maximum matching in such graphs. It uses depth-first search (DFS) to find augmenting pathsâpaths that start at an unmatched left vertex and end at an unmatched right vertex, alternating between edges not in the current matching and edges in it. When such a path is found, the algorithm flips the status of the edges along the path (matched â unmatched), which increases the matching size by one. The algorithm proceeds by trying to match each left vertex in turn. For a given left vertex, we run a DFS that attempts to either claim a free right vertex or to re-route the current partner of that right vertex through another path, thereby freeing it. This greedy-sounding process is actually globally optimal because of a fundamental theorem: a matching is maximum if and only if there is no augmenting path. Kuhnâs algorithm is equivalent in power to computing a maximum flow in a unit-capacity network formed from the bipartite graph, but is usually much shorter to implement and has lower constant factors. Its worst-case time is O(VE), which is acceptable for many problem sizes encountered in contests. Moreover, simple heuristics can improve empirical performance. The algorithm also enables extracting related objects such as a minimum vertex cover via KĹnigâs theorem.
02Intuition & Analogies
Imagine you are assigning students (left side) to unique project topics (right side). Initially, no one has a topic. You try to give the first student one of their preferred topics. If that topic is free, greatâmatch made. If itâs taken, you ask, âCan the student who currently has this topic switch to another acceptable topic?â If yes, then you nudge that student to try to find a new topic, possibly causing a chain of reassignments, like a sliding puzzle where each move frees space for another piece. If the chain ends at a genuinely free topic, everyone in that chain successfully shifts, and your first student gets matched. This chain is an "augmenting path": it starts at an unmatched student, alternates between taking a new topic (unmatched edge) and displacing a current holder (matched edge), and ends at a free topic. Flipping who-owns-what along this path increases the total number of assigned students by exactly one. If you can no longer find any such chain for any student, then no further global improvements are possible; youâre done. Using DFS mirrors this process: for each unmatched student, you explore their topic preferences. When you find a topic thatâs taken, you recursively ask that topicâs current owner to move aside by trying to find a different topic. This recursion either uncovers a free topic somewhere downstream (success) or hits a dead end (failure). Visitation markers ensure that in one attempt you donât run in circles among the same students. Though it seems local, this strategy is globally optimal because any non-maximum configuration must have an augmenting path that you will eventually find if you explore systematically.
03Formal Definition
04When to Use
Use Kuhnâs algorithm when you need a maximum cardinality matching in an unweighted bipartite graph and want a compact implementation. Common scenarios include assigning tasks to workers (each worker performs at most one task), pairing applicants to positions based on eligibility, or selecting non-conflicting intervals/items that can be modeled as a bipartite graph. In competitive programming, it is practical for graphs where O(VE) is acceptableâoften up to around E Ⲡ2Ă10^5 and V ⲠfewĂ10^4 in well-behaved cases, though this depends heavily on structure and time limits. Prefer Kuhn over max-flow if you only need unweighted matching and value simplicity and speed of coding. It also integrates naturally with extracting a minimum vertex cover via KĹnigâs theorem, enabling solutions to problems that ask for either the matching or the cover. However, if instance sizes are very large or worst-case heavy (e.g., dense graphs with tens of thousands of nodes), consider HopcroftâKarp, which improves time to O(\sqrt{V} E). If edges carry weights (costs) and you need a minimum-cost perfect matching, Kuhn is not appropriate; use the Hungarian algorithm or min-cost max-flow. If the graph might not be bipartite, first test bipartiteness and split into partitions; if it is not bipartite, Kuhn does not apply.
â ď¸Common Mistakes
⢠Forgetting to reset or differentiate the visited array per DFS attempt. If you reuse a single boolean visited without clearing, you will incorrectly block valid paths. Use a visitation token (integer time-stamp) to avoid O(V) clears per attempt. ⢠Building the graph with the wrong orientation. Kuhnâs DFS should start from L and traverse to R along unmatched edges and from R back to L along matched edges. Store adjacency from L to R only; do not duplicate reverse edges in the adjacency list you use in DFS. ⢠Off-by-one and indexing errors when mixing 0-based and 1-based indices from input. Normalize to one convention immediately after reading. ⢠Not handling disconnected graphs or isolated vertices. The algorithm naturally supports them, but ensure you attempt DFS for every left vertex even if its degree is zero (it will simply fail quickly). ⢠Recursion depth overflow on very deep DFS chains (rare but possible on large inputs). Consider an iterative re-matching routine or compile with sufficient stack, or restructure to avoid deep recursion if constraints are extreme. ⢠Assuming it works for weighted matchings or non-bipartite graphs. Kuhnâs algorithm is only for unweighted bipartite matching. For weighted bipartite matching, use Hungarian; for general graph matching, use Edmondsâ blossom algorithm. ⢠Ignoring performance heuristics on adversarial inputs. Visiting low-degree left vertices first and shuffling adjacency lists can reduce search space and speed up runtimes in practice.
Key Formulas
Kuhn Time Complexity
Explanation: The total time is proportional to the number of vertices times the number of edges. Intuitively, each augmentation costs O(E) and there are at most O(V) augmentations.
Augmentation via Symmetric Difference
Explanation: If P is an augmenting path relative to matching M, flipping membership of edges along P yields a new matching M' of size one larger. This underpins progress in Kuhnâs algorithm.
Bergeâs Theorem
Explanation: A matching cannot be increased if and only if there is no augmenting path. Kuhnâs algorithm halts only when this condition holds, proving optimality.
Hallâs Theorem (Condition)
Explanation: A bipartite graph has a matching that saturates all vertices in S if and only if every subset S of L has at least as many neighbors as its size. This characterizes when perfect/complete matchings exist.
KĹnigâs Theorem
Explanation: In bipartite graphs, the size of a maximum matching equals the size of a minimum vertex cover. This lets us derive a minimum cover after computing a maximum matching.
Augmentation Bound
Explanation: Each augmentation increases the matching size by 1, so we can have at most as many augmentations as the size of a maximum matching, which is bounded by min(|L|,|R|).
HopcroftâKarp Complexity
Explanation: The HopcroftâKarp algorithm improves over Kuhnâs in worst-case time using BFS layers and blocking flows. Use it when inputs are very large.
Minimum Vertex Cover from Alternating Reachability
Explanation: Let Z be the set of vertices reachable from unmatched L via alternating edges (unmatched , matched ). Then removing from L and taking on R gives a minimum vertex cover.
Space Complexity
Explanation: Storing the graph in adjacency lists and bookkeeping arrays uses memory linear in vertices plus edges.
Degree Sum on Left
Explanation: Iterating neighbors of all left vertices costs time proportional to the number of edges. This is used to argue O(E) work per DFS phase.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Kuhn's algorithm implementation for maximum bipartite matching. 5 // Left side has n vertices [0..n-1], Right side has m vertices [0..m-1]. 6 // Input format (example): 7 // n m E\n 8 // E lines of edges: u v (0-indexed; 0 <= u < n, 0 <= v < m) 9 // Output: size of maximum matching and each matched pair u-v. 10 11 struct Kuhn { 12 int n, m; // sizes of left (n) and right (m) 13 vector<vector<int>> g; // g[u] = list of right vertices adjacent to left u 14 vector<int> matchR; // matchR[v] = matched left vertex for right v, or -1 15 vector<int> vis; // visitation timestamp per left vertex 16 int timer = 1; // increases per DFS attempt 17 18 Kuhn(int n, int m): n(n), m(m), g(n), matchR(m, -1), vis(n, 0) {} 19 20 void add_edge(int u, int v) { 21 // Assumes 0 <= u < n and 0 <= v < m 22 g[u].push_back(v); 23 } 24 25 bool dfs(int u) { 26 if (vis[u] == timer) return false; // already visited this attempt 27 vis[u] = timer; 28 for (int v : g[u]) { 29 // If v is free, take it; otherwise try to rematch its current partner 30 if (matchR[v] == -1 || dfs(matchR[v])) { 31 matchR[v] = u; // match v with u (augment) 32 return true; 33 } 34 } 35 return false; // no augmenting path from u 36 } 37 38 int max_matching(vector<pair<int,int>>& pairs_out) { 39 int matched = 0; 40 for (int u = 0; u < n; ++u) { 41 ++timer; // new DFS attempt for this u 42 if (dfs(u)) ++matched; 43 } 44 // Build (u, v) pairs from matchR 45 pairs_out.clear(); 46 vector<int> matchL(n, -1); 47 for (int v = 0; v < m; ++v) if (matchR[v] != -1) matchL[matchR[v]] = v; 48 for (int u = 0; u < n; ++u) if (matchL[u] != -1) pairs_out.push_back({u, matchL[u]}); 49 return matched; 50 } 51 }; 52 53 int main() { 54 ios::sync_with_stdio(false); 55 cin.tie(nullptr); 56 57 int n, m, E; 58 if (!(cin >> n >> m >> E)) return 0; 59 Kuhn solver(n, m); 60 for (int i = 0; i < E; ++i) { 61 int u, v; cin >> u >> v; 62 if (u < 0 || u >= n || v < 0 || v >= m) { 63 cerr << "Edge index out of range\n"; 64 return 0; 65 } 66 solver.add_edge(u, v); 67 } 68 69 vector<pair<int,int>> pairs; 70 int ans = solver.max_matching(pairs); 71 cout << ans << "\n"; 72 for (auto [u, v] : pairs) cout << u << " " << v << "\n"; 73 return 0; 74 } 75
This program implements Kuhnâs algorithm using a DFS-based search for augmenting paths. For each left vertex u, we try to acquire a free right vertex v or to re-route the current occupant of v, thereby freeing it and increasing the matching size. We use a timestamped visited array to avoid clearing visitation state per attempt.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct KuhnHeuristic { 5 int n, m; 6 vector<vector<int>> g; 7 vector<int> matchR, vis, order; 8 int timer = 1; 9 10 KuhnHeuristic(int n, int m): n(n), m(m), g(n), matchR(m, -1), vis(n, 0) {} 11 12 void add_edge(int u, int v) { g[u].push_back(v); } 13 14 bool dfs(int u) { 15 if (vis[u] == timer) return false; 16 vis[u] = timer; 17 for (int v : g[u]) { 18 if (matchR[v] == -1 || dfs(matchR[v])) { 19 matchR[v] = u; 20 return true; 21 } 22 } 23 return false; 24 } 25 26 int max_matching(vector<pair<int,int>>& pairs_out, unsigned seed = 712367) { 27 // Shuffle adjacency lists to avoid worst-case patterns 28 mt19937 rng(seed); 29 for (int u = 0; u < n; ++u) shuffle(g[u].begin(), g[u].end(), rng); 30 31 // Process left vertices in increasing degree order (low-degree first) 32 order.resize(n); 33 iota(order.begin(), order.end(), 0); 34 stable_sort(order.begin(), order.end(), [&](int a, int b){ return g[a].size() < g[b].size(); }); 35 36 int matched = 0; 37 for (int u : order) { 38 ++timer; 39 if (dfs(u)) ++matched; 40 } 41 pairs_out.clear(); 42 vector<int> matchL(n, -1); 43 for (int v = 0; v < m; ++v) if (matchR[v] != -1) matchL[matchR[v]] = v; 44 for (int u = 0; u < n; ++u) if (matchL[u] != -1) pairs_out.emplace_back(u, matchL[u]); 45 return matched; 46 } 47 }; 48 49 int main(){ 50 ios::sync_with_stdio(false); 51 cin.tie(nullptr); 52 53 int n, m, E; cin >> n >> m >> E; 54 KuhnHeuristic solver(n, m); 55 for (int i = 0; i < E; ++i) { 56 int u, v; cin >> u >> v; 57 solver.add_edge(u, v); 58 } 59 vector<pair<int,int>> pairs; 60 int ans = solver.max_matching(pairs); 61 cout << ans << "\n"; 62 for (auto &p : pairs) cout << p.first << " " << p.second << "\n"; 63 return 0; 64 } 65
This variation keeps Kuhnâs logic but improves empirical performance by shuffling neighbor orders and matching left vertices with smaller degrees first. These heuristics tend to reduce backtracking and speed up augmenting path discovery on adversarial or dense inputs.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Given an undirected graph (0..N-1), check bipartiteness by BFS coloring. 5 // If bipartite, build L and R partitions and run Kuhn on edges crossing L->R. 6 7 struct Kuhn { 8 int n, m; vector<vector<int>> g; vector<int> matchR, vis; int timer = 1; 9 Kuhn(int n, int m): n(n), m(m), g(n), matchR(m, -1), vis(n, 0) {} 10 void add_edge(int u, int v){ g[u].push_back(v);} 11 bool dfs(int u){ if(vis[u]==timer) return false; vis[u]=timer; for(int v:g[u]) if(matchR[v]==-1 || dfs(matchR[v])){ matchR[v]=u; return true;} return false; } 12 int max_matching(vector<pair<int,int>>& out){ int ans=0; for(int u=0;u<n;++u){ ++timer; if(dfs(u)) ++ans; } out.clear(); vector<int> matchL(n,-1); for(int v=0;v<m;++v) if(matchR[v]!=-1) matchL[matchR[v]]=v; for(int u=0;u<n;++u) if(matchL[u]!=-1) out.push_back({u,matchL[u]}); return ans; } 13 }; 14 15 int main(){ 16 ios::sync_with_stdio(false); 17 cin.tie(nullptr); 18 19 int N, M; // general undirected graph 20 cin >> N >> M; 21 vector<vector<int>> adj(N); 22 for(int i=0;i<M;++i){ int a,b; cin>>a>>b; adj[a].push_back(b); adj[b].push_back(a); } 23 24 // 2-color the graph to ensure bipartite 25 const int INF = 1e9; 26 vector<int> color(N, INF); // 0 or 1 colors 27 bool ok = true; 28 for(int s=0;s<N && ok;++s){ 29 if(color[s]!=INF) continue; 30 queue<int> q; color[s]=0; q.push(s); 31 while(!q.empty() && ok){ 32 int u=q.front(); q.pop(); 33 for(int v:adj[u]){ 34 if(color[v]==INF){ color[v]=color[u]^1; q.push(v);} 35 else if(color[v]==color[u]){ ok=false; break; } 36 } 37 } 38 } 39 40 if(!ok){ 41 cout << "Graph is not bipartite\n"; 42 return 0; 43 } 44 45 // Build L and R index maps 46 vector<int> leftId(N, -1), rightId(N, -1); 47 int nL=0, nR=0; 48 for(int u=0;u<N;++u){ if(color[u]==0) leftId[u]=nL++; else rightId[u]=nR++; } 49 50 Kuhn solver(nL, nR); 51 // Add edges from L (color 0) to R (color 1) 52 for(int u=0;u<N;++u){ if(color[u]!=0) continue; for(int v:adj[u]){ // u in L, v must be in R 53 solver.add_edge(leftId[u], rightId[v]); 54 } 55 } 56 57 vector<pair<int,int>> pairs; // in left/right indices 58 int ans = solver.max_matching(pairs); 59 60 // Map back to original vertex IDs 61 cout << ans << "\n"; 62 for(auto [lu, rv] : pairs){ 63 // find original IDs: lu is some node with color 0; rv is some with color 1 64 // Build reverse maps 65 // For simplicity, reconstruct once here 66 // (In practice, store reverse arrays during construction.) 67 } 68 69 // Build reverse maps for output 70 vector<int> revL(nL, -1), revR(nR, -1); 71 for(int u=0;u<N;++u){ if(color[u]==0) revL[leftId[u]] = u; else revR[rightId[u]] = u; } 72 for(auto [lu, rv] : pairs){ 73 cout << revL[lu] << " " << revR[rv] << "\n"; 74 } 75 return 0; 76 } 77
This program accepts a general undirected graph, checks whether it is bipartite via BFS 2-coloring, constructs the (L,R) bipartition, and then runs Kuhn on edges directed from L to R. If the graph is not bipartite, it reports failure.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct Kuhn { 5 int n, m; vector<vector<int>> g; vector<int> matchR, vis; int timer=1; 6 Kuhn(int n, int m): n(n), m(m), g(n), matchR(m, -1), vis(n, 0) {} 7 void add_edge(int u, int v){ g[u].push_back(v);} 8 bool dfs(int u){ if(vis[u]==timer) return false; vis[u]=timer; for(int v:g[u]) if(matchR[v]==-1 || dfs(matchR[v])){ matchR[v]=u; return true;} return false; } 9 int max_matching(vector<int>& matchL){ int ans=0; matchL.assign(n,-1); for(int u=0;u<n;++u){ ++timer; if(dfs(u)) ++ans; } 10 for(int v=0; v<m; ++v) if(matchR[v]!=-1) matchL[matchR[v]] = v; return ans; } 11 }; 12 13 // Given maximum matching (matchL/matchR), compute a minimum vertex cover using alternating reachability. 14 // Z_L = left vertices reachable from unmatched left via alternating paths (unmatched L->R, matched R->L) 15 // Z_R = right vertices reachable in the same search. 16 // Min vertex cover = (L \ Z_L) U (R ⊠Z_R) 17 18 int main(){ 19 ios::sync_with_stdio(false); 20 cin.tie(nullptr); 21 22 int n, m, E; cin >> n >> m >> E; 23 Kuhn solver(n, m); 24 for(int i=0;i<E;++i){ int u,v; cin>>u>>v; solver.add_edge(u,v);} 25 26 vector<int> matchL; 27 int mm = solver.max_matching(matchL); 28 cout << mm << "\n"; // size of maximum matching (optional output) 29 30 // Alternating BFS to find Z_L and Z_R 31 vector<char> visL(n, 0), visR(m, 0); 32 queue<int> q; 33 for(int u=0; u<n; ++u){ if(matchL[u] == -1){ visL[u] = 1; q.push(u); } } 34 while(!q.empty()){ 35 int u = q.front(); q.pop(); 36 // Traverse unmatched edges u->v first (from L to R if edge u-v is NOT the matched edge of u) 37 for(int v : solver.g[u]){ 38 // Edge u-v is unmatched if matchL[u] != v 39 if(!visR[v] && matchL[u] != v){ 40 visR[v] = 1; 41 // From right to left, traverse only matched edges v->u2 where matchR[v] = u2 42 int u2 = solver.matchR[v]; 43 if(u2 != -1 && !visL[u2]){ 44 visL[u2] = 1; q.push(u2); 45 } 46 } 47 } 48 } 49 50 // Minimum vertex cover = (Left \ Z_L) union (Right ⊠Z_R) 51 vector<int> coverL, coverR; 52 for(int u=0; u<n; ++u) if(!visL[u]) coverL.push_back(u); 53 for(int v=0; v<m; ++v) if(visR[v]) coverR.push_back(v); 54 55 cout << coverL.size() + coverR.size() << "\n"; 56 cout << coverL.size() << " "; 57 for(int u: coverL) cout << u << " "; 58 cout << "\n"; 59 cout << coverR.size() << " "; 60 for(int v: coverR) cout << v << " "; 61 cout << "\n"; 62 63 return 0; 64 } 65
After computing a maximum matching with Kuhnâs algorithm, we find a minimum vertex cover using KĹnigâs theorem. Start BFS from all unmatched left vertices, traverse along alternating edges (unmatched LâR, matched RâL). The minimum vertex cover is (Left â reachable_L) ⪠(Right ⊠reachable_R).