⚙️ Algorithm · Intermediate

Strongly Connected Components

Key Points

  • Strongly Connected Components (SCCs) partition a directed graph into maximal groups where every vertex can reach every other vertex in the group.
  • You can find SCCs in linear time O(V + E) using Tarjan’s algorithm (one DFS with low-link values) or Kosaraju’s algorithm (two DFS passes).
  • Compressing SCCs into single nodes forms the condensation graph, which is always a Directed Acyclic Graph (DAG).
  • SCCs power many applications such as 2-SAT, cycle detection, program analysis, and graph compression for DP on DAGs.
  • Tarjan’s algorithm uses discovery times and a stack to detect component roots when low-link equals discovery time.
  • Kosaraju’s algorithm uses finishing times on the original graph and DFS on the transpose graph in reverse finishing order.
  • To make a graph strongly connected with the minimum number of added edges, add max(#sources, #sinks) edges, counted over the condensation DAG, unless there is only one SCC (then no edges are needed).
  • In practice, SCC IDs let you shrink complex graphs and run faster algorithms like topological sort and DP on the compressed DAG.

Prerequisites

  • Directed graphs and adjacency lists: SCCs are defined on directed graphs; efficient implementations use adjacency lists.
  • Depth-First Search (DFS): Both Tarjan’s and Kosaraju’s algorithms rely on DFS traversals.
  • Stacks and recursion: Tarjan uses an explicit stack and often recursion; understanding call stacks helps with low-link logic.
  • Big-O notation: To reason about the linear-time O(V + E) performance and memory usage.
  • Graph transpose: Kosaraju’s algorithm requires reversing edges to run the second DFS pass.
  • Equivalence relations: SCCs form equivalence classes under mutual reachability; this underpins correctness proofs.
  • Topological sorting: The condensation is a DAG; topological order is useful after SCC compression.
  • Logical implications (for 2-SAT): Translating clauses to implications is essential for SCC-based satisfiability checking.
  • Cycle concepts in directed graphs: SCCs capture cycles; distinguishing back edges and cycle structures is important.
  • Dynamic programming on DAGs: After compressing SCCs, many problems reduce to DP over the condensation DAG.

Detailed Explanation


01 Overview

Strongly Connected Components (SCCs) are fundamental structures in directed graphs. An SCC is a maximal set of vertices such that for any two vertices u and v in the set, there is a directed path from u to v and from v to u. SCCs partition the vertex set of a directed graph into disjoint blocks that capture its cyclic structure. Two classic linear-time algorithms are used to compute SCCs: Tarjan’s algorithm and Kosaraju’s algorithm. Tarjan’s algorithm performs a single depth-first search (DFS) and maintains two arrays—discovery index and low-link—to detect when a vertex is the root of an SCC. Kosaraju’s algorithm performs two DFS passes: one to compute a finishing-time order on the original graph and a second on the transpose (reversed edges) in reverse finishing-time order. Once SCCs are identified, we can build the condensation graph by contracting each SCC into a single vertex and merging parallel edges; the result is always a Directed Acyclic Graph (DAG). This compression often simplifies problems: many tasks become easier on a DAG, such as topological sorting, dynamic programming, and counting source/sink components. SCCs are particularly important in competitive programming and systems like compilers, static analyzers, and network analysis.

02 Intuition & Analogies

Imagine a city with one-way streets. Pick a block of neighborhoods where, no matter which intersection you start from, you can drive along one-way streets to reach any other intersection in that block and also return. That block is an SCC: a tightly knit neighborhood of mutual reachability. Outside that block, you might be able to drive out, but not necessarily come back; that means different blocks are connected in a one-way pattern. If you shrink each block to a single super-intersection, the city map becomes a structure with no cycles—like a hillside with only downhill roads—this is the condensation DAG. Now, how do we find those blocks quickly? Tarjan’s algorithm is like a careful exploration where you stamp each intersection with the time you first see it and keep track of the earliest-stamped place you can reach by going forward along streets and possibly taking a single back road (edge) to an ancestor you’ve already seen. If an intersection can’t reach an earlier one than itself within the current search, it’s the start of a new tightly knit block; you peel off all intersections on your exploration stack until you get back to it. Kosaraju’s method uses a different trick: first, drive through the city and record the order in which your trips “finish”—the last time you leave each intersection. Then flip all streets (reverse their direction). If you now start exploring in the reverse of the finishing order, every time you start you’ll collect exactly one tight block before you hit a dead end, because the reversed streets align your search to stay inside one neighborhood at a time. Both viewpoints exploit how cycles trap reachability locally while the between-block structure is acyclic.

03 Formal Definition

Let G = (V, E) be a directed graph with vertex set V and edge set E ⊆ V × V. For u, v ∈ V, say u reaches v if there exists a directed path from u to v. Define a relation ∼ on V by u ∼ v if and only if u reaches v and v reaches u. The relation ∼ is an equivalence relation (reflexive, symmetric, transitive), and its equivalence classes are the Strongly Connected Components (SCCs) of G. SCCs form a partition of V. The condensation graph G^SCC is obtained by contracting each SCC into a single vertex and introducing a directed edge between two SCCs if there exists at least one edge between their members in G. Formally, if C₁ and C₂ are distinct SCCs and there exists (u, v) ∈ E with u ∈ C₁ and v ∈ C₂, then there is an edge (C₁, C₂) in G^SCC. A fundamental property is that G^SCC is a Directed Acyclic Graph (DAG). Tarjan’s algorithm assigns each vertex u an index idx(u) in order of first discovery during a DFS and computes a low-link value low(u), the smallest index reachable from u by following zero or more tree edges and at most one back edge to a vertex currently on the recursion stack. When low(u) = idx(u), u is the root of an SCC; all vertices on the stack from the top down to u form one SCC. Kosaraju’s algorithm runs a DFS on G to compute finishing times, then runs DFS on Gᵀ (the transpose graph) in the order of decreasing finishing times; each DFS tree in the second pass yields one SCC.
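
To make the definition concrete, here is a minimal sketch (not one of the linear-time algorithms) that groups the vertices of a small hard-coded graph into SCCs by brute force: it computes reachability with a Floyd–Warshall-style transitive closure and puts u and v in the same component exactly when each reaches the other. The 6-vertex example graph and all variable names are illustrative choices, not taken from the text above.

#include <bits/stdc++.h>
using namespace std;

int main() {
    // Tiny example graph: 0->1, 1->2, 2->0 form one cycle (SCC {0,1,2});
    // 3->4, 4->3 form another cycle (SCC {3,4}); vertex 5 is isolated.
    int n = 6;
    vector<pair<int,int>> edges = {{0,1},{1,2},{2,0},{2,3},{3,4},{4,3}};

    // reach[u][v] = 1 if there is a directed path from u to v (every vertex reaches itself).
    vector<vector<char>> reach(n, vector<char>(n, 0));
    for (int u = 0; u < n; ++u) reach[u][u] = 1;
    for (auto [u, v] : edges) reach[u][v] = 1;
    for (int k = 0; k < n; ++k)              // transitive closure, O(V^3)
        for (int u = 0; u < n; ++u)
            for (int v = 0; v < n; ++v)
                if (reach[u][k] && reach[k][v]) reach[u][v] = 1;

    // u and v share an SCC iff they reach each other (mutual reachability).
    vector<int> comp(n, -1);
    int cc = 0;
    for (int u = 0; u < n; ++u) {
        if (comp[u] != -1) continue;
        comp[u] = cc;
        for (int v = u + 1; v < n; ++v)
            if (comp[v] == -1 && reach[u][v] && reach[v][u]) comp[v] = cc;
        ++cc;
    }

    for (int u = 0; u < n; ++u)
        cout << "vertex " << u << " -> SCC " << comp[u] << "\n";
    // Expected grouping: {0,1,2}, {3,4}, {5}.
}

This O(V^3) closure is only meant to mirror the definition; the linear-time algorithms below achieve the same partition far more efficiently.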

04 When to Use

Use SCC decomposition whenever mutual reachability or cyclic structure matters in a directed graph. Common scenarios include:

  • 2-SAT: Convert each clause into implications, build the implication graph, and check that no variable and its negation belong to the same SCC; SCC topological order also builds a satisfying assignment.
  • Graph compression: Replace each SCC by a single node to get a DAG, enabling topological sorting and DP to compute properties such as longest paths, number of ways, or reachability summaries (see the sketch after this list).
  • Cycle detection and component analysis: Determine which parts of a graph are cyclic versus acyclic; analyze feedback loops in dependency graphs or control-flow graphs.
  • Minimum edges to strongly connect: Count source and sink SCCs in the condensation DAG; when there are k > 1 SCCs, the minimal number of edges needed to make the whole graph strongly connected is max(#sources, #sinks).
  • Network reliability and modularity: Identify tightly coupled modules in software call graphs or social networks; compression aids visualization and queries at scale.
  • Preprocessing for queries: Precompute SCC IDs to answer reachability or “same strongly connected region?” queries quickly, especially when handling multiple queries.
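
As a sketch of the graph-compression idea referenced above: assuming SCC IDs comp[0..n-1] have already been computed (for example with the Tarjan code later in this section), the snippet below builds the condensation’s adjacency structure, drops intra-SCC and duplicate edges, and runs a longest-path DP over the DAG using Kahn’s algorithm. The toy graph, the hard-coded comp array, and helper names such as longest_path_on_condensation are assumptions for illustration.

#include <bits/stdc++.h>
using namespace std;

// Longest path (counted in edges) over the condensation DAG,
// given precomputed SCC ids comp[0..n-1] in [0..k-1].
int longest_path_on_condensation(int n, int k,
                                 const vector<pair<int,int>>& edges,
                                 const vector<int>& comp) {
    // Build condensation adjacency, dropping intra-SCC edges and duplicates.
    vector<set<int>> adj(k);
    vector<int> indeg(k, 0);
    for (auto [u, v] : edges) {
        int cu = comp[u], cv = comp[v];
        if (cu != cv && adj[cu].insert(cv).second) indeg[cv]++;
    }
    // Kahn's algorithm: process SCC nodes in topological order, relaxing dp.
    vector<int> dp(k, 0);
    queue<int> q;
    for (int c = 0; c < k; ++c) if (indeg[c] == 0) q.push(c);
    int best = 0;
    while (!q.empty()) {
        int c = q.front(); q.pop();
        best = max(best, dp[c]);
        for (int d : adj[c]) {
            dp[d] = max(dp[d], dp[c] + 1);
            if (--indeg[d] == 0) q.push(d);
        }
    }
    return best;
}

int main() {
    // Same toy graph as in the earlier sketch: SCCs {0,1,2} -> id 0, {3,4} -> id 1, {5} -> id 2.
    vector<pair<int,int>> edges = {{0,1},{1,2},{2,0},{2,3},{3,4},{4,3}};
    vector<int> comp = {0, 0, 0, 1, 1, 2};
    cout << longest_path_on_condensation(6, 3, edges, comp) << "\n"; // prints 1 (one condensation edge)
}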

⚠️Common Mistakes

  • Ignoring direction: Treating a directed graph like an undirected one breaks SCC logic; ensure edges are added with the correct orientation and that the transpose in Kosaraju truly reverses them.
  • Forgetting on-stack checks in Tarjan: low-link updates must consider only neighbors that are still on the DFS stack; updating low with any visited neighbor yields incorrect merges.
  • Mixing discovery vs. finishing times: Kosaraju depends on finishing times from the first pass; pushing a node on first visit instead of on finish leads to wrong order.
  • Not resetting globals: In multi-test cases, clear adjacency lists, indices, low arrays, and stacks; stale data silently corrupts results.
  • Recursion depth limits: Naive recursive DFS may stack-overflow on large graphs (e.g., n ≈ 2e5). Consider iterative DFS (see the sketch after this list) or increase the stack size (platform-dependent).
  • Building the transpose incorrectly: For Kosaraju, you need the exact edge reversal; omitting parallel edges is fine but missing any reverse edge changes components.
  • Mishandling self-loops and multi-edges: Self-loops mean a single vertex is strongly connected (to itself), which is okay; multi-edges don’t harm SCC correctness, but be consistent when building condensation.
  • Off-by-one indexing: Mixing 0-based and 1-based vertex IDs can cause out-of-bounds or unvisited nodes.
  • Forgetting to pop entire SCC: In Tarjan, after detecting a root (low == idx), pop until the root itself is popped; partial pops fragment components.
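
To address the recursion-depth pitfall above, here is a minimal sketch of a fully iterative Kosaraju implementation using explicit stacks for both passes; it assumes the same 0-based edge-list input as the recursive Kosaraju example later in this section, and the names order and comp mirror that example.

#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m;
    if (!(cin >> n >> m)) return 0;
    vector<vector<int>> g(n), gt(n);
    for (int i = 0; i < m; ++i) {
        int u, v; cin >> u >> v;           // 0-based edges
        g[u].push_back(v);
        gt[v].push_back(u);
    }

    // Pass 1: iterative DFS on g, recording vertices in order of finishing time.
    vector<int> vis(n, 0), order;
    vector<pair<int,int>> st;              // (vertex, index of next neighbor to try)
    for (int s = 0; s < n; ++s) {
        if (vis[s]) continue;
        st.push_back({s, 0});
        vis[s] = 1;
        while (!st.empty()) {
            auto& [u, i] = st.back();
            if (i < (int)g[u].size()) {
                int v = g[u][i++];
                if (!vis[v]) { vis[v] = 1; st.push_back({v, 0}); }
            } else {
                order.push_back(u);        // all neighbors done: u finishes
                st.pop_back();
            }
        }
    }

    // Pass 2: iterative DFS on the transpose in reverse finishing order.
    vector<int> comp(n, -1);
    int cid = 0;
    vector<int> stk;
    for (int i = n - 1; i >= 0; --i) {
        int s = order[i];
        if (comp[s] != -1) continue;
        comp[s] = cid;
        stk.push_back(s);
        while (!stk.empty()) {
            int u = stk.back(); stk.pop_back();
            for (int v : gt[u])
                if (comp[v] == -1) { comp[v] = cid; stk.push_back(v); }
        }
        ++cid;
    }

    cout << cid << "\n";
    for (int u = 0; u < n; ++u) cout << comp[u] << (u + 1 == n ? '\n' : ' ');
}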

Key Formulas

Linear-time SCC computation

Time(Tarjan) = Time(Kosaraju) = O(V + E)

Explanation: Both Tarjan’s and Kosaraju’s algorithms visit each vertex and edge a constant number of times. This means the total running time grows linearly with the size of the graph.

Strong connectivity relation

u ∼ v  ⟺  u reaches v and v reaches u

Explanation: Two vertices are in the same SCC if and only if each can reach the other via directed paths. This defines an equivalence relation whose classes are the SCCs.

Condensation graph

G^SCC = (V^SCC, E^SCC), where V^SCC = { SCCs of G } and E^SCC = { (Cᵢ, Cⱼ) : Cᵢ ≠ Cⱼ and ∃ (u, v) ∈ E with u ∈ Cᵢ, v ∈ Cⱼ }

Explanation: Each SCC becomes a node. There is an edge between two SCCs if any edge connects their members in the original graph. This graph is always acyclic.

Tarjan low-link

low(u) = min( idx(u), min{ low(v) : (u, v) is a DFS tree edge }, min{ idx(w) : (u, w) ∈ E and w is on the stack } )

Explanation: The low-link of u is the smallest discovery index reachable from u using zero or more tree edges and at most one back edge to a vertex still on the stack. It detects when u starts a new SCC (when low(u) equals idx(u)).

Kosaraju finishing-time property

If there is an edge from SCC C₁ to SCC C₂ in G, then max finish(C₁) > max finish(C₂)

Explanation: In the original graph, SCCs with edges to others finish later in the first DFS. Processing vertices in decreasing finishing time on the transpose isolates SCCs correctly.

Transpose edges

Gᵀ = (V, Eᵀ), with Eᵀ = { (v, u) : (u, v) ∈ E }

Explanation: The transpose reverses all edges. Running DFS on the transpose in a particular order is the key step in Kosaraju’s algorithm.

Minimum edges to strongly connect

edges needed = max(S, T) if k > 1, and 0 if k = 1

Explanation: Let S be the number of SCCs in the condensation DAG with indegree 0 and T the number with outdegree 0. If k > 1, the minimal number of edges to add to make the whole graph strongly connected is max(S, T).

Size bound of condensation

|V^SCC| ≤ |V| and |E^SCC| ≤ |E|

Explanation: SCC compression never increases the number of vertices or edges. Often it dramatically reduces them, enabling faster algorithms on the condensed DAG.

Partition by SCCs

V = C₁ ∪ C₂ ∪ … ∪ C_k, with Cᵢ ∩ Cⱼ = ∅ for i ≠ j

Explanation: SCCs form a partition of the vertex set: they are disjoint and cover all vertices. This ensures every vertex belongs to exactly one SCC.

Complexity Analysis

Both Tarjan’s and Kosaraju’s algorithms run in linear time O(V + E) and require linear space. In Tarjan’s algorithm, each vertex is pushed to and popped from the stack exactly once, and each directed edge is examined a constant number of times during DFS. The arrays for discovery indices, low-link values, stack membership, and the recursion (or explicit) stack together take O(V) space, while the adjacency list representation of the graph takes O(V + E) space. Kosaraju’s algorithm performs two full DFS passes: the first computes finishing times on the original graph, and the second finds SCCs on the transpose graph. Each pass touches every vertex and edge at most once, so the time is still O(V + E). However, Kosaraju needs to either store the transpose graph explicitly (adding O(V + E) space) or generate it on the fly (trading speed for memory). The space cost includes adjacency lists for both G and Gᵀ (if stored), plus an order vector of size O(V). In practice, constants differ slightly: Tarjan avoids building the transpose and uses one pass, which is cache-friendly and often faster. But Tarjan’s recursive version can hit recursion-depth limits for very deep graphs, so an iterative implementation or stack-size adjustments may be necessary. With careful implementation and fast I/O, both methods comfortably handle graphs with up to a few million edges in competitive programming constraints.

Code Examples

Tarjan’s Algorithm (Single-DFS, Low-Link) to Compute SCC IDs
#include <bits/stdc++.h>
using namespace std;

struct TarjanSCC {
    int n;
    vector<vector<int>> g;
    vector<int> idx, low, comp, st;
    vector<char> onstack;
    int timer = 0, comp_cnt = 0;

    TarjanSCC(int n): n(n), g(n), idx(n, -1), low(n, -1), comp(n, -1), onstack(n, 0) {}

    void add_edge(int u, int v) { g[u].push_back(v); }

    void dfs(int u) {
        idx[u] = low[u] = timer++;
        st.push_back(u);
        onstack[u] = 1;
        for (int v : g[u]) {
            if (idx[v] == -1) {           // tree edge
                dfs(v);
                low[u] = min(low[u], low[v]);
            } else if (onstack[v]) {      // back edge to a node on stack
                low[u] = min(low[u], idx[v]);
            }
        }
        // If u is a root of an SCC, pop the stack up to u
        if (low[u] == idx[u]) {
            while (true) {
                int v = st.back(); st.pop_back();
                onstack[v] = 0;
                comp[v] = comp_cnt;       // assign component id
                if (v == u) break;
            }
            comp_cnt++;
        }
    }

    // Returns number of components and fills comp[u] with [0..comp_cnt-1]
    int run() {
        for (int u = 0; u < n; ++u) if (idx[u] == -1) dfs(u);
        return comp_cnt;
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m; // vertices 0..n-1
    if (!(cin >> n >> m)) return 0;
    TarjanSCC scc(n);
    for (int i = 0; i < m; ++i) {
        int u, v; cin >> u >> v; // assume 0-based input
        scc.add_edge(u, v);
    }
    int k = scc.run();

    cout << k << "\n";
    for (int u = 0; u < n; ++u) cout << scc.comp[u] << (u+1==n?'\n':' ');
}

This program computes SCCs using Tarjan’s algorithm. It assigns each node a discovery index and maintains a low-link value that records the smallest discovery index reachable from the node using tree edges and at most one back edge to a node on the stack. When low[u] equals idx[u], u starts a new SCC; the algorithm pops nodes from the stack until u is removed, assigning them the same component ID. The result is a partition of the graph into SCCs in O(V + E) time.

Time: O(V + E) · Space: O(V + E)
Kosaraju’s Algorithm (Two DFS Passes) to Compute SCC IDs
#include <bits/stdc++.h>
using namespace std;

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m; // 0..n-1
    if (!(cin >> n >> m)) return 0;
    vector<vector<int>> g(n), gt(n);
    for (int i = 0; i < m; ++i) {
        int u, v; cin >> u >> v; // 0-based
        g[u].push_back(v);
        gt[v].push_back(u); // transpose edge
    }

    vector<int> vis(n, 0), order; order.reserve(n);
    // First pass: compute finishing order on original graph
    function<void(int)> dfs1 = [&](int u) {
        vis[u] = 1;
        for (int v : g[u]) if (!vis[v]) dfs1(v);
        order.push_back(u); // push on finish
    };
    for (int u = 0; u < n; ++u) if (!vis[u]) dfs1(u);

    // Second pass: explore transpose in reverse finishing order
    vector<int> comp(n, -1);
    int cid = 0;
    function<void(int)> dfs2 = [&](int u) {
        comp[u] = cid;
        for (int v : gt[u]) if (comp[v] == -1) dfs2(v);
    };

    for (int i = n - 1; i >= 0; --i) {
        int u = order[i];
        if (comp[u] == -1) {
            dfs2(u);
            ++cid;
        }
    }

    cout << cid << "\n";
    for (int u = 0; u < n; ++u) cout << comp[u] << (u+1==n?'\n':' ');
    return 0;
}

This implementation builds both the original graph and its transpose. The first DFS stores vertices by finishing time. The second DFS iterates vertices in reverse finishing order but traverses the transpose graph; each DFS tree in this pass yields one SCC. The comp array stores SCC IDs from 0 to cid−1.

Time: O(V + E) · Space: O(V + E)
Build Condensation DAG and Compute Minimum Edges to Strongly Connect
#include <bits/stdc++.h>
using namespace std;

// Reuse Tarjan to get SCCs, then build condensation degrees and count sources/sinks.
struct TarjanSCC {
    int n; vector<vector<int>> g; vector<int> idx, low, comp, st; vector<char> on;
    int timer = 0, cc = 0;
    TarjanSCC(int n): n(n), g(n), idx(n,-1), low(n,-1), comp(n,-1), on(n,0) {}
    void add_edge(int u, int v) { g[u].push_back(v); }
    void dfs(int u) {
        idx[u] = low[u] = timer++; st.push_back(u); on[u] = 1;
        for (int v : g[u]) {
            if (idx[v] == -1) { dfs(v); low[u] = min(low[u], low[v]); }
            else if (on[v]) low[u] = min(low[u], idx[v]);
        }
        if (low[u] == idx[u]) {
            while (true) { int v = st.back(); st.pop_back(); on[v] = 0; comp[v] = cc; if (v == u) break; }
            cc++;
        }
    }
    int run() { for (int i = 0; i < n; ++i) if (idx[i] == -1) dfs(i); return cc; }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m; if (!(cin >> n >> m)) return 0;
    TarjanSCC scc(n);
    for (int i = 0; i < m; ++i) { int u, v; cin >> u >> v; scc.add_edge(u, v); }
    int k = scc.run();

    if (k == 1) { cout << 0 << "\n"; return 0; }

    vector<int> indeg(k, 0), outdeg(k, 0);
    // Build condensation degrees without storing the full DAG
    for (int u = 0; u < n; ++u) {
        for (int v : scc.g[u]) {
            int cu = scc.comp[u], cv = scc.comp[v];
            if (cu != cv) {
                outdeg[cu]++;
                indeg[cv]++;
            }
        }
    }

    int sources = 0, sinks = 0;
    for (int c = 0; c < k; ++c) { if (indeg[c] == 0) sources++; if (outdeg[c] == 0) sinks++; }

    cout << max(sources, sinks) << "\n";
    return 0;
}

After computing SCC IDs with Tarjan, we don’t need to explicitly store the condensation DAG to answer the classic question of the minimum number of edges to add to make the graph strongly connected. Counting how many SCCs have indegree 0 (sources) and outdegree 0 (sinks) in the condensed graph gives the answer: if there are k > 1 SCCs, the answer is max(sources, sinks); otherwise it is 0. For example, the graph with edges 0→1, 1→0, 1→2 has two SCCs, {0, 1} and {2}; the condensation has one source and one sink, so a single added edge (such as 2→0) suffices.

Time: O(V + E) · Space: O(V + E)
2-SAT Solver via SCC (Implication Graph + Kosaraju)
#include <bits/stdc++.h>
using namespace std;

struct SCC {
    int n; vector<vector<int>> g, gt; vector<int> comp, vis, order;
    SCC(int n): n(n), g(n), gt(n), comp(n,-1), vis(n,0) {}
    void add_edge(int u, int v) { g[u].push_back(v); gt[v].push_back(u); } // u -> v, and v -> u in transpose
    void dfs1(int u) { vis[u] = 1; for (int v : g[u]) if (!vis[v]) dfs1(v); order.push_back(u); }
    void dfs2(int u, int c) { comp[u] = c; for (int v : gt[u]) if (comp[v] == -1) dfs2(v, c); }
    int run() { for (int i = 0; i < n; ++i) if (!vis[i]) dfs1(i); int c = 0; for (int i = n-1; i >= 0; --i) { int u = order[i]; if (comp[u] == -1) dfs2(u, c++); } return c; }
};

// 2-SAT over variables x in [0..vars-1]
struct TwoSAT {
    int vars; // number of boolean variables
    SCC scc;  // implication graph with 2*vars nodes (x and not x)
    TwoSAT(int n): vars(n), scc(2*n) {}
    // Map literal to node id: x -> 2*x, ~x -> 2*x^1
    static int var(int x) { return 2*x; }
    static int neg(int id) { return id^1; }
    static int lit(int x, bool val) { return 2*x ^ (val?0:1); } // val=true => x, false => ~x

    // add implication a -> b
    void imply(int a, int b) { scc.add_edge(a, b); }

    // add clause (A or B), where A and B are literal ids
    void add_or(int A, int B) { imply(neg(A), B); imply(neg(B), A); }

    // force literal A to be true
    void add_true(int A) { imply(neg(A), A); }

    // enforce A == B, i.e., (A -> B) and (B -> A)
    void add_equiv(int A, int B) { add_or(neg(A), B); add_or(A, neg(B)); }

    // enforce A -> B
    void add_if(int A, int B) { imply(A, B); imply(neg(B), neg(A)); }

    // Solve: returns pair<is_satisfiable, assignment>
    pair<bool, vector<int>> solve() {
        scc.run();
        vector<int> assign(vars, 0);
        for (int x = 0; x < vars; ++x) {
            int a = var(x), na = neg(a);
            if (scc.comp[a] == scc.comp[na]) return {false, {}}; // contradiction
        }
        // Kosaraju comp ids follow topological order; a literal whose SCC comes
        // later in that order than its negation's SCC is set to true.
        vector<pair<int,int>> ord(2*vars);
        for (int i = 0; i < 2*vars; ++i) ord[i] = {scc.comp[i], i};
        sort(ord.rbegin(), ord.rend()); // descending comp id = reverse topological order
        vector<int> val(2*vars, -1);
        for (auto [c, id] : ord) { if (val[id] == -1) { val[id] = 1; val[neg(id)] = 0; } }
        for (int x = 0; x < vars; ++x) assign[x] = val[var(x)];
        return {true, assign};
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int nVars, m; // variables, clauses of form (a v b). Input format example:
    // First line: nVars m
    // Next m lines: ax av bx bv where literal is (x if av=1 else ~x)
    if (!(cin >> nVars >> m)) return 0;
    TwoSAT solver(nVars);
    for (int i = 0; i < m; ++i) {
        int ax, av, bx, bv; cin >> ax >> av >> bx >> bv;
        int A = TwoSAT::lit(ax, av == 1);
        int B = TwoSAT::lit(bx, bv == 1);
        solver.add_or(A, B);
    }
    auto [ok, asg] = solver.solve();
    if (!ok) { cout << "UNSAT" << '\n'; }
    else {
        cout << "SAT" << '\n';
        for (int i = 0; i < nVars; ++i) cout << asg[i] << (i+1==nVars?'\n':' ');
    }
    return 0;
}

This is a complete 2-SAT solver using SCCs. Each variable x has two nodes: x (true) and ¬x (false). A clause (A ∨ B) is encoded as (¬A → B) and (¬B → A). After running Kosaraju on the implication graph, the instance is unsatisfiable if a variable and its negation lie in the same SCC. Otherwise, the SCC topological order yields a valid assignment.
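
As a quick sanity check under the assumed input format (nVars m, then one clause per line as ax av bx bv), the tiny instance "1 1" followed by "0 1 0 1" encodes the single clause (x0 ∨ x0); the solver should report SAT with x0 = 1, since setting x0 false would violate the clause.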

Time: O(V + E), where V = 2·nVars and E is the number of implications · Space: O(V + E)