Breadth-First Search (BFS)
Key Points
- •Breadth-First Search (BFS) explores a graph level by level, visiting all vertices at distance d from the source before any at distance d+1.
- •BFS uses a queue to ensure first-in, first-out processing that exactly matches increasing distance from the source.
- •In unweighted graphs, BFS computes shortest path distances from a source to every reachable vertex.
- •The time complexity of BFS is O(V + E), and the space complexity is O(V + E) for adjacency lists and O(V) for BFS bookkeeping.
- •BFS naturally produces a shortest-path tree (parent pointers) that can reconstruct an actual shortest path.
- •Level-order traversal of a tree is a direct application of BFS.
- •Multi-source BFS finds the nearest source to each node by seeding the queue with multiple starting vertices.
- •On grids, BFS solves many maze and pathfinding problems when all moves have equal cost.
Prerequisites
- →Graphs and basic terminology — Understand vertices, edges, directed vs. undirected, and representations like adjacency lists.
- →Queues and FIFO behavior — BFS relies on a queue to maintain increasing-distance order.
- →Big-O notation — To analyze BFS time and space complexities such as O(V + E).
- →Tree basics — BFS constructs a shortest-path tree using parent pointers, and level-order traversal is BFS on trees.
- →Grid indexing and bounds checking — Essential for implementing BFS over 2D grids common in contest problems.
Detailed Explanation
Tap terms for definitions01Overview
Breadth-First Search (BFS) is a fundamental algorithm for traversing or searching graph structures. Starting from a chosen source vertex, BFS explores the graph in concentric layers: it first visits all vertices at distance 1, then all vertices at distance 2, and so on. This layered exploration is achieved by maintaining a queue, ensuring that vertices are processed in the order they were discovered. Because of this property, BFS is the go-to method for shortest paths in unweighted graphs: the first time we visit a vertex, we have found a shortest path to it in terms of number of edges. BFS also constructs a spanning tree of the reachable part of the graph (often called the BFS tree), capturing parent-child relationships that allow us to reconstruct specific shortest paths. The algorithm runs in linear time with respect to the size of the graph when using adjacency lists, specifically O(V + E), where V is the number of vertices and E is the number of edges. BFS is versatile: it is used in level-order traversal of trees, bipartite checking, finding connected components, solving pathfinding on grids when each move has equal cost, and as a building block in more advanced algorithms like Edmonds–Karp for max flow and in 0–1 BFS variants. Its simplicity, efficiency, and correctness make it a must-know tool for programming contests and real-world systems.
02Intuition & Analogies
Imagine ripples in a pond when you drop a stone. The water waves expand outward in perfect circles. If a lily pad represents the source vertex, the ripples reach nearby pads first (distance 1), then a bit farther ones (distance 2), and so on. BFS is that ripple: it expands from the source, visiting everything one step away, then two steps away, etc. The queue is the line of tasks waiting to be processed—like people queuing to board a bus. The first person to arrive (closest to the front) boards first, and newcomers join the end. This guarantees fairness and preserves the order of discovery, which is precisely how we maintain increasing distance layers. In a maze, if every corridor takes the same effort to traverse, the smartest way to find the shortest exit is to explore all locations one step away, then all two steps away, and so forth—never skipping ahead. You pin a breadcrumb (parent pointer) each time you step into a new cell so you can retrace your steps once you find the exit. For multiple fire exits (multi-source BFS), imagine starting ripples at each exit simultaneously. The place where a ripple reaches first determines the nearest exit. For trees, BFS’s level-order traversal is like inspecting a building floor by floor: you check all rooms on the first floor, then second floor, before moving to the third. This uniform expansion is the core intuition behind BFS’s correctness for shortest paths in unweighted settings.
03Formal Definition
04When to Use
Use BFS when all edges have equal weight or cost, and you need the shortest path in terms of hops. Classic situations include finding the shortest route in a maze where each move costs the same, computing minimum number of transformations in word-ladder problems, and level-order traversal in trees. BFS is also ideal for discovering connected components in an unweighted graph (by repeating BFS from unvisited nodes), checking whether a graph is bipartite by coloring levels alternately, and determining reachability. When multiple sources exist (e.g., several virus starting points or multiple portals), multi-source BFS computes the nearest source to each node efficiently by seeding the queue with all sources at distance 0. On grids, BFS handles four- or eight-direction movement with uniform step cost. In more advanced contexts, BFS underlies algorithms like Edmonds–Karp for maximum flow (where BFS finds shortest augmenting paths in terms of number of edges) and is adapted in 0–1 BFS to handle edges with costs 0 or 1 using a deque. Avoid Dijkstra’s algorithm if all weights are equal—BFS is simpler and faster in practice for that case. If edges have arbitrary nonnegative weights, prefer Dijkstra; if negative weights exist, consider Bellman–Ford or SPFA variants.
⚠️Common Mistakes
Common pitfalls include: (1) Marking nodes as visited only when dequeued, which can enqueue the same node multiple times; instead, mark visited (or set dist) when enqueuing to avoid duplicates. (2) Forgetting to initialize dist to +\infty (or a sentinel like -1), leading to incorrect comparisons and missed relaxations. (3) Using an adjacency matrix inadvertently causes O(V^2) time even on sparse graphs; prefer adjacency lists for O(V + E). (4) Not clearing or reinitializing arrays for multiple test cases in contest settings, causing stale state to pollute results. (5) Off-by-one errors in grid BFS: not checking bounds, mixing up row/column order, or forgetting to mark walls as blocked. (6) Reconstructing paths without storing parent pointers, forcing expensive recomputation; always maintain parent[v] when you set dist[v]. (7) Assuming BFS works on weighted graphs for shortest paths; BFS only guarantees shortest paths when all edges have equal weight (or when weights are 0/1 with 0–1 BFS). (8) Overflow concerns: while distances in BFS are small (\le V-1), ensure your container sizes and indices fit within int, and prefer size_t or long long for very large inputs. (9) For directed graphs, forgetting directionality and erroneously traversing edges in both directions can break correctness. (10) For disconnected graphs, assuming all vertices are reached from a single source; handle unreachable nodes by checking dist[v] remains +\infty or -1.
Key Formulas
BFS Time Complexity
Explanation: BFS enqueues each vertex at most once and inspects each edge a constant number of times. Therefore the total running time is linear in the number of vertices plus edges.
Handshake Lemma (Undirected)
Explanation: The sum of degrees over all vertices equals twice the number of edges. This identity helps bound the total number of adjacency inspections during BFS on undirected graphs.
Unweighted Shortest-Path Recurrence
Explanation: The shortest distance from s to v is one plus the minimum distance to any predecessor u with an edge to v. BFS realizes this by exploring nodes in increasing distance order.
BFS Layers
Explanation: Vertices are partitioned into layers by their distance from the source. BFS processes layers in order , , , ....
Triangle Inequality (Unweighted)
Explanation: Traversing one edge increases path length by at most one, ensuring BFS's monotonic distance updates are safe and nondecreasing along explored edges.
BFS Tree Size
Explanation: In the BFS tree T spanning the reachable vertex set from s, the number of tree edges equals the number of reachable vertices minus one.
Queue Operation Bound
Explanation: Each vertex is enqueued once and dequeued once, so the number of queue operations is linear in the number of vertices.
Space Complexity (Adjacency List)
Explanation: Storing the graph in adjacency lists takes O(|V| + |E|) space, and BFS’s auxiliary arrays (distance, parent, visited, and queue) add O(|V|).
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h>\nusing namespace std;\n\n// Perform BFS from source s on an unweighted graph.\n// Computes distance (in edges) and parent pointers for shortest-path tree.\nint main() {\n ios::sync_with_stdio(false);\n cin.tie(nullptr);\n\n int n, m;\n // Input format: n m, then m edges (0-indexed or 1-indexed as you prefer).\n // Here we assume 0-indexed vertices.\n cin >> n >> m;\n vector<vector<int>> adj(n);\n for (int i = 0; i < m; ++i) {\n int u, v;\n cin >> u >> v;\n // Undirected graph: add both directions. For directed, add only adj[u].push_back(v).\n adj[u].push_back(v);\n adj[v].push_back(u);\n }\n\n int s;\n cin >> s; // source vertex\n\n const int INF = INT_MAX;\n vector<int> dist(n, INF);\n vector<int> parent(n, -1);\n queue<int> q;\n\n // Initialize BFS\n dist[s] = 0;\n q.push(s);\n\n while (!q.empty()) {\n int u = q.front();\n q.pop();\n // Explore all neighbors of u\n for (int v : adj[u]) {\n if (dist[v] == INF) { // not visited yet\n dist[v] = dist[u] + 1;\n parent[v] = u;\n q.push(v);\n }\n }\n }\n\n // Output distances and parents\n cout << "Distances:\n";\n for (int i = 0; i < n; ++i) {\n if (dist[i] == INF) cout << i << ": unreachable\n";\n else cout << i << ": " << dist[i] << "\n";\n }\n\n cout << "Parents (BFS tree):\n";\n for (int i = 0; i < n; ++i) {\n cout << i << ": " << parent[i] << "\n";\n }\n\n return 0;\n}\n
This program builds an undirected graph, runs BFS from a source s, and computes shortest distances and parent pointers. Nodes are marked visited by setting dist[v] when they are enqueued, preventing duplicates. The parent array forms the BFS tree for path reconstruction.
1 #include <bits/stdc++.h>\nusing namespace std;\n\n// Reconstructs a shortest path in an unweighted graph using BFS parents.\nvector<int> bfs_path(int n, const vector<vector<int>>& adj, int s, int t) {\n const int INF = INT_MAX;\n vector<int> dist(n, INF), parent(n, -1);\n queue<int> q;\n dist[s] = 0;\n q.push(s);\n while (!q.empty()) {\n int u = q.front(); q.pop();\n if (u == t) break; // early exit when target found\n for (int v : adj[u]) {\n if (dist[v] == INF) {\n dist[v] = dist[u] + 1;\n parent[v] = u;\n q.push(v);\n }\n }\n }\n if (dist[t] == INF) return {}; // no path\n // Reconstruct path t -> s via parents, then reverse.\n vector<int> path;\n for (int cur = t; cur != -1; cur = parent[cur]) path.push_back(cur);\n reverse(path.begin(), path.end());\n return path;\n}\n\nint main() {\n ios::sync_with_stdio(false);\n cin.tie(nullptr);\n\n int n, m;\n cin >> n >> m;\n vector<vector<int>> adj(n);\n for (int i = 0; i < m; ++i) {\n int u, v;\n cin >> u >> v;\n // Undirected graph\n adj[u].push_back(v);\n adj[v].push_back(u);\n }\n int s, t;\n cin >> s >> t;\n\n vector<int> path = bfs_path(n, adj, s, t);\n if (path.empty()) {\n cout << "No path\n";\n } else {\n cout << "Path length (edges): " << (int)path.size() - 1 << "\n";\n cout << "Path: ";\n for (int i = 0; i < (int)path.size(); ++i) {\n if (i) cout << " ";\n cout << path[i];\n }\n cout << "\n";\n }\n return 0;\n}\n
This example finds a shortest path between s and t in an undirected, unweighted graph. It runs BFS to fill parent pointers, exits early when t is discovered, and reconstructs the path by walking back through parent[] from t to s.
1 #include <bits/stdc++.h>\nusing namespace std;\n\n// Find shortest path on an n x m grid from 'S' to 'T' avoiding '#'.\n// Moves allowed: up, down, left, right (cost 1 each).\nint main() {\n ios::sync_with_stdio(false);\n cin.tie(nullptr);\n\n int n, m;\n cin >> n >> m;\n vector<string> grid(n);\n for (int i = 0; i < n; ++i) cin >> grid[i];\n\n pair<int,int> S{-1, -1}, T{-1, -1};\n for (int i = 0; i < n; ++i) {\n for (int j = 0; j < m; ++j) {\n if (grid[i][j] == 'S') S = {i, j};\n if (grid[i][j] == 'T') T = {i, j};\n }\n }\n\n const int INF = INT_MAX;\n vector<vector<int>> dist(n, vector<int>(m, INF));\n vector<vector<pair<int,int>>> parent(n, vector<pair<int,int>>(m, {-1, -1}));\n queue<pair<int,int>> q;\n\n auto inb = [&](int r, int c){ return 0 <= r && r < n && 0 <= c && c < m; };\n const int dr[4] = {-1, 1, 0, 0};\n const int dc[4] = {0, 0, -1, 1};\n\n dist[S.first][S.second] = 0;\n q.push(S);\n\n while (!q.empty()) {\n auto [r, c] = q.front(); q.pop();\n if (make_pair(r, c) == T) break; // early exit when target reached\n for (int k = 0; k < 4; ++k) {\n int nr = r + dr[k], nc = c + dc[k];\n if (!inb(nr, nc)) continue;\n if (grid[nr][nc] == '#') continue;\n if (dist[nr][nc] == INF) {\n dist[nr][nc] = dist[r][c] + 1;\n parent[nr][nc] = {r, c};\n q.push({nr, nc});\n }\n }\n }\n\n if (dist[T.first][T.second] == INF) {\n cout << "No path\n";\n return 0;\n }\n\n // Reconstruct path from T back to S.\n vector<pair<int,int>> path;\n for (auto cur = T; cur.first != -1; cur = parent[cur.first][cur.second]) {\n path.push_back(cur);\n }\n reverse(path.begin(), path.end());\n\n cout << "Shortest distance: " << dist[T.first][T.second] << "\n";\n cout << "Path cells (r,c):\n";\n for (auto [r, c] : path) {\n cout << r << " " << c << "\n";\n }\n\n return 0;\n}\n
This program solves a maze-like grid problem: cells with '#' are walls, 'S' is start, 'T' is target. BFS explores four-direction neighbors, computes shortest distance, and reconstructs the cell-by-cell path.
1 #include <bits/stdc++.h>\nusing namespace std;\n\n// Given a graph and a set of sources, compute the distance to the nearest source for every node.\nint main() {\n ios::sync_with_stdio(false);\n cin.tie(nullptr);\n\n int n, m;\n cin >> n >> m;\n vector<vector<int>> adj(n);\n for (int i = 0; i < m; ++i) {\n int u, v;\n cin >> u >> v;\n // Undirected\n adj[u].push_back(v);\n adj[v].push_back(u);\n }\n int k;\n cin >> k;\n vector<int> sources(k);\n for (int i = 0; i < k; ++i) cin >> sources[i];\n\n const int INF = INT_MAX;\n vector<int> dist(n, INF);\n queue<int> q;\n\n // Seed all sources with distance 0.\n for (int s : sources) {\n if (dist[s] == INF) { // avoid duplicates if sources repeat\n dist[s] = 0;\n q.push(s);\n }\n }\n\n while (!q.empty()) {\n int u = q.front(); q.pop();\n for (int v : adj[u]) {\n if (dist[v] == INF) {\n dist[v] = dist[u] + 1;\n q.push(v);\n }\n }\n }\n\n // Output nearest-source distances (or -1 if unreachable)\n for (int i = 0; i < n; ++i) {\n cout << (dist[i] == INF ? -1 : dist[i]) << (i + 1 == n ? '\n' : ' ');\n }\n\n return 0;\n}\n
This code initializes the queue with multiple sources at distance 0. BFS then expands all sources simultaneously, assigning each node the distance to its nearest source.