⚙️AlgorithmIntermediate

Tree Distances and Diameter

Key Points

  • Tree diameter is the longest simple path in a tree and can be found with two BFS/DFS runs.
  • To find the diameter: pick any node A, find the farthest node B from A, then find the farthest node C from B; the path B–C is the diameter.
  • Tree centers are the middle node(s) on a diameter path: one center if the diameter length is even, two centers if it is odd.
  • The radius of a tree equals the ceiling of half the diameter length.
  • All-pairs distances on a rooted tree can be answered quickly using Lowest Common Ancestor (LCA) with binary lifting.
  • The distance formula using LCA is dist(u,v) = depth[u] + depth[v] − 2 × depth[lca(u,v)].
  • Eccentricity of any node u in a tree equals max(dist(u, a), dist(u, b)) where a and b are diameter endpoints.
  • For competitive programming, preprocess LCA in O(n log n) and answer each distance query in O(log n) time.

Prerequisites

  • Graph representation (adjacency lists)To store and traverse the tree efficiently.
  • BFS and DFS fundamentalsTo perform linear-time traversals and compute distances/parents.
  • Tree propertiesUnderstanding unique simple paths and acyclicity is essential for correctness proofs.
  • Big-O notation and logarithmsTo analyze time/space complexity, especially for binary lifting (logarithmic factors).
  • Bit operations and powers of twoBinary lifting relies on 2^k jumps and bitwise checks when lifting.
  • Queue and stack usageBFS requires a queue; iterative DFS may need a stack.

Detailed Explanation

Tap terms for definitions

01Overview

Trees are graphs without cycles, and because they are minimally connected, distances in trees behave very cleanly: there is exactly one simple path between any two nodes. This makes two fundamental tasks especially elegant: finding the diameter (the longest simple path) and answering distance queries between arbitrary nodes. The classic and surprisingly simple trick to get the diameter is to run a breadth-first search (BFS) or depth-first search (DFS) from any node to find a farthest node B, then run BFS/DFS again from B to find a farthest node C; the path B–C is the diameter, and its length is the tree’s longest distance. From the diameter, we can read off the tree’s center(s): the middle node(s) on this longest path. Centers minimize the maximum distance to all other nodes (they are the best places to stand if you want to be as close as possible to everyone else). For many queries, especially in competitive programming, rooting the tree and using the Lowest Common Ancestor (LCA) allows fast distance computations using the formula dist(u, v) = depth[u] + depth[v] − 2 × depth[lca(u, v)]. With binary lifting, we preprocess ancestor jumps in O(n log n) time and answer each LCA (and thus distance) query in O(log n). These tools together solve a wide range of problems about paths, radii, centers, eccentricities, and pairwise distances in trees.

02Intuition & Analogies

Imagine a long network of roads without loops—more like tree branches than city blocks. If you start walking from any intersection A as far as you can go, you’ll end up at one of the two ends of the longest road-like stretch in the network. Call that end B. If you now start at B and walk as far as possible, you’ll reach the other end C of that longest stretch. That’s the diameter: the longest straight-shot path you can take without backtracking. The center(s) are the ideal meetup points along this longest stretch: stand in the middle so that the longest person’s walk to you is as short as it can be. If the longest stretch has an odd number of edges, there are two middle intersections; otherwise, there’s a single perfect center. For distances between any two intersections u and v, think of the unique path that goes up from u to the point where their paths meet (their Lowest Common Ancestor), then down to v. Your total steps are just the steps up plus the steps down. LCA is basically the point in the tree where the paths from u and v converge when you move towards the root, like the first common ancestor in a family tree. Binary lifting is like keeping an escalator for each node that can take you up by 1 step, 2 steps, 4 steps, 8 steps, and so on, so you can jump upwards quickly by combining powers of two. With these tools, you can answer distance queries rapidly and reason about the tree’s shape—long spines, balanced subtrees, and the best places to position facilities (centers) to minimize the worst-case travel.

03Formal Definition

Let T = (V, E) be a tree with = n and = n − 1, undirected and connected. The distance between vertices u and v, denoted dist(u, v), is the number of edges on the unique simple path between u and v (or the sum of edge weights in a weighted tree). The tree diameter D is defined as D = max_{u,v V} dist(u, v). A pair (a, b) is called a pair of diameter endpoints if dist(a, b) = D. The eccentricity of a vertex u is e(u) = dist(u, v). The tree radius R is R = min_{u V} e(u), and any vertex attaining this minimum is called a tree center. For trees, the set of centers consists of either one node or two adjacent nodes and sits at the middle of any diameter path. If we root the tree at a node r, we define depth[u] as dist(r, u). The Lowest Common Ancestor lca(u, v) is the unique node w of maximum depth that is an ancestor of both u and v in the rooted tree. Distances can be computed via lca as dist(u, v) = depth[u] + depth[v] − 2 × depth[lca(u, v)]. For unweighted trees, BFS from any node returns shortest path distances in O(n). The two-sweep method (two BFS/DFS runs) finds the diameter in O(n). With binary lifting over a rooted tree, we precompute up to n ancestors for each node to answer LCA in O( n).

04When to Use

Use the two-sweep BFS/DFS technique whenever you need the tree’s diameter or its endpoints; it is optimal and simple for both unweighted and weighted trees (use BFS/DFS for unweighted, and a DFS/Dijkstra accumulation for weighted). If a problem asks for the best placement that minimizes the maximum distance—like where to put a facility on a tree-shaped network—compute the center(s) from the diameter path and use the radius. When you need many distance queries between arbitrary pairs on a fixed tree, preprocess LCA with binary lifting: after O(n log n) preprocessing, each query is O(log n), which is excellent for up to 2e5 queries typical in competitive programming. If a problem asks for every node’s eccentricity (its farthest distance), use the property that for a tree with diameter endpoints a and b, e(u) = max(dist(u, a), dist(u, b)); two BFS runs from a and b let you compute all eccentricities in O(n). Choose BFS over DFS if you want iterative code and to avoid recursion depth issues; both give O(n) on trees. For weighted trees with positive weights, replace BFS with two Dijkstra runs or two DFS passes that accumulate weighted distances. LCA-based distances remain valid regardless of weights, as long as depth means root-to-node path length using the same weight metric.

⚠️Common Mistakes

Common pitfalls include: (1) Mixing node count and edge count distances. In trees, distance usually counts edges, so a path with L edges has L + 1 nodes; off-by-one errors here can misidentify centers. (2) Forgetting to reconstruct the actual diameter path when computing centers; you must keep parent pointers from the second BFS/DFS and walk back from the farthest node to get the path. (3) Using BFS on weighted trees; BFS is correct only for unweighted (or equally weighted) edges. For weighted trees, use Dijkstra or weighted DFS accumulation. (4) Building an LCA table with the wrong LOG size or not initializing parent[0][root] correctly; ensure LOG ≥ \lceil \log_2 n \rceil + 1 and that missing ancestors are set to the root or -1 consistently. (5) Confusing depth with distance: if your root is not a true root (any node is fine), depth is measured from that root, but the LCA distance formula still holds; just ensure depths are computed from the same root. (6) Recursion stack overflows with deep trees when using recursive DFS; prefer BFS or iterative DFS for n up to 2e5 on platforms with limited stack. (7) Not clearing or resizing adjacency lists and arrays between test cases, leading to stale data. (8) Assuming the diameter endpoints are unique; multiple pairs can realize the same diameter. The centers, however, are still the middle node(s) of any diameter path. Carefully handle the even/odd diameter length case when outputting one vs two centers.

Key Formulas

Tree Diameter

Explanation: The diameter D is the largest distance among all pairs of nodes in the tree. It captures the length of the longest simple path.

Tree Radius

Explanation: The radius is the smallest possible eccentricity over all nodes. In any tree, it equals the ceiling of half the diameter length.

Eccentricity

Explanation: The eccentricity of node u is its farthest distance to any other node. Minimizing this over u gives the tree’s center(s).

Eccentricity via Diameter Endpoints

Explanation: If a and b are diameter endpoints, the farthest node from u must be one of them. So computing two BFS arrays from a and b yields all eccentricities.

Distance via LCA

Explanation: The path from u to v goes up from u to the LCA, then down to v. The depth overlap is subtracted twice to avoid double counting.

BFS on a Tree

Explanation: On a tree with n nodes and n−1 edges, BFS runs in linear time because it visits each node and edge once.

Binary Lifting Complexity

Explanation: Precomputing 2^k ancestors for each node takes O(n log n). Each LCA query lifts nodes upward in O(log n) steps.

Centers from Diameter Path

Explanation: If the diameter path has L edges, there are L+1 nodes on it. The middle is one node when L is even and two adjacent nodes when L is odd.

Binary Lifting Transition

Explanation: The 2^k-th ancestor of v is the 2^{k−1}-th ancestor of the 2^{k−1}-th ancestor of v, enabling fast upward jumps.

Complexity Analysis

On trees with n nodes and n−1 edges, BFS and DFS run in O(n) time and O(n) space because each node and edge is visited once, and we typically store distances, parents, and a queue/stack. The two-sweep method for diameter (BFS/DFS from an arbitrary node to find a far node a, then from a to find its farthest b) therefore costs O(n) time overall and O(n) space for arrays and the adjacency list. Reconstructing the diameter path requires keeping parent pointers during the second sweep and walking back along at most O(n) nodes, still linear. Computing centers from the diameter path is O(D) where D is the diameter length, bounded by O(n). For many distance queries on a fixed rooted tree, Lowest Common Ancestor via binary lifting requires precomputing ancestors up to LOG = ⌈log2 n⌉ levels. Building up[k][v] for all k and v costs O(n log n) time and O(n log n) space. Each LCA(u, v) query is answered in O(log n) time by first lifting the deeper node to the same depth and then lifting both nodes upward until their parents match. Using the LCA to compute distances adds constant extra work, so each distance query remains O(log n). If you need every node’s eccentricity, two BFS runs from the diameter endpoints a and b compute two distance arrays (da and db), and taking e[u] = max(da[u], db[u]) for all u is O(n) time and O(n) space. For weighted trees with positive weights, replace BFS with Dijkstra (using a binary heap), which costs O(n log n) in general, though still linear in the number of edges; the overall approach and asymptotic preprocessing/query complexities for LCA-based distances remain the same as in the unweighted case (so long as depth reflects weighted distances).

Code Examples

Compute Tree Diameter via Two BFS Runs (Unweighted)
1#include <bits/stdc++.h>
2using namespace std;
3
4// BFS on an unweighted tree from a start node.
5// Fills dist and parent arrays and returns the farthest node from start.
6int bfs_far(const vector<vector<int>>& g, int start, vector<int>& dist, vector<int>& parent) {
7 int n = (int)g.size() - 1; // assuming 1-indexed nodes
8 dist.assign(n + 1, -1);
9 parent.assign(n + 1, -1);
10 queue<int> q;
11 dist[start] = 0;
12 q.push(start);
13 int far = start;
14 while (!q.empty()) {
15 int u = q.front(); q.pop();
16 if (dist[u] > dist[far]) far = u;
17 for (int v : g[u]) {
18 if (dist[v] == -1) {
19 dist[v] = dist[u] + 1; // unweighted: each edge increases distance by 1
20 parent[v] = u;
21 q.push(v);
22 }
23 }
24 }
25 return far;
26}
27
28int main() {
29 ios::sync_with_stdio(false);
30 cin.tie(nullptr);
31
32 int n; // number of nodes
33 if (!(cin >> n)) return 0;
34 vector<vector<int>> g(n + 1);
35 for (int i = 0; i < n - 1; ++i) {
36 int u, v; cin >> u >> v;
37 g[u].push_back(v);
38 g[v].push_back(u);
39 }
40
41 vector<int> dist, parent;
42 // 1) From an arbitrary node (1), find the farthest node a
43 int a = bfs_far(g, 1, dist, parent);
44 // 2) From a, find the farthest node b and distances
45 int b = bfs_far(g, a, dist, parent);
46 int diameter_length = dist[b];
47
48 cout << "Diameter length (in edges): " << diameter_length << "\n";
49 cout << "One pair of diameter endpoints: " << a << " " << b << "\n";
50
51 return 0;
52}
53

This program reads an unweighted tree, runs BFS from node 1 to find a farthest node a, then runs BFS from a to find a farthest node b. The distance from a to b is the diameter length, and (a, b) is one pair of endpoints of a longest path.

Time: O(n)Space: O(n)
Find Tree Center(s) by Reconstructing the Diameter Path
1#include <bits/stdc++.h>
2using namespace std;
3
4int bfs_far(const vector<vector<int>>& g, int start, vector<int>& dist, vector<int>& parent) {
5 int n = (int)g.size() - 1;
6 dist.assign(n + 1, -1);
7 parent.assign(n + 1, -1);
8 queue<int> q;
9 dist[start] = 0;
10 q.push(start);
11 int far = start;
12 while (!q.empty()) {
13 int u = q.front(); q.pop();
14 if (dist[u] > dist[far]) far = u;
15 for (int v : g[u]) if (dist[v] == -1) {
16 dist[v] = dist[u] + 1;
17 parent[v] = u;
18 q.push(v);
19 }
20 }
21 return far;
22}
23
24int main() {
25 ios::sync_with_stdio(false);
26 cin.tie(nullptr);
27
28 int n; cin >> n;
29 vector<vector<int>> g(n + 1);
30 for (int i = 0; i < n - 1; ++i) {
31 int u, v; cin >> u >> v;
32 g[u].push_back(v);
33 g[v].push_back(u);
34 }
35
36 vector<int> dist, parent;
37 int a = bfs_far(g, 1, dist, parent);
38 int b = bfs_far(g, a, dist, parent); // parent[] now stores the tree of BFS from a
39
40 // Reconstruct diameter path from b back to a using parent[]
41 vector<int> path;
42 for (int cur = b; cur != -1; cur = parent[cur]) path.push_back(cur);
43 // path is b -> ... -> a (reverse of the path from a to b)
44 reverse(path.begin(), path.end()); // now a -> ... -> b
45
46 int L = (int)path.size() - 1; // number of edges on the path
47 int radius = (L + 1) / 2; // ceil(L/2)
48
49 cout << "Diameter length: " << L << "\n";
50 cout << "Radius: " << radius << "\n";
51
52 if (L % 2 == 0) {
53 // single center at the middle index
54 int center = path[L / 2];
55 cout << "Number of centers: 1\n";
56 cout << "Center: " << center << "\n";
57 } else {
58 // two centers at the two middle indices
59 int c1 = path[L / 2];
60 int c2 = path[L / 2 + 1];
61 cout << "Number of centers: 2\n";
62 cout << "Centers: " << c1 << " " << c2 << "\n";
63 }
64
65 return 0;
66}
67

After finding diameter endpoints (a, b) with two BFS runs, we reconstruct the path between them using the parent array from the second BFS. The center(s) are simply the middle node(s) of this path; the radius is ceil(diameter/2).

Time: O(n)Space: O(n)
LCA with Binary Lifting and Distance Queries
1#include <bits/stdc++.h>
2using namespace std;
3
4struct LCA {
5 int n, LOG;
6 vector<vector<int>> g;
7 vector<int> depth;
8 vector<vector<int>> up; // up[k][v] = 2^k-th ancestor of v (or -1)
9
10 LCA(int n) : n(n) {
11 g.assign(n + 1, {});
12 }
13
14 void add_edge(int u, int v) {
15 g[u].push_back(v);
16 g[v].push_back(u);
17 }
18
19 void build(int root = 1) {
20 // compute LOG = ceil(log2(n)) + 1
21 LOG = 1;
22 while ((1 << LOG) <= n) ++LOG;
23 depth.assign(n + 1, -1);
24 up.assign(LOG, vector<int>(n + 1, -1));
25
26 // BFS to set depth and immediate parent up[0]
27 queue<int> q;
28 depth[root] = 0;
29 up[0][root] = -1;
30 q.push(root);
31 while (!q.empty()) {
32 int u = q.front(); q.pop();
33 for (int v : g[u]) if (v != up[0][u]) {
34 if (depth[v] == -1) {
35 depth[v] = depth[u] + 1;
36 up[0][v] = u;
37 q.push(v);
38 }
39 }
40 }
41 // Binary lifting DP
42 for (int k = 1; k < LOG; ++k) {
43 for (int v = 1; v <= n; ++v) {
44 int mid = up[k - 1][v];
45 up[k][v] = (mid == -1 ? -1 : up[k - 1][mid]);
46 }
47 }
48 }
49
50 int lift(int v, int steps) const {
51 for (int k = 0; k < LOG && v != -1; ++k) {
52 if (steps & (1 << k)) v = up[k][v];
53 }
54 return v;
55 }
56
57 int lca(int a, int b) const {
58 if (depth[a] < depth[b]) swap(a, b);
59 // Lift a to the depth of b
60 a = lift(a, depth[a] - depth[b]);
61 if (a == b) return a;
62 // Lift both up together
63 for (int k = LOG - 1; k >= 0; --k) {
64 if (up[k][a] != up[k][b]) {
65 a = up[k][a];
66 b = up[k][b];
67 }
68 }
69 // Now their parents are the LCA
70 return up[0][a];
71 }
72
73 int dist(int a, int b) const {
74 int w = lca(a, b);
75 return depth[a] + depth[b] - 2 * depth[w];
76 }
77};
78
79int main() {
80 ios::sync_with_stdio(false);
81 cin.tie(nullptr);
82
83 int n; cin >> n;
84 LCA solver(n);
85 for (int i = 0; i < n - 1; ++i) {
86 int u, v; cin >> u >> v;
87 solver.add_edge(u, v);
88 }
89 solver.build(1); // root at 1 (any node works)
90
91 int q; cin >> q;
92 while (q--) {
93 int u, v; cin >> u >> v;
94 cout << solver.dist(u, v) << "\n";
95 }
96 return 0;
97}
98

We root the tree and build a binary lifting table up[k][v] to jump 2^k ancestors at a time. LCA queries are answered by lifting the deeper node to the same depth, then lifting both until their parents match. Distances follow from the LCA depth formula.

Time: Preprocessing O(n log n), per query O(log n)Space: O(n log n)
Compute All Eccentricities Using Two BFS from Diameter Endpoints
1#include <bits/stdc++.h>
2using namespace std;
3
4int bfs_far(const vector<vector<int>>& g, int start, vector<int>& dist) {
5 int n = (int)g.size() - 1;
6 dist.assign(n + 1, -1);
7 queue<int> q;
8 dist[start] = 0;
9 q.push(start);
10 int far = start;
11 while (!q.empty()) {
12 int u = q.front(); q.pop();
13 if (dist[u] > dist[far]) far = u;
14 for (int v : g[u]) if (dist[v] == -1) {
15 dist[v] = dist[u] + 1;
16 q.push(v);
17 }
18 }
19 return far;
20}
21
22int main() {
23 ios::sync_with_stdio(false);
24 cin.tie(nullptr);
25
26 int n; cin >> n;
27 vector<vector<int>> g(n + 1);
28 for (int i = 0; i < n - 1; ++i) {
29 int u, v; cin >> u >> v;
30 g[u].push_back(v);
31 g[v].push_back(u);
32 }
33
34 vector<int> distA, distB;
35 // Find diameter endpoints a and b
36 vector<int> tmp;
37 int a = bfs_far(g, 1, tmp);
38 int b = bfs_far(g, a, distA); // distA now holds distances from a
39 bfs_far(g, b, distB); // distB holds distances from b
40
41 // Eccentricity of each node u is max(dist(u,a), dist(u,b))
42 for (int u = 1; u <= n; ++u) {
43 int ecc = max(distA[u], distB[u]);
44 cout << ecc << (u == n ? '\n' : ' ');
45 }
46
47 return 0;
48}
49

After locating diameter endpoints a and b, we perform BFS from a and from b. For each node u, the maximum of these two distances equals its eccentricity, allowing us to output all eccentricities in linear time.

Time: O(n)Space: O(n)