DP on Trees
Key Points
- •DP on trees is a technique that computes answers for each node by combining results from its children using a post-order DFS.
- •Rooting the tree (choosing a parent for every node) turns a general tree into a directed structure that is easy to process bottom-up.
- •Classic problems include tree diameter, maximum independent set, counting independent sets, colorings, and sum of distances.
- •Transitions often come in two flavors: subtree DP (downward only) and rerooting DP (propagate information upward and sideways).
- •Most subtree DP solutions run in O(n) time because each edge is visited a constant number of times.
- •Designing the right DP state is the key step; small state vectors like dp[v][0/1] are common and powerful.
- •Be careful with recursion depth, modulo arithmetic, and parent-child handling to avoid stack overflow or double counting.
- •Rerooting uses prefix/suffix products to recompute contributions efficiently for every root in O(n) time.
Prerequisites
- →Depth-first search (DFS) — Tree DP relies on post-order DFS to ensure children are processed before their parent.
- →Trees and graph basics — Understanding acyclicity, degrees, and adjacency lists is essential for modeling and traversal.
- →Big-O notation — Analyzing time and space complexity of DP transitions requires asymptotic reasoning.
- →Recursion and stack behavior — Most implementations use recursive DFS; knowing recursion limits helps avoid runtime errors.
- →Modular arithmetic — Counting problems often require working under a modulus to prevent overflow.
- →Dynamic programming fundamentals — Defining states, transitions, and base cases is the core of any DP approach.
Detailed Explanation
Tap terms for definitions01Overview
Dynamic programming (DP) on trees is a method to solve problems on tree structures by breaking them into subproblems on subtrees. The idea is to root the tree at an arbitrary node (e.g., 1), run a depth-first search (DFS) so children are processed before their parent (post-order), and compute a DP value for each node from its children’s DP values. This strategy works because trees have no cycles, which means any path from a child to the rest of the tree goes through its parent. As a result, subtrees are naturally independent once the parent’s choice is fixed. DP on trees handles a variety of tasks: computing the tree’s diameter, finding the maximum independent set, counting configurations (like independent sets or colorings), and even calculating values for every node as if it were the root via rerooting. Most subtree DP solutions are linear time, O(n), since we traverse the tree once or twice, and each edge is considered a constant number of times. The main challenge is defining the right state and transitions. Often, a node’s DP value is a combination (sum, product, max) of its children’s values. When parent information is needed, we either expand the state to include the parent’s choice or use a second pass (rerooting) to propagate information from parent to child efficiently.
02Intuition & Analogies
Imagine a company hierarchy (a tree): a CEO at the top and employees below, each with their own teams. If you want to decide something for the CEO (like the maximum number of employees you can invite to a meeting with a 'no direct manager-employee pair' rule), it makes sense first to decide optimally for each manager’s team. Once every team is decided, the CEO’s decision is just a combination of those team decisions. That’s bottom-up thinking: children first, then parent. Trees are perfect for this because there are no cycles—every employee has exactly one path up to the CEO. In DP on trees, each node’s answer depends on its children’s answers through a simple rule (take the best, sum the counts, multiply the possibilities, etc.). Sometimes, however, what happens above a node matters too (e.g., how far is everything else if this node were the root?). For that, we do rerooting: compute a global picture from one root, then efficiently shift the root along edges, cleverly updating the answer using what we already know—like recalculating the company’s total commute time if you move HQ from manager A to manager B using a quick correction rather than recomputing from scratch. Prefix/suffix tricks help combine all siblings’ contributions except one child in constant time per edge, keeping the whole process linear.
03Formal Definition
04When to Use
Use subtree DP when the value at a node depends only on the independent results of its children, such as maximum independent set, minimum vertex cover, counting independent sets, or computing tree diameter/heights. Opt for DP with small state vectors (like two states per node) when there are binary choices (include/exclude, colored/not colored) that partition possibilities cleanly. Use rerooting DP when you need a value for every node as if it were the root (e.g., sum of distances from each node to all others, number of nodes in each node’s component upon removing that node, or the best score if the root were moved). Choose product-based transitions when counting combinatorial configurations across independent subtrees (e.g., colorings where siblings do not interact given the parent’s color). Prefer max/sum transitions when optimizing additive scores or path lengths (diameter, longest path). In contests (CF 1500–2000), typical prompts include: compute diameter, count independent sets modulo M, find maximum weighted independent set, compute DP for each node via rerooting (e.g., sum of distances), and configurations constrained by parent-child relations (e.g., no equal adjacent colors). When parent decisions affect child feasibility, either augment the state with a parent-related flag or perform a second pass to push the necessary context downward.
⚠️Common Mistakes
- Forgetting to remove the parent in adjacency loops, which creates infinite recursion or double counting. Always pass parent in DFS and skip it. 2) Mixing definitions: counting nodes vs. edges on path lengths; be consistent in height/diameter definitions and convert if needed. 3) Incorrect base cases: leaves often need explicit initialization (e.g., depth 0 or 1 depending on convention; count = 1 for multiplicative DPs). 4) Overflow/modulo errors: products for counting explode quickly—use long long and apply modulo after every multiplication/addition. 5) Recursion depth: for n up to 2e5, recursive DFS in C++ may stack overflow on some systems; consider iterative DFS or increase stack size. 6) Wrong DP state: use separate states for mutually exclusive choices (e.g., include/exclude), otherwise transitions will intermix incompatible cases. 7) Rerooting without prefix/suffix: recomputing merges per child leads to O(n^2); instead, precompute prefix and suffix arrays of child contributions for O(1) exclusion. 8) Not clearing globals between test cases: vectors and global variables must be resized/reset. 9) Assuming binary tree properties on general trees: do not index children as left/right; handle arbitrary degree with vectors. 10) Off-by-one in heights: decide whether height counts nodes or edges and stay consistent across formulas and code.
Key Formulas
General subtree DP
Explanation: The DP value at node v is computed by combining the DP values of its children via a problem-specific function F. Post-order traversal ensures children are processed first.
Maximum weighted independent set (tree)
Explanation: If v is included, none of its children can be included; if v is excluded, each child can be included or not, whichever is better. Summing over children yields the optimal total weight.
Tree diameter via depths
Explanation: depth[v] is the longest downward path in edges. The diameter is the maximum sum of the two largest downward depths from any node, corresponding to the longest path passing through that node.
Count of independent sets
Explanation: If v is included, no child may be; if v is excluded, a child may be included or not. Multiplication combines independent choices across subtrees.
Proper K-colorings on a tree (parent-fixed color DP)
Explanation: When v’s color is fixed, each child can choose any of the remaining K-1 colors and then color its subtree accordingly. Multiply over children, and finally multiply by K choices for the root.
Sum of distances via rerooting
Explanation: dp[v] is the sum of distances from v to nodes in its subtree. Using rerooting, we transform the answer from parent v to child u by decreasing distances to u’s subtree and increasing distances to the rest.
Time complexity (linear)
Explanation: Tree DP usually visits each edge once or twice with O(1) work per visit, leading to linear time in the number of nodes n.
Generic rerooting transition
Explanation: The upward contribution for child u comes from merging all contributions at v except u’s. Prefix/suffix aggregates allow this exclusion in O(1) per child.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Maximum Weighted Independent Set on a Tree 5 // dp[v][1]: best total weight in v's subtree if v is included 6 // dp[v][0]: best total weight in v's subtree if v is excluded 7 8 const int MAXN = 200000; 9 vector<int> g[MAXN + 5]; 10 long long w[MAXN + 5]; 11 long long dp0[MAXN + 5], dp1[MAXN + 5]; 12 13 void dfs(int v, int p) { 14 dp1[v] = w[v]; // if we include v, start with its weight 15 dp0[v] = 0; // if we exclude v, start at 0 16 for (int u : g[v]) if (u != p) { 17 dfs(u, v); 18 // If v is included, children cannot be included 19 dp1[v] += dp0[u]; 20 // If v is excluded, children can be included or not (take the better) 21 dp0[v] += max(dp0[u], dp1[u]); 22 } 23 } 24 25 int main() { 26 ios::sync_with_stdio(false); 27 cin.tie(nullptr); 28 29 int n; 30 if (!(cin >> n)) return 0; 31 for (int i = 1; i <= n; ++i) cin >> w[i]; 32 for (int i = 0; i < n - 1; ++i) { 33 int u, v; cin >> u >> v; 34 g[u].push_back(v); 35 g[v].push_back(u); 36 } 37 int root = 1; 38 dfs(root, 0); 39 cout << max(dp0[root], dp1[root]) << "\n"; 40 return 0; 41 } 42
We root the tree at 1 and do a post-order DFS. At each node, we compute two values: including the node (children must be excluded) and excluding it (each child may be included or not). The answer is the better of the two at the root.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Diameter computed as the maximum sum of top two downward depths at any node 5 6 const int MAXN = 200000; 7 vector<int> g[MAXN + 5]; 8 int n; 9 long long diameter_edges = 0; 10 11 int dfs(int v, int p) { 12 int best1 = -1; // top-1 depth (in edges) 13 int best2 = -1; // top-2 depth 14 for (int u : g[v]) if (u != p) { 15 int d = dfs(u, v); // depth from child u downwards 16 if (d > best1) { best2 = best1; best1 = d; } 17 else if (d > best2) { best2 = d; } 18 } 19 // depth from v to deepest leaf below is best1 + 1 (if any child), else 0 for leaf 20 int depth_v = (best1 == -1 ? 0 : best1 + 1); 21 // Update diameter: if two branches exist, sum them; if one branch, it is best1+1 22 diameter_edges = max(diameter_edges, 1LL * max(0, best1) + max(0, best2) + 2LL); 23 if (best1 == -1) diameter_edges = max(diameter_edges, 0LL); // single node case 24 return depth_v; 25 } 26 27 int main(){ 28 ios::sync_with_stdio(false); cin.tie(nullptr); 29 cin >> n; 30 for (int i = 0; i < n - 1; ++i) { 31 int u, v; cin >> u >> v; 32 g[u].push_back(v); g[v].push_back(u); 33 } 34 // Handle n=1 separately 35 if (n == 1) { cout << 0 << "\n"; return 0; } 36 dfs(1, 0); 37 // The formula above computed two-branch sum in terms of edges; fix exact value: 38 // Our update added +2 around best1 and best2 which started at -1; simplify by direct formula below 39 // For clarity, re-run a clean implementation: 40 41 // Clean implementation for clarity: 42 diameter_edges = 0; 43 function<int(int,int)> D = [&](int v, int p){ 44 int best1 = 0, best2 = 0; // depths in edges, 0 for leaf baseline 45 for (int u : g[v]) if (u != p) { 46 int d = D(u, v) + 1; // edge to child u 47 if (d > best1) { best2 = best1; best1 = d; } 48 else if (d > best2) { best2 = d; } 49 } 50 diameter_edges = max(diameter_edges, 1LL * (best1 + best2)); 51 return best1; 52 }; 53 D(1, 0); 54 cout << diameter_edges << "\n"; // diameter measured in edges 55 return 0; 56 } 57
We compute for each node the two largest downward depths to leaves. The tree diameter (in edges) is the maximum sum of the top two depths at any node, corresponding to the longest path that goes down two branches.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Count independent sets modulo MOD using multiplicative DP 5 // cnt0[v]: number of valid sets in v's subtree when v is excluded 6 // cnt1[v]: number of valid sets in v's subtree when v is included 7 8 const int MOD = 1'000'000'007; 9 const int MAXN = 200000; 10 vector<int> g[MAXN + 5]; 11 long long cnt0[MAXN + 5], cnt1[MAXN + 5]; 12 13 void dfs(int v, int p) { 14 cnt1[v] = 1; // include v: product of cnt0[child] 15 cnt0[v] = 1; // exclude v: product of (cnt0[child] + cnt1[child]) 16 for (int u : g[v]) if (u != p) { 17 dfs(u, v); 18 cnt1[v] = (cnt1[v] * cnt0[u]) % MOD; 19 long long ways = (cnt0[u] + cnt1[u]) % MOD; 20 cnt0[v] = (cnt0[v] * ways) % MOD; 21 } 22 } 23 24 int main(){ 25 ios::sync_with_stdio(false); cin.tie(nullptr); 26 int n; cin >> n; 27 for (int i = 0; i < n - 1; ++i) { 28 int u, v; cin >> u >> v; 29 g[u].push_back(v); g[v].push_back(u); 30 } 31 dfs(1, 0); 32 cout << (cnt0[1] + cnt1[1]) % MOD << "\n"; 33 return 0; 34 } 35
We use two states per node: include or exclude. If a node is included, children must be excluded; if excluded, each child can be independently included or excluded. Multiplying the independent choices across children yields the count.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Compute sum of distances from each node to all others using two DFS passes 5 // First pass: compute subtree sizes (sz) and dp[v] = sum of distances from v to its subtree 6 // Second pass: reroot to get ans[u] from ans[v] in O(1) per edge 7 8 const int MAXN = 200000; 9 vector<int> g[MAXN + 5]; 10 int n; 11 int sz[MAXN + 5]; 12 long long dp[MAXN + 5]; 13 long long ansv[MAXN + 5]; 14 15 void dfs1(int v, int p){ 16 sz[v] = 1; dp[v] = 0; 17 for (int u : g[v]) if (u != p) { 18 dfs1(u, v); 19 sz[v] += sz[u]; 20 dp[v] += dp[u] + sz[u]; // distance to all nodes in u-subtree increases by 1 21 } 22 } 23 24 void dfs2(int v, int p){ 25 for (int u : g[v]) if (u != p) { 26 // Move root from v to u: 27 // Distances to nodes in u-subtree decrease by 1 (there are sz[u] of them) 28 // Distances to all other nodes increase by 1 (there are n - sz[u] of them) 29 ansv[u] = ansv[v] - sz[u] + (n - sz[u]); 30 dfs2(u, v); 31 } 32 } 33 34 int main(){ 35 ios::sync_with_stdio(false); cin.tie(nullptr); 36 cin >> n; 37 for (int i = 0; i < n - 1; ++i) { 38 int u, v; cin >> u >> v; 39 g[u].push_back(v); g[v].push_back(u); 40 } 41 int root = 1; 42 dfs1(root, 0); 43 ansv[root] = dp[root]; 44 dfs2(root, 0); 45 46 for (int i = 1; i <= n; ++i) cout << ansv[i] << (i + 1 <= n ? ' ' : '\n'); 47 return 0; 48 } 49
The first DFS computes subtree sizes and the sum of distances from the root to its subtree. The second DFS reroots along edges: from v to child u, we adjust the total by subtracting sz[u] (closer) and adding n−sz[u] (farther). This gives the sum of distances for every node in linear time.