⚙️AlgorithmAdvanced

Virtual Tree (Auxiliary Tree)

Key Points

•
A Virtual Tree (Auxiliary Tree) compresses a large tree into a much smaller tree that contains only the k important nodes and the LCAs needed to keep them connected.
•
You build it in O(k $lo g$ k) time by sorting the important nodes by DFS entry time and inserting LCAs between consecutive nodes.
•
The resulting virtual tree has at most 2k-1 nodes and O(k) edges, so you can run tree DP on it in near-linear time in k instead of n.
•
LCA preprocessing on the original tree costs O(n $lo g$ n), after which each query over a subset can be answered efficiently with a virtual tree.
•
Distances and contributions on the original tree are preserved on the virtual tree by using edge lengths equal to depth differences (or accumulated weights for weighted trees).
•
Common tasks like total Steiner tree length, pairwise distance sums, and custom DPs over marked subsets become straightforward on the virtual tree.
•
Be careful to deduplicate marked nodes, always include necessary LCAs, and only clear data structures for nodes touched by the current virtual tree.
•
This technique is a staple for multi-query competitive programming problems on trees with subset constraints (CF 2000–2400 level).

Prerequisites

→Depth-First Search (DFS) and tree rooting — Necessary to compute tin/tout times, depths, and to define ancestor relationships clearly.
→Lowest Common Ancestor (LCA) — Core operation for inserting required junctions between marked nodes.
→Binary lifting or Euler tour + RMQ — Efficient LCA implementations that make virtual tree construction fast.
→Big-O notation — To understand the per-query cost improvements from O(n) to O(k log k).
→Tree distance (unweighted and weighted) — Virtual tree preserves distances; many DPs rely on accurate distance computations.
→Stacks and sorting — The construction algorithm is based on sorting by tin and maintaining a stack of ancestors.
→Tree DP patterns — Once VT is built, typical solutions require DP over the compressed structure.
→Set/deduplication techniques — Marked nodes must be deduplicated to avoid double-counting or structural errors.

Detailed Explanation

Tap terms for definitions

01Overview

A Virtual Tree (also called an Auxiliary Tree) is a powerful technique to handle problems on trees where each query involves only a subset of k important nodes (k ≪ n). Instead of solving the problem on the full tree of size n for each query, we compress the tree to include only the important nodes and the least number of extra nodes (their Lowest Common Ancestors, LCAs) required to keep them connected. The crucial insight is that when you sort the marked nodes by their DFS entry time and connect them appropriately using LCAs, you obtain a much smaller tree that preserves the connectivity and path structure among the important nodes. This compressed structure is the virtual tree. Building it takes O(k log k) time (dominated by sorting), and the number of nodes and edges becomes O(k), enabling fast dynamic programming or greedy computations per query. Preprocessing for LCA on the original tree (via binary lifting or Euler tour + RMQ) is done once in O(n log n) time. The virtual tree is especially beneficial in multi-query settings: the amortized per-query cost drops dramatically from potentially O(n) to O(k log k) or O(k).

02Intuition & Analogies

Imagine a massive road network (the tree), but today you only care about deliveries to k specific cities (the important nodes). If you highlight just those cities and the minimal set of roads that connects them, you get a much smaller subnetwork. However, some junctions (LCAs) that are not on your delivery list still need to appear because they are the necessary intersections where paths between your cities meet. The virtual tree is exactly that: the minimal backbone connecting your cities (important nodes) through required junctions (LCAs). To construct it quickly, think of driving through the cities in the order you would naturally visit them if you always go as deep as possible before backtracking—this is the DFS order. If you sort the cities by this order, then any two consecutive cities are connected by a path that meets at their LCA. By pushing these LCAs in between, you can stitch together a neat route that never makes unnecessary detours. The stack construction mimics a courier’s notebook: you keep a stack of current junctions; when a new city lies outside the region of the top junction, you close that region and draw an edge to its parent in the notebook—this draws exactly the edges of the virtual tree. Because every city and junction is touched only a constant number of times, the total extra work stays linear after sorting the cities. Now your delivery optimization (DP) runs on this tiny notebook graph instead of the giant map.

03Formal Definition

Let T = (V, E) be a rooted tree with root r. For any subset S

\subseteq

V with

∣ S ∣

= k, define C to be the closure of S under the LCA operation on T: C is the smallest set such that S

\subseteq

C and for all u, v

\in

lca

(u, v)

\in

C. The virtual tree

T_{S}

is the unique tree whose vertex set is C and whose edges are the ancestor–descendant pairs (p, v) such that p is the closest strict ancestor of v in C (i.e., there is no node w

\in

C with p as ancestor of w and w as ancestor of v). If T is edge-weighted with weight function w: E

\to

R_{\geq 0}

, define the weight of an edge (p, v) in

T_{S}

to be the total weight along the unique path from p to v in T. If T is unweighted, this weight equals

depth

(v) -

depth

(p).

T_{S}

preserves all pairwise distances among nodes of S: for any u, v

\in

dist_{T}

(u, v) =

dist_{T_{S}}

(u, v). The size of

T_{S}

satisfies

∣ C ∣

\leq

2k - 1 and

∣ E (T_{S}) ∣

∣ C ∣

- 1. With LCA preprocessing,

T_{S}

can be constructed in O(k

lo g

k + k

\cdot

t_{

lca

}) time, where t_{

lca

} is the cost of a single LCA query.

04When to Use

Use a virtual tree when many queries each involve a small subset S of nodes and you need to compute something about paths or connectivity among S, e.g., total length of the minimal connecting subtree (Steiner tree on trees), sum of pairwise distances among marked nodes, or custom DPs where transitions depend on ancestor/descendant relations. Classic competitive programming scenarios include: (1) Processing Q queries, each with k marked nodes, where k is small relative to n; (2) Counting or minimizing some cost that accumulates along unique paths between marked nodes; (3) Weighted or unweighted trees where distance-based computations are required; (4) Situations where per-query O(n) is too slow but O(k log k) is feasible. If you already have LCA preprocessing, virtual trees allow near-linear per-query time in k. They are less useful when each query’s subset is large (k close to n) or when the problem depends on global properties unrelated to the marked subset (e.g., requires traversing the entire tree regardless of S). For single-query settings with large k, constructing a virtual tree is still often helpful because it provides a clean structure for DPs and proofs.

⚠️Common Mistakes

Forgetting to include LCAs: Sorting marked nodes by DFS order but not inserting LCAs between consecutive nodes yields a forest missing necessary junctions; paths among marked nodes are then broken. Always insert LCAs or use the stack algorithm that generates them implicitly.\n- Not deduplicating: If marked nodes contain duplicates, the stack logic or counts in DP can become incorrect. Deduplicate after sorting by tin.\n- Incorrect ancestor checks: Use the invariant tin[u] \le tin[v] \le tout[u] to test if u is an ancestor of v. Off-by-one or misordered comparisons create wrong edges.\n- Clearing whole arrays each query: Only clear data for nodes present in the current virtual tree (\le 2k-1 nodes). Clearing size-n arrays per query ruins performance. Maintain a "touched" list.\n- Depth vs. distance: In weighted trees, depth differences do not equal distances. Maintain an accumulated distance-from-root array to compute dist(u, v) correctly.\n- Stack linking bugs: When popping from the stack, connect the popped node to the new stack top (its closest ancestor in C). Forgetting to flush the stack at the end leaves edges missing.\n- Recursion limits: Deep recursion can overflow the system stack on adversarial trees. Consider iterative DFS or increase stack size if allowed.

Key Formulas

Unweighted Tree Distance

dist (u, v) = depth (u) + depth (v) - 2 depth (lca (u, v))

Explanation: The distance between two nodes in an unweighted tree is the sum of their depths minus twice the depth of their LCA. It counts how many edges you go up from u to the LCA and then down to v.

Weighted Tree Distance

DistW (u, v) = D [u] + D [v] - 2 D [lca (u, v))

Explanation: With D[x] as distance from root to x using edge weights, the weighted distance is computed by subtracting twice the root-to-LCA distance. This works because paths are unique in trees.

Virtual Tree Size Bound

∣ C ∣ \leq 2 k - 1

Explanation: The virtual tree contains at most the k marked nodes plus at most k−1 LCAs, since each insertion of an LCA reduces the number of connected components. Hence the total is bounded by 2k−1.

Build Complexity

T_{build} (k) = O (k lo g k + k t_{lca})

Explanation: Construction sorts the k nodes (O(k log k)) and performs O(k) ancestor/LCA operations; with binary lifting $t_{l} c a$ = O( $lo g$ n) and with RMQ t_ $l c a = O (1$ ).

Ancestor Test

isAncestor (u, v) ⟺ tin [u] \leq tin [v] \leq tout [u]

Explanation: Using DFS entry/exit times, u is an ancestor of v if v’s entry time is within u’s interval. This is a standard invariant from Euler tour timing.

Unweighted VT Edge Length

VT edge length (p, v) = depth (v) - depth (p)

Explanation: In an unweighted tree, the number of original edges on the path from p to v equals their depth difference. Summing these over VT edges gives the Steiner subtree size.

Pairwise Distance Sum via VT

PairsDistSum = (p, v) \in E_{V T} \sum (# S_{v}) (k - # S_{v}) ℓ (p, v)

Explanation: For each VT edge, let \# $S_{v}$ be the number of marked nodes in the child v’s component after removing (p,v), and $ℓ$ the path length from p to v. Each pair split by this edge contributes $ℓ$ to the distance sum.

Combinatorial Counting Refresher

i = 1 \sum n i = \frac{n ( n + 1 )}{2}

Explanation: Often used when converting counts of pairs to sums and vice versa (e.g., choosing pairs across components). It reminds that summing first n integers yields a quadratic expression.

Logarithm Change of Base

lo g_{a} b = \frac{lo g b}{lo g a}

Explanation: Relevant when comparing complexities like O(k $lo g$ k) with different log bases; the base only changes complexity by a constant factor.

Steiner Subtree Length on Trees

SteinerLength (S) = (p, v) \in E_{V T} \sum ℓ (p, v)

Explanation: The minimal subtree connecting S has total length equal to the sum of lengths of all edges in the virtual tree, because VT exactly decomposes the union of required paths.

Complexity Analysis

Preprocessing the original tree for LCA using binary lifting takes O(n log n) time and O(n log n) space: computing tin/tout, depths, and the up-table with LOG = ⌈log2 n⌉ ancestors per node. Alternatively, Euler tour + RMQ achieves O(n) preprocessing and O(1) LCA queries at the cost of extra memory for the RMQ structure. For each query with k marked nodes, virtual tree construction first deduplicates and sorts nodes by tin in O(k log k). You either (a) explicitly insert O(k) LCAs by checking consecutive pairs, then connect nodes via a stack in O(k), or (b) use the standard stack algorithm that computes necessary LCAs on the fly. Either approach performs O(k) LCA/ancestor checks; with binary lifting LCA, each is O(log n), giving O(k log n) for these parts. The total per-query time is therefore O(k log k + k log n). If you switch to Euler tour + RMQ LCA (O(1) per query), the per-query time improves to O(k log k + k). The size of the virtual tree is bounded by O(k), with at most 2k−1 nodes and 2k−2 edges. Any DP or aggregation over VT—such as counting marked nodes in subtrees or accumulating contributions over edges—then runs in O(k). Space per query is O(k) for the VT adjacency and auxiliary arrays; to avoid O(n) clearing costs, track the set of touched nodes and reset only those. Overall, in multi-query scenarios (Q queries), the total complexity is O(n log n + Σ(q∈Q)[

k_{q}

log

k_{q}

k_{q}

log n]), which is typically a major savings over naive per-query O(n) traversals.

Code Examples

Build a Virtual Tree and compute Steiner subtree length (unweighted)

1 #include <bits/stdc++.h>
2 using namespace std;
3 
4 /*
5   Virtual Tree (Auxiliary Tree) construction on an unweighted tree.
6   - Preprocess LCA with binary lifting.
7   - Given k marked nodes, build the virtual tree in O(k log k).
8   - Compute the total number of original edges in the minimal subtree connecting them
9     (Steiner subtree length) as sum of depth differences across VT edges.
10 */
11 
12 struct LCA {
13     int n, LOG, root;
14     vector<vector<int>> up;   // up[v][j] = 2^j-th ancestor of v
15     vector<int> depth, tin, tout;
16     vector<vector<int>> g;
17     int timer = 0;
18 
19     LCA(int n=0): n(n) {}
20 
21     void init(int _n, int _root, const vector<vector<int>>& _g) {
22         n = _n; root = _root; g = _g;
23         LOG = 1; while ((1<<LOG) <= n) ++LOG;
24         up.assign(n+1, vector<int>(LOG));
25         depth.assign(n+1, 0);
26         tin.assign(n+1, 0);
27         tout.assign(n+1, 0);
28         timer = 0;
29         dfs(root, root);
30     }
31 
32     void dfs(int v, int p) {
33         tin[v] = ++timer;
34         up[v][0] = p;
35         for (int j = 1; j < LOG; ++j) up[v][j] = up[ up[v][j-1] ][j-1];
36         for (int to : g[v]) if (to != p) {
37             depth[to] = depth[v] + 1; // unweighted: each edge has length 1
38             dfs(to, v);
39         }
40         tout[v] = timer;
41     }
42 
43     bool is_ancestor(int u, int v) const { // u is ancestor of v?
44         return tin[u] <= tin[v] && tout[v] <= tout[u];
45     }
46 
47     int lca(int u, int v) const {
48         if (is_ancestor(u, v)) return u;
49         if (is_ancestor(v, u)) return v;
50         int x = u;
51         for (int j = LOG-1; j >= 0; --j) {
52             int a = up[x][j];
53             if (!is_ancestor(a, v)) x = a;
54         }
55         return up[x][0];
56     }
57 };
58 
59 int main() {
60     ios::sync_with_stdio(false);
61     cin.tie(nullptr);
62 
63     int n; cin >> n;
64     vector<vector<int>> g(n+1);
65     for (int i = 0; i < n-1; ++i) {
66         int u, v; cin >> u >> v;
67         g[u].push_back(v);
68         g[v].push_back(u);
69     }
70 
71     int root = 1;
72     LCA solver; solver.init(n, root, g);
73 
74     int k; cin >> k;
75     vector<int> a(k);
76     for (int i = 0; i < k; ++i) cin >> a[i];
77 
78     // Step 1: sort by tin and deduplicate
79     sort(a.begin(), a.end(), [&](int x, int y){ return solver.tin[x] < solver.tin[y]; });
80     a.erase(unique(a.begin(), a.end()), a.end());
81 
82     // Step 2: insert LCAs of consecutive nodes
83     vector<int> nodes = a;
84     for (int i = 0; i+1 < (int)a.size(); ++i) nodes.push_back(solver.lca(a[i], a[i+1]));
85     sort(nodes.begin(), nodes.end(), [&](int x, int y){ return solver.tin[x] < solver.tin[y]; });
86     nodes.erase(unique(nodes.begin(), nodes.end()), nodes.end());
87 
88     // Step 3: build Virtual Tree adjacency (directed parent->child)
89     auto is_anc = [&](int u, int v){ return solver.is_ancestor(u, v); };
90     vector<vector<int>> vt(n+1); // allocate big but we'll only touch `nodes`
91 
92     vector<int> st;
93     st.reserve(nodes.size());
94     st.push_back(nodes[0]);
95     for (int i = 1; i < (int)nodes.size(); ++i) {
96         int v = nodes[i];
97         while ((int)st.size() >= 1 && !is_anc(st.back(), v)) {
98             int u = st.back(); st.pop_back();
99             // connect u to its closest ancestor now at st.back()
100             int p = st.back();
101             vt[p].push_back(u);
102         }
103         st.push_back(v);
104     }
105     while ((int)st.size() > 1) {
106         int u = st.back(); st.pop_back();
107         int p = st.back();
108         vt[p].push_back(u);
109     }
110 
111     // Step 4: Compute Steiner subtree length = sum over edges depth[v]-depth[p]
112     long long steiner_len = 0;
113     for (int p : nodes) {
114         for (int v : vt[p]) {
115             steiner_len += (long long)(solver.depth[v] - solver.depth[p]);
116         }
117     }
118 
119     cout << steiner_len << "\n";
120     return 0;
121 }
122

This program preprocesses LCA using binary lifting, then constructs the virtual tree for a set of marked nodes. It inserts necessary LCAs, sorts by tin, and uses a stack to connect each node to its closest ancestor in the virtual set. The Steiner subtree length in an unweighted tree is the sum of depth differences across virtual edges, which equals the number of original edges in the minimal connecting subtree.

Time: Preprocessing: O(n log n). Per query: O(k log k + k log n) to build (sorting + LCAs) and O(k) to sum edge lengths.Space: O(n log n) for LCA, O(k) additional for the virtual tree and touched structures.

Sum of pairwise distances among marked nodes using Virtual Tree (unweighted)

1 #include <bits/stdc++.h>
2 using namespace std;
3 
4 /*
5   Compute sum of pairwise distances among k marked nodes.
6   Key identity on Virtual Tree:
7     Sum_{pairs (x,y)} dist(x,y) = sum over VT edges (p->v): s_v * (k - s_v) * len(p,v)
8   where s_v is the number of marked nodes in v's component below the edge (p,v),
9   and len(p,v) = depth[v] - depth[p] (unweighted).
10 */
11 
12 struct LCA {
13     int n, LOG, root, timer = 0;
14     vector<vector<int>> up, g;
15     vector<int> tin, tout, depth;
16     void init(int _n, int _root, const vector<vector<int>>& _g){
17         n=_n; root=_root; g=_g;
18         LOG=1; while((1<<LOG)<=n) ++LOG;
19         up.assign(n+1, vector<int>(LOG));
20         tin.assign(n+1,0); tout.assign(n+1,0); depth.assign(n+1,0);
21         dfs(root, root);
22     }
23     void dfs(int v, int p){
24         tin[v]=++timer; up[v][0]=p;
25         for(int j=1;j<LOG;++j) up[v][j]=up[ up[v][j-1] ][j-1];
26         for(int to: g[v]) if(to!=p){ depth[to]=depth[v]+1; dfs(to,v);} 
27         tout[v]=timer;
28     }
29     bool is_ancestor(int u, int v) const { return tin[u]<=tin[v] && tout[v]<=tout[u]; }
30     int lca(int u,int v) const{
31         if(is_ancestor(u,v)) return u; if(is_ancestor(v,u)) return v;
32         int x=u; for(int j=LOG-1;j>=0;--j){int a=up[x][j]; if(!is_ancestor(a,v)) x=a;} return up[x][0];
33     }
34 };
35 
36 int main(){
37     ios::sync_with_stdio(false);
38     cin.tie(nullptr);
39 
40     int n; cin>>n; 
41     vector<vector<int>> g(n+1);
42     for(int i=0;i<n-1;++i){int u,v;cin>>u>>v; g[u].push_back(v); g[v].push_back(u);} 
43 
44     LCA lca; lca.init(n, 1, g);
45 
46     int k; cin>>k; vector<int> a(k); for(int i=0;i<k;++i) cin>>a[i];
47 
48     // sort & unique by tin
49     sort(a.begin(), a.end(), [&](int x,int y){return lca.tin[x]<lca.tin[y];});
50     a.erase(unique(a.begin(), a.end()), a.end());
51 
52     // insert LCAs of consecutive
53     vector<int> nodes=a; 
54     for(int i=0;i+1<(int)a.size();++i) nodes.push_back(lca.lca(a[i], a[i+1]));
55     sort(nodes.begin(), nodes.end(), [&](int x,int y){return lca.tin[x]<lca.tin[y];});
56     nodes.erase(unique(nodes.begin(), nodes.end()), nodes.end());
57 
58     // build VT
59     vector<vector<int>> vt(n+1);
60     vector<int> st; st.push_back(nodes[0]);
61     auto is_anc = [&](int u,int v){return lca.is_ancestor(u,v);} ;
62     for(int i=1;i<(int)nodes.size();++i){
63         int v=nodes[i];
64         while(!is_anc(st.back(), v)){
65             int u=st.back(); st.pop_back();
66             int p=st.back(); vt[p].push_back(u);
67         }
68         st.push_back(v);
69     }
70     while((int)st.size()>1){int u=st.back(); st.pop_back(); int p=st.back(); vt[p].push_back(u);}    
71 
72     // mark important nodes
73     unordered_set<int> S(a.begin(), a.end());
74 
75     // DFS over VT to compute s_v and accumulate answer
76     long long ans=0; 
77     function<int(int)> dfs = [&](int u){
78         int cnt = S.count(u) ? 1 : 0;
79         for(int v: vt[u]){
80             int sub = dfs(v);
81             long long len = (long long)(lca.depth[v]-lca.depth[u]);
82             ans += len * (long long)sub * (long long)( (int)a.size() - sub );
83             cnt += sub;
84         }
85         return cnt;
86     };
87 
88     dfs(nodes[0]);
89     cout<<ans<<"\n";
90     return 0;
91 }
92

This program computes the sum of pairwise distances among the marked nodes. After building the virtual tree, it runs a single DFS to count how many marked nodes lie in each child subtree. For each virtual edge (p→v), the number of pairs split by that edge is s_v·(k−s_v), and each such pair contributes the edge length to the total distance. Summing over edges yields the total pairwise distance.

Time: Preprocessing: O(n log n). Per query: O(k log k + k log n) for VT construction and O(k) for the DP.Space: O(n log n) for LCA tables; O(k) for VT and marked set.

Multiple queries + weighted tree: VT build and two computations (Steiner length and pairwise distances)

1 #include <bits/stdc++.h>
2 using namespace std;
3 
4 /*
5   Supports multiple queries on a weighted tree using a Virtual Tree per query.
6   - Preprocess LCA with distances-to-root D[v] (weighted depths).
7   - For each query's marked set S:
8       * Build VT.
9       * Compute Steiner subtree total weight.
10       * Compute sum of pairwise weighted distances among S.
11   Efficient clearing: only touch nodes present in VT.
12 */
13 
14 struct LCA {
15     int n, LOG, root, timer = 0;
16     vector<vector<int>> up;
17     vector<int> tin, tout, depth;
18     vector<long long> D; // distance to root (weighted)
19     vector<vector<pair<int,long long>>> g;
20 
21     void init(int _n, int _root, const vector<vector<pair<int,long long>>>& _g){
22         n=_n; root=_root; g=_g;
23         LOG=1; while((1<<LOG)<=n) ++LOG;
24         up.assign(n+1, vector<int>(LOG));
25         tin.assign(n+1,0); tout.assign(n+1,0); depth.assign(n+1,0);
26         D.assign(n+1,0);
27         dfs(root, root);
28     }
29     void dfs(int v,int p){
30         tin[v]=++timer; up[v][0]=p;
31         for(int j=1;j<LOG;++j) up[v][j]=up[ up[v][j-1] ][j-1];
32         for(auto [to,w]: g[v]) if(to!=p){
33             depth[to]=depth[v]+1; D[to]=D[v]+w; dfs(to,v);
34         }
35         tout[v]=timer;
36     }
37     bool is_ancestor(int u,int v) const {return tin[u]<=tin[v] && tout[v]<=tout[u];}
38     int lca(int u,int v) const{
39         if(is_ancestor(u,v)) return u; if(is_ancestor(v,u)) return v;
40         int x=u; for(int j=LOG-1;j>=0;--j){int a=up[x][j]; if(!is_ancestor(a,v)) x=a;} return up[x][0];
41     }
42     long long dist(int u,int v) const { return D[u]+D[v]-2LL*D[lca(u,v)]; }
43 };
44 
45 int main(){
46     ios::sync_with_stdio(false);
47     cin.tie(nullptr);
48 
49     int n; cin>>n;
50     vector<vector<pair<int,long long>>> g(n+1);
51     for(int i=0;i<n-1;++i){
52         int u,v; long long w; cin>>u>>v>>w;
53         g[u].push_back({v,w}); g[v].push_back({u,w});
54     }
55     LCA lca; lca.init(n, 1, g);
56 
57     int Q; cin>>Q;
58 
59     // We'll reuse these per query to avoid reallocation overhead
60     vector<vector<pair<int,long long>>> vt(n+1);
61     vector<int> touched; touched.reserve(2e5);
62 
63     while(Q--){
64         int k; cin>>k; vector<int> a(k);
65         for(int i=0;i<k;++i) cin>>a[i];
66         sort(a.begin(), a.end(), [&](int x,int y){return lca.tin[x]<lca.tin[y];});
67         a.erase(unique(a.begin(), a.end()), a.end());
68 
69         vector<int> nodes=a;
70         for(int i=0;i+1<(int)a.size();++i) nodes.push_back(lca.lca(a[i], a[i+1]));
71         sort(nodes.begin(), nodes.end(), [&](int x,int y){return lca.tin[x]<lca.tin[y];});
72         nodes.erase(unique(nodes.begin(), nodes.end()), nodes.end());
73 
74         // Clean only touched from previous query
75         for(int v: touched) vt[v].clear();
76         touched.clear();
77         for(int v: nodes) touched.push_back(v);
78 
79         // Build weighted VT: edge weight is dist(parent, child) in original tree
80         vector<int> st; st.push_back(nodes[0]);
81         auto is_anc = [&](int u,int v){return lca.is_ancestor(u,v);} ;
82         for(int i=1;i<(int)nodes.size();++i){
83             int v=nodes[i];
84             while(!is_anc(st.back(), v)){
85                 int u=st.back(); st.pop_back();
86                 int p=st.back(); long long w = lca.dist(p,u);
87                 vt[p].push_back({u,w});
88             }
89             st.push_back(v);
90         }
91         while((int)st.size()>1){int u=st.back(); st.pop_back(); int p=st.back(); long long w=lca.dist(p,u); vt[p].push_back({u,w});}
92 
93         // Marked set lookup
94         unordered_set<int> S(a.begin(), a.end());
95         int K = (int)S.size();
96 
97         // Compute Steiner length (weighted) and pairwise sum (weighted)
98         long long steiner_len = 0;
99         long long pair_sum = 0;
100 
101         function<int(int)> dfs = [&](int u){
102             int cnt = S.count(u) ? 1 : 0;
103             for(auto [v,w]: vt[u]){
104                 int sub = dfs(v);
105                 steiner_len += w;                      // each VT edge counted once
106                 pair_sum += w * 1LL * sub * 1LL * (K - sub); // contribution formula
107                 cnt += sub;
108             }
109             return cnt;
110         };
111         dfs(nodes[0]);
112 
113         cout << steiner_len << " " << pair_sum << "\n";
114     }
115     return 0;
116 }
117

This multi-query solution on a weighted tree constructs a virtual tree per query and computes two quantities: the total weight of the minimal subtree connecting the marked nodes and the weighted sum of pairwise distances. It cleans only the adjacency lists of touched VT nodes per query, ensuring O(k) memory reset. Weighted distances are handled via an accumulated distance-to-root array D[v].

Time: Preprocessing: O(n log n). Per query: O(k log k + k log n) to build VT and O(k) for the DP computations.Space: O(n log n) for LCA, O(k) per query for VT and bookkeeping.

1	#include <bits/stdc++.h>
2	using namespace std;
3
4	/*
5	Virtual Tree (Auxiliary Tree) construction on an unweighted tree.
6	- Preprocess LCA with binary lifting.
7	- Given k marked nodes, build the virtual tree in O(k log k).
8	- Compute the total number of original edges in the minimal subtree connecting them
9	(Steiner subtree length) as sum of depth differences across VT edges.
10	*/
11
12	struct LCA {
13	int n, LOG, root;
14	vector<vector<int>> up; // up[v][j] = 2^j-th ancestor of v
15	vector<int> depth, tin, tout;
16	vector<vector<int>> g;
17	int timer = 0;
18
19	LCA(int n=0): n(n) {}
20
21	void init(int _n, int _root, const vector<vector<int>>& _g) {
22	n = _n; root = _root; g = _g;
23	LOG = 1; while ((1<<LOG) <= n) ++LOG;
24	up.assign(n+1, vector<int>(LOG));
25	depth.assign(n+1, 0);
26	tin.assign(n+1, 0);
27	tout.assign(n+1, 0);
28	timer = 0;
29	dfs(root, root);
30	}
31
32	void dfs(int v, int p) {
33	tin[v] = ++timer;
34	up[v][0] = p;
35	for (int j = 1; j < LOG; ++j) up[v][j] = up[ up[v][j-1] ][j-1];
36	for (int to : g[v]) if (to != p) {
37	depth[to] = depth[v] + 1; // unweighted: each edge has length 1
38	dfs(to, v);
39	}
40	tout[v] = timer;
41	}
42
43	bool is_ancestor(int u, int v) const { // u is ancestor of v?
44	return tin[u] <= tin[v] && tout[v] <= tout[u];
45	}
46
47	int lca(int u, int v) const {
48	if (is_ancestor(u, v)) return u;
49	if (is_ancestor(v, u)) return v;
50	int x = u;
51	for (int j = LOG-1; j >= 0; --j) {
52	int a = up[x][j];
53	if (!is_ancestor(a, v)) x = a;
54	}
55	return up[x][0];
56	}
57	};
58
59	int main() {
60	ios::sync_with_stdio(false);
61	cin.tie(nullptr);
62
63	int n; cin >> n;
64	vector<vector<int>> g(n+1);
65	for (int i = 0; i < n-1; ++i) {
66	int u, v; cin >> u >> v;
67	g[u].push_back(v);
68	g[v].push_back(u);
69	}
70
71	int root = 1;
72	LCA solver; solver.init(n, root, g);
73
74	int k; cin >> k;
75	vector<int> a(k);
76	for (int i = 0; i < k; ++i) cin >> a[i];
77
78	// Step 1: sort by tin and deduplicate
79	sort(a.begin(), a.end(), [&](int x, int y){ return solver.tin[x] < solver.tin[y]; });
80	a.erase(unique(a.begin(), a.end()), a.end());
81
82	// Step 2: insert LCAs of consecutive nodes
83	vector<int> nodes = a;
84	for (int i = 0; i+1 < (int)a.size(); ++i) nodes.push_back(solver.lca(a[i], a[i+1]));
85	sort(nodes.begin(), nodes.end(), [&](int x, int y){ return solver.tin[x] < solver.tin[y]; });
86	nodes.erase(unique(nodes.begin(), nodes.end()), nodes.end());
87
88	// Step 3: build Virtual Tree adjacency (directed parent->child)
89	auto is_anc = [&](int u, int v){ return solver.is_ancestor(u, v); };
90	vector<vector<int>> vt(n+1); // allocate big but we'll only touch `nodes`
91
92	vector<int> st;
93	st.reserve(nodes.size());
94	st.push_back(nodes[0]);
95	for (int i = 1; i < (int)nodes.size(); ++i) {
96	int v = nodes[i];
97	while ((int)st.size() >= 1 && !is_anc(st.back(), v)) {
98	int u = st.back(); st.pop_back();
99	// connect u to its closest ancestor now at st.back()
100	int p = st.back();
101	vt[p].push_back(u);
102	}
103	st.push_back(v);
104	}
105	while ((int)st.size() > 1) {
106	int u = st.back(); st.pop_back();
107	int p = st.back();
108	vt[p].push_back(u);
109	}
110
111	// Step 4: Compute Steiner subtree length = sum over edges depth[v]-depth[p]
112	long long steiner_len = 0;
113	for (int p : nodes) {
114	for (int v : vt[p]) {
115	steiner_len += (long long)(solver.depth[v] - solver.depth[p]);
116	}
117	}
118
119	cout << steiner_len << "\n";
120	return 0;
121	}
122

1	#include <bits/stdc++.h>
2	using namespace std;
3
4	/*
5	Compute sum of pairwise distances among k marked nodes.
6	Key identity on Virtual Tree:
7	Sum_{pairs (x,y)} dist(x,y) = sum over VT edges (p->v): s_v * (k - s_v) * len(p,v)
8	where s_v is the number of marked nodes in v's component below the edge (p,v),
9	and len(p,v) = depth[v] - depth[p] (unweighted).
10	*/
11
12	struct LCA {
13	int n, LOG, root, timer = 0;
14	vector<vector<int>> up, g;
15	vector<int> tin, tout, depth;
16	void init(int _n, int _root, const vector<vector<int>>& _g){
17	n=_n; root=_root; g=_g;
18	LOG=1; while((1<<LOG)<=n) ++LOG;
19	up.assign(n+1, vector<int>(LOG));
20	tin.assign(n+1,0); tout.assign(n+1,0); depth.assign(n+1,0);
21	dfs(root, root);
22	}
23	void dfs(int v, int p){
24	tin[v]=++timer; up[v][0]=p;
25	for(int j=1;j<LOG;++j) up[v][j]=up[ up[v][j-1] ][j-1];
26	for(int to: g[v]) if(to!=p){ depth[to]=depth[v]+1; dfs(to,v);}
27	tout[v]=timer;
28	}
29	bool is_ancestor(int u, int v) const { return tin[u]<=tin[v] && tout[v]<=tout[u]; }
30	int lca(int u,int v) const{
31	if(is_ancestor(u,v)) return u; if(is_ancestor(v,u)) return v;
32	int x=u; for(int j=LOG-1;j>=0;--j){int a=up[x][j]; if(!is_ancestor(a,v)) x=a;} return up[x][0];
33	}
34	};
35
36	int main(){
37	ios::sync_with_stdio(false);
38	cin.tie(nullptr);
39
40	int n; cin>>n;
41	vector<vector<int>> g(n+1);
42	for(int i=0;i<n-1;++i){int u,v;cin>>u>>v; g[u].push_back(v); g[v].push_back(u);}
43
44	LCA lca; lca.init(n, 1, g);
45
46	int k; cin>>k; vector<int> a(k); for(int i=0;i<k;++i) cin>>a[i];
47
48	// sort & unique by tin
49	sort(a.begin(), a.end(), [&](int x,int y){return lca.tin[x]<lca.tin[y];});
50	a.erase(unique(a.begin(), a.end()), a.end());
51
52	// insert LCAs of consecutive
53	vector<int> nodes=a;
54	for(int i=0;i+1<(int)a.size();++i) nodes.push_back(lca.lca(a[i], a[i+1]));
55	sort(nodes.begin(), nodes.end(), [&](int x,int y){return lca.tin[x]<lca.tin[y];});
56	nodes.erase(unique(nodes.begin(), nodes.end()), nodes.end());
57
58	// build VT
59	vector<vector<int>> vt(n+1);
60	vector<int> st; st.push_back(nodes[0]);
61	auto is_anc = [&](int u,int v){return lca.is_ancestor(u,v);} ;
62	for(int i=1;i<(int)nodes.size();++i){
63	int v=nodes[i];
64	while(!is_anc(st.back(), v)){
65	int u=st.back(); st.pop_back();
66	int p=st.back(); vt[p].push_back(u);
67	}
68	st.push_back(v);
69	}
70	while((int)st.size()>1){int u=st.back(); st.pop_back(); int p=st.back(); vt[p].push_back(u);}
71
72	// mark important nodes
73	unordered_set<int> S(a.begin(), a.end());
74
75	// DFS over VT to compute s_v and accumulate answer
76	long long ans=0;
77	function<int(int)> dfs = [&](int u){
78	int cnt = S.count(u) ? 1 : 0;
79	for(int v: vt[u]){
80	int sub = dfs(v);
81	long long len = (long long)(lca.depth[v]-lca.depth[u]);
82	ans += len * (long long)sub * (long long)( (int)a.size() - sub );
83	cnt += sub;
84	}
85	return cnt;
86	};
87
88	dfs(nodes[0]);
89	cout<<ans<<"\n";
90	return 0;
91	}
92

1	#include <bits/stdc++.h>
2	using namespace std;
3
4	/*
5	Supports multiple queries on a weighted tree using a Virtual Tree per query.
6	- Preprocess LCA with distances-to-root D[v] (weighted depths).
7	- For each query's marked set S:
8	* Build VT.
9	* Compute Steiner subtree total weight.
10	* Compute sum of pairwise weighted distances among S.
11	Efficient clearing: only touch nodes present in VT.
12	*/
13
14	struct LCA {
15	int n, LOG, root, timer = 0;
16	vector<vector<int>> up;
17	vector<int> tin, tout, depth;
18	vector<long long> D; // distance to root (weighted)
19	vector<vector<pair<int,long long>>> g;
20
21	void init(int _n, int _root, const vector<vector<pair<int,long long>>>& _g){
22	n=_n; root=_root; g=_g;
23	LOG=1; while((1<<LOG)<=n) ++LOG;
24	up.assign(n+1, vector<int>(LOG));
25	tin.assign(n+1,0); tout.assign(n+1,0); depth.assign(n+1,0);
26	D.assign(n+1,0);
27	dfs(root, root);
28	}
29	void dfs(int v,int p){
30	tin[v]=++timer; up[v][0]=p;
31	for(int j=1;j<LOG;++j) up[v][j]=up[ up[v][j-1] ][j-1];
32	for(auto [to,w]: g[v]) if(to!=p){
33	depth[to]=depth[v]+1; D[to]=D[v]+w; dfs(to,v);
34	}
35	tout[v]=timer;
36	}
37	bool is_ancestor(int u,int v) const {return tin[u]<=tin[v] && tout[v]<=tout[u];}
38	int lca(int u,int v) const{
39	if(is_ancestor(u,v)) return u; if(is_ancestor(v,u)) return v;
40	int x=u; for(int j=LOG-1;j>=0;--j){int a=up[x][j]; if(!is_ancestor(a,v)) x=a;} return up[x][0];
41	}
42	long long dist(int u,int v) const { return D[u]+D[v]-2LL*D[lca(u,v)]; }
43	};
44
45	int main(){
46	ios::sync_with_stdio(false);
47	cin.tie(nullptr);
48
49	int n; cin>>n;
50	vector<vector<pair<int,long long>>> g(n+1);
51	for(int i=0;i<n-1;++i){
52	int u,v; long long w; cin>>u>>v>>w;
53	g[u].push_back({v,w}); g[v].push_back({u,w});
54	}
55	LCA lca; lca.init(n, 1, g);
56
57	int Q; cin>>Q;
58
59	// We'll reuse these per query to avoid reallocation overhead
60	vector<vector<pair<int,long long>>> vt(n+1);
61	vector<int> touched; touched.reserve(2e5);
62
63	while(Q--){
64	int k; cin>>k; vector<int> a(k);
65	for(int i=0;i<k;++i) cin>>a[i];
66	sort(a.begin(), a.end(), [&](int x,int y){return lca.tin[x]<lca.tin[y];});
67	a.erase(unique(a.begin(), a.end()), a.end());
68
69	vector<int> nodes=a;
70	for(int i=0;i+1<(int)a.size();++i) nodes.push_back(lca.lca(a[i], a[i+1]));
71	sort(nodes.begin(), nodes.end(), [&](int x,int y){return lca.tin[x]<lca.tin[y];});
72	nodes.erase(unique(nodes.begin(), nodes.end()), nodes.end());
73
74	// Clean only touched from previous query
75	for(int v: touched) vt[v].clear();
76	touched.clear();
77	for(int v: nodes) touched.push_back(v);
78
79	// Build weighted VT: edge weight is dist(parent, child) in original tree
80	vector<int> st; st.push_back(nodes[0]);
81	auto is_anc = [&](int u,int v){return lca.is_ancestor(u,v);} ;
82	for(int i=1;i<(int)nodes.size();++i){
83	int v=nodes[i];
84	while(!is_anc(st.back(), v)){
85	int u=st.back(); st.pop_back();
86	int p=st.back(); long long w = lca.dist(p,u);
87	vt[p].push_back({u,w});
88	}
89	st.push_back(v);
90	}
91	while((int)st.size()>1){int u=st.back(); st.pop_back(); int p=st.back(); long long w=lca.dist(p,u); vt[p].push_back({u,w});}
92
93	// Marked set lookup
94	unordered_set<int> S(a.begin(), a.end());
95	int K = (int)S.size();
96
97	// Compute Steiner length (weighted) and pairwise sum (weighted)
98	long long steiner_len = 0;
99	long long pair_sum = 0;
100
101	function<int(int)> dfs = [&](int u){
102	int cnt = S.count(u) ? 1 : 0;
103	for(auto [v,w]: vt[u]){
104	int sub = dfs(v);
105	steiner_len += w; // each VT edge counted once
106	pair_sum += w * 1LL * sub * 1LL * (K - sub); // contribution formula
107	cnt += sub;
108	}
109	return cnt;
110	};
111	dfs(nodes[0]);
112
113	cout << steiner_len << " " << pair_sum << "\n";
114	}
115	return 0;
116	}
117