🗂️Data StructureAdvanced

Dynamic Segment Tree

Key Points

  • A dynamic segment tree stores values over a huge coordinate range by creating nodes only when an operation touches their interval.
  • Each update or query costs O(log C) where C is the size of the coordinate domain, even when C is up to 10^18.
  • Memory usage is proportional to the number of visited segments, typically O(q log C) over q operations, instead of O(C).
  • It supports standard operations like point updates, range updates (with lazy propagation), and range queries.
  • It is ideal for online problems where coordinate compression is not possible or keys arrive adversarially.
  • Implementation relies on pointer-based nodes and careful null checks; lazy tags must be pushed safely.
  • You can extend it to support order statistics (k-th value) on a sparse frequency map over a large value range.
  • For performance on competitive programming, use a custom memory pool to reduce allocation overhead and avoid TLE.

Prerequisites

  • Segment tree (static)Understanding the standard array-based segment tree clarifies intervals, recursion, and combination rules.
  • Associativity and identity (monoids)Range queries rely on associative combine functions with a neutral element to merge child results correctly.
  • Lazy propagationDynamic trees commonly need deferred range updates; knowing how lazy tags work is essential.
  • Pointer manipulation and recursion in C++Dynamic allocation, null checks, and recursive descent are fundamental to implementing the structure.
  • Coordinate compressionKnowing this alternative helps you decide when a dynamic tree is necessary vs. compressing indices.
  • Overflow-safe arithmeticLarge ranges require careful midpoint and length computations to avoid 64-bit overflow.

Detailed Explanation

Tap terms for definitions

01Overview

A dynamic segment tree is a pointer-based variant of the classical segment tree designed for extremely large coordinate ranges (e.g., 0..10^18) where a static array-based tree would be impossible to allocate. Instead of prebuilding all nodes for every subinterval, it allocates nodes on demand—only for segments that are actually visited by updates or queries. This yields two major benefits: memory proportional to the activity of the operations, and the ability to work online when you cannot foresee all coordinates in advance to compress them.

Functionally, it supports the same operations as a standard segment tree: point updates, range updates (with lazy propagation), and range queries under an associative combine operation (sum, min, max, etc.). The critical difference is that child pointers may be null, representing untouched regions that implicitly carry the neutral value. This approach makes the data structure "sparse aware": operations run in O(log C) time where C is the coordinate domain size, independent of how large C is in memory.

Dynamic segment trees are widely used in competitive programming and systems that must index large integer keys sparsely—such as frequency maps over 32/64-bit keys, online interval aggregations, or k-th order statistics over massive domains. With careful engineering—especially around lazy propagation and node allocation—they achieve both correctness and performance on demanding workloads.

02Intuition & Analogies

Imagine you are managing lamp posts along a highway that stretches for a million miles. Building a maintenance shed every mile in advance would be absurdly wasteful if only a few lamps ever need service. Instead, you place a shed only where a crew actually needs to work. A dynamic segment tree does the same for subintervals: it builds a node (a "shed") only when an update or query touches that segment of the coordinate range.

Think of the whole coordinate range as a huge bookshelf. A static segment tree builds labeled boxes for every shelf pair, quarter, eighth, and so on—regardless of whether any book will be placed there. A dynamic segment tree keeps the labels implicit. When the first book is placed on shelf 10^12, we open the path of boxes needed to reach that shelf and leave the rest nonexistent. If later we need shelf 10^15, we independently open that path. The empty areas don't require memory; they are conceptually treated as carrying the neutral value (like sum = 0) even though no physical node exists.

For range adds with lazy propagation, picture sticky notes that say "+5" attached to a big chunk of the bookshelf. You don't immediately update every single book; you just tag the region. Only when you read from or further subdivide that region do you split the sticky note into smaller notes for child regions. This minimizes work while keeping answers correct.

Finally, for k-th order statistics, visualize maintaining tallies per shelf position. To find the k-th book by position, you repeatedly ask: is the left half holding at least k books? If yes, dive left; otherwise, skip them and dive right with k reduced accordingly. You never materialize all shelves—only the ones with books.

03Formal Definition

Let the key domain be the integer interval [L, R], with cardinality . A dynamic segment tree over [L, R] is a binary tree where each node represents a subinterval [a, b] and stores an aggregate value v = f(x), where is an associative binary operation with identity element e (a monoid). Children represent [a, m] and [m+1, b], where m = \left\lfloor \right\rfloor. Nodes are created only when an operation (update or query) reaches their interval; otherwise, absent nodes are treated as storing the identity e. Point update at position p applies an update function u to f(p) and adjusts aggregates on the root-to-leaf path to maintain v. For range updates on interval [ql, qr], if a node’s interval [a, b] is fully covered, an update is applied in O(1) and recorded as a lazy tag to defer propagation to children until necessary. Range queries on [ql, qr] combine aggregates from O(log C) disjoint nodes. Under standard assumptions (bounded recursion depth and finite updates), per-operation time complexity is O(log C). Let q be the number of operations and m be the number of distinct points or disjoint ranges that cause node creation; the number of nodes is O(m log C), yielding memory O(m log C). If every operation touches O(log C) segments and most segments are unique, total nodes are O(q log C).

04When to Use

Use a dynamic segment tree when the coordinate range is huge (e.g., 32/64-bit integers) but the number of touched positions or subranges is relatively small. Typical scenarios include:

  • Online problems with unknown future coordinates where coordinate compression is impossible or would require reindexing on the fly.
  • Sparse frequency maps over large value domains for tasks like k-th order statistics, counting inversions with large values, or tracking dynamic medians.
  • Interval updates and range queries on IDs or timestamps that can be as large as 10^12–10^18 (e.g., logging systems, time series with sparse events).
  • Problems mixing insertions and deletions over time with queries interleaved, where precomputing all coordinates is infeasible.
  • Competitive programming tasks (CF 1800–2200) where memory constraints prohibit building a static segment tree of size O(C), but O(q log C) memory is realistic.

Prefer coordinate compression if all coordinates are known in advance and fit comfortably in memory; compressed indices usually run faster due to array-based nodes. Choose a dynamic segment tree if you must answer queries online, the domain is unbounded/very large, or keys are adversarially chosen.

⚠️Common Mistakes

  • Incorrect mid computation and overflow: using (l + r) / 2 with 64-bit endpoints risks overflow. Always use m = l + (r - l) / 2.
  • Missing neutral element handling: forgetting that a null child must contribute the identity (e.g., 0 for sum, -∞ for max) leads to wrong aggregates. Define and consistently use the identity.
  • Faulty lazy propagation on null children: pushing a lazy tag without ensuring children exist, or forgetting to create children before applying tags, corrupts state. Implement a helper to allocate children on demand during push.
  • Off-by-one interval boundaries: mixing [l, r] and [l, r) conventions breaks correctness. Pick one (commonly inclusive [l, r]) and apply it consistently.
  • Memory leaks and performance from excessive allocations: calling new per visited node can TLE. Use a node pool (arena) or reserve memory to reduce allocator overhead, and reset between test cases.
  • Forgetting to prune zero-nodes: when maintaining counts or sums, you can optionally prune nodes whose value returns to identity to save memory. If you don’t, trees can grow unnecessarily; if you do, ensure you only delete when both children are null and value is identity.
  • Mixing signed/unsigned and type ranges: use 64-bit signed integers for coordinates and lengths; be careful computing segment length len = r - l + 1. For very large ranges, check that len fits in 128-bit if needed when multiplying by large lazy tags.

Key Formulas

Coordinate Range Size

Explanation: The size of the integer domain covered by the tree. It sets the logarithmic factor in time complexity.

Midpoint (overflow-safe)

Explanation: Computes the midpoint of [l, r] without risking 64-bit overflow. Always use this form in large domains.

Tree Height

Explanation: The maximum depth of recursion is proportional to the base-2 logarithm of the coordinate range. Operations traverse at most h levels.

Per-operation Time

Explanation: Each update or query descends the tree, visiting at most one node per level, so it runs in logarithmic time with respect to C.

Memory Over q Operations

Explanation: Across q operations, the number of created nodes is proportional to the number of visited unique segments, which is at most a constant times q log C.

Range Aggregate

Explanation: A range query computes the monoid aggregation over indices l..r. For sum, this is the ordinary addition over that interval.

Lazy Add Effect

Explanation: When adding a constant d to every element of [l, r], the node’s stored sum increases by d times the segment length.

Node Creation Upper Bound

Explanation: Each operation touches O(log C) nodes and can create at most a constant multiple of that many new nodes. Summed over q operations, this bounds node count.

K-th Order Statistic

Explanation: The k-th smallest element is the smallest x whose prefix frequency reaches k. Traversing by subtree counts yields O(log C) time.

Query with Output Size a

Explanation: If a query must return a items (e.g., reporting segments), the time includes both the logarithmic search and the output size.

Complexity Analysis

Let C = R - L + 1 be the coordinate range size. In a dynamic segment tree, both point updates and range queries descend the tree, branching at most once per level. Since the height of the implicit complete binary tree over [L, R] is h = ⌈log2 C⌉, the time complexity per operation is O(log C). With lazy propagation, a range update on [ql, qr] marks O(log C) disjoint covering nodes and may push at most O(log C) times during a subsequent query or further updates; asymptotically, the per-operation time remains O(log C). Memory usage is proportional to the number of nodes actually created. Each new path from the root to a touched leaf creates at most h nodes. Over q operations, if most touched paths are new, the number of created nodes is O(q log C). In many practical scenarios, repeated accesses overlap and reduce the constant factor. Null subtrees implicitly represent identity values and consume no memory. Space per node is O(1), holding the aggregate value, optional lazy tag, and two child pointers. Thus, total space is O(M) where M is the number of created nodes, typically M = Θ(q log C) in the worst case. Recursion depth is O(log C); with 64-bit domains, this is at most about 60, so stack usage is modest. If allocation overhead is significant, a custom memory pool can reduce constant factors from frequent small allocations. Optional pruning can further reduce memory, but must be implemented carefully to maintain correctness and avoid dangling pointers.

Code Examples

Point update and range sum query over 64-bit coordinates (basic dynamic segment tree)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Dynamic segment tree for sum with point updates and range sum queries.
5// Coordinate range is [L, R], both inclusive. Nodes are created on demand.
6
7struct Node {
8 long long sum; // aggregate sum over this segment
9 Node* left; // left child
10 Node* right; // right child
11 Node(): sum(0), left(nullptr), right(nullptr) {}
12};
13
14// Update position pos by adding delta. Interval [l, r] is current node's segment.
15void update(Node*& node, long long l, long long r, long long pos, long long delta) {
16 if (pos < l || pos > r) return; // outside
17 if (node == nullptr) node = new Node();
18 if (l == r) {
19 node->sum += delta;
20 return;
21 }
22 long long m = l + (r - l) / 2; // overflow-safe midpoint
23 if (pos <= m) update(node->left, l, m, pos, delta);
24 else update(node->right, m + 1, r, pos, delta);
25
26 long long leftSum = node->left ? node->left->sum : 0;
27 long long rightSum = node->right ? node->right->sum : 0;
28 node->sum = leftSum + rightSum;
29}
30
31// Query sum over [ql, qr]. Interval [l, r] is current node's segment.
32long long query(Node* node, long long l, long long r, long long ql, long long qr) {
33 if (!node || qr < l || r < ql) return 0; // identity for sum
34 if (ql <= l && r <= qr) return node->sum;
35 long long m = l + (r - l) / 2;
36 return query(node->left, l, m, ql, qr) + query(node->right, m + 1, r, ql, qr);
37}
38
39int main() {
40 ios::sync_with_stdio(false);
41 cin.tie(nullptr);
42
43 const long long L = 0;
44 const long long R = (long long)1e18; // huge domain
45
46 Node* root = nullptr; // empty tree represents all zeros
47
48 // Demo: point updates and queries
49 update(root, L, R, 5, 10); // a[5] += 10
50 update(root, L, R, 1000000000000LL, 7); // a[1e12] += 7
51 update(root, L, R, R, 1); // a[R] += 1
52
53 cout << query(root, L, R, 0, 10) << "\n"; // should print 10
54 cout << query(root, L, R, 0, 1000000000000LL) << "\n"; // should print 17
55 cout << query(root, L, R, R, R) << "\n"; // should print 1
56 cout << query(root, L, R, 6, 9) << "\n"; // should print 0
57
58 return 0;
59}
60

This program implements a dynamic segment tree over [0, 1e18]. Nodes are allocated only when accessed by updates or queries. Point updates adjust sums along a single root-to-leaf path; range queries traverse only relevant nodes. Null children contribute 0 to sums, letting untouched regions remain implicit.

Time: O(log C) per update/querySpace: O(m) nodes created overall where m = O(q log C)
Range add and range sum query with lazy propagation (dynamic)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Dynamic segment tree with lazy propagation supporting:
5// - range add: add v to all a[i] in [ql, qr]
6// - range sum query: sum over [ql, qr]
7
8struct Node {
9 long long sum; // sum over this segment
10 long long lazy; // pending addition to apply to children
11 Node* left;
12 Node* right;
13 Node(): sum(0), lazy(0), left(nullptr), right(nullptr) {}
14};
15
16inline long long seglen(long long l, long long r) { return r - l + 1; }
17
18// Ensure node exists
19inline void ensure(Node*& node) { if (!node) node = new Node(); }
20
21// Push lazy tag to children, creating them on demand
22void push(Node* node, long long l, long long r) {
23 if (!node || node->lazy == 0 || l == r) return;
24 long long m = l + (r - l) / 2;
25 // create children before applying lazy
26 if (!node->left) node->left = new Node();
27 if (!node->right) node->right = new Node();
28
29 long long tag = node->lazy;
30 node->left->lazy += tag;
31 node->left->sum += tag * (m - l + 1);
32
33 node->right->lazy += tag;
34 node->right->sum += tag * (r - (m + 1) + 1);
35
36 node->lazy = 0;
37}
38
39void range_add(Node*& node, long long l, long long r, long long ql, long long qr, long long v) {
40 if (qr < l || r < ql) return;
41 ensure(node);
42 if (ql <= l && r <= qr) {
43 node->sum += v * seglen(l, r);
44 node->lazy += v;
45 return;
46 }
47 push(node, l, r);
48 long long m = l + (r - l) / 2;
49 range_add(node->left, l, m, ql, qr, v);
50 range_add(node->right, m + 1, r, ql, qr, v);
51 long long leftSum = node->left ? node->left->sum : 0;
52 long long rightSum = node->right ? node->right->sum : 0;
53 node->sum = leftSum + rightSum;
54}
55
56long long range_sum(Node* node, long long l, long long r, long long ql, long long qr) {
57 if (!node || qr < l || r < ql) return 0;
58 if (ql <= l && r <= qr) return node->sum;
59 push(node, l, r);
60 long long m = l + (r - l) / 2;
61 return range_sum(node->left, l, m, ql, qr) + range_sum(node->right, m + 1, r, ql, qr);
62}
63
64int main() {
65 ios::sync_with_stdio(false);
66 cin.tie(nullptr);
67
68 const long long L = 0;
69 const long long R = (1LL<<40) - 1; // ~1e12 range
70
71 Node* root = nullptr;
72
73 // Add +3 to [5, 10]
74 range_add(root, L, R, 5, 10, 3);
75 // Add +2 to [8, 12]
76 range_add(root, L, R, 8, 12, 2);
77
78 // Query some ranges
79 cout << range_sum(root, L, R, 0, 4) << "\n"; // 0
80 cout << range_sum(root, L, R, 5, 7) << "\n"; // 3*3 = 9
81 cout << range_sum(root, L, R, 8, 10) << "\n"; // (3+2)*3 = 15
82 cout << range_sum(root, L, R, 11, 12) << "\n"; // 2*2 = 4
83
84 return 0;
85}
86

This dynamic tree supports range add with lazy propagation and range sum queries. Lazy tags store pending increments so we avoid visiting all leaves. Children are created only when pushing or when a partially covered node must descend, keeping memory proportional to touched segments.

Time: O(log C) per range update/querySpace: O(m) nodes created over time, typically O(q log C)
Sparse frequency map with k-th order statistic over large value domain
1#include <bits/stdc++.h>
2using namespace std;
3
4// Dynamic segment tree that stores counts (frequencies) at positions in a huge domain.
5// Supports: add/remove occurrences and find the k-th smallest value.
6
7struct Node {
8 long long cnt; // count of elements in this segment
9 Node* left;
10 Node* right;
11 Node(): cnt(0), left(nullptr), right(nullptr) {}
12};
13
14void add(Node*& node, int l, int r, int pos, long long delta) {
15 if (pos < l || pos > r) return;
16 if (!node) node = new Node();
17 if (l == r) { node->cnt += delta; return; }
18 int m = l + (r - l) / 2;
19 if (pos <= m) add(node->left, l, m, pos, delta);
20 else add(node->right, m + 1, r, pos, delta);
21 long long cl = node->left ? node->left->cnt : 0;
22 long long cr = node->right ? node->right->cnt : 0;
23 node->cnt = cl + cr;
24}
25
26// Returns smallest x with prefix count >= k; assumes 1 <= k <= total count
27int kth(Node* node, int l, int r, long long k) {
28 if (!node || node->cnt < k) return -1; // not enough elements
29 if (l == r) return l;
30 int m = l + (r - l) / 2;
31 long long leftCnt = node->left ? node->left->cnt : 0;
32 if (k <= leftCnt) return kth(node->left, l, m, k);
33 else return kth(node->right, m + 1, r, k - leftCnt);
34}
35
36int main() {
37 ios::sync_with_stdio(false);
38 cin.tie(nullptr);
39
40 // Value domain [0, 1e9]
41 const int L = 0;
42 const int R = 1000000000; // 1e9
43
44 Node* root = nullptr;
45
46 // Insert values: 5 (x3), 10^9 (x1), 42 (x2)
47 add(root, L, R, 5, 3);
48 add(root, L, R, 1000000000, 1);
49 add(root, L, R, 42, 2);
50
51 // Query k-ths
52 cout << kth(root, L, R, 1) << "\n"; // 5
53 cout << kth(root, L, R, 3) << "\n"; // 5
54 cout << kth(root, L, R, 4) << "\n"; // 42
55 cout << kth(root, L, R, 6) << "\n"; // 1e9
56
57 // Remove two 5s and query again
58 add(root, L, R, 5, -2);
59 cout << kth(root, L, R, 1) << "\n"; // 5
60 cout << kth(root, L, R, 2) << "\n"; // 42
61
62 return 0;
63}
64

This example uses a dynamic segment tree as a sparse frequency structure over [0, 1e9]. Updates change counts at single positions, and kth() navigates by comparing k with the left subtree's count to decide direction. It finds the k-th smallest value in O(log C) without storing all keys explicitly.

Time: O(log C) per add/remove and per kth querySpace: O(m) nodes, where m is the number of visited segments
Performance-oriented version with a simple memory pool (arena allocator)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Memory-pool-backed dynamic segment tree for sum of point updates.
5// Adjust POOL_SIZE according to constraints to avoid bad_alloc.
6
7struct Node {
8 long long sum;
9 Node* left;
10 Node* right;
11};
12
13static const int POOL_SIZE = 1'000'000; // example capacity; tune per problem
14static Node POOL[POOL_SIZE];
15static int pool_ptr = 0;
16
17Node* new_node() {
18 if (pool_ptr >= POOL_SIZE) {
19 cerr << "Pool exhausted! Increase POOL_SIZE." << '\n';
20 exit(1);
21 }
22 Node* n = &POOL[pool_ptr++];
23 n->sum = 0; n->left = n->right = nullptr;
24 return n;
25}
26
27void reset_pool() { pool_ptr = 0; }
28
29void update(Node*& node, long long l, long long r, long long pos, long long delta) {
30 if (pos < l || pos > r) return;
31 if (!node) node = new_node();
32 if (l == r) { node->sum += delta; return; }
33 long long m = l + (r - l) / 2;
34 if (pos <= m) update(node->left, l, m, pos, delta);
35 else update(node->right, m + 1, r, pos, delta);
36 long long sL = node->left ? node->left->sum : 0;
37 long long sR = node->right ? node->right->sum : 0;
38 node->sum = sL + sR;
39}
40
41long long query(Node* node, long long l, long long r, long long ql, long long qr) {
42 if (!node || qr < l || r < ql) return 0;
43 if (ql <= l && r <= qr) return node->sum;
44 long long m = l + (r - l) / 2;
45 return query(node->left, l, m, ql, qr) + query(node->right, m + 1, r, ql, qr);
46}
47
48int main() {
49 ios::sync_with_stdio(false);
50 cin.tie(nullptr);
51
52 const long long L = -1e12; // signed coordinates also supported
53 const long long R = +1e12;
54
55 reset_pool();
56 Node* root = nullptr;
57
58 // Random-like workload
59 update(root, L, R, -5, 2);
60 update(root, L, R, 123456789012LL, 4);
61 update(root, L, R, -5, -1);
62
63 cout << query(root, L, R, -10, 10) << '\n'; // 1
64 cout << query(root, L, R, 0, 2000000000000LL) << '\n'; // 4
65
66 // For multiple test cases, call reset_pool() and set root = nullptr each time.
67
68 return 0;
69}
70

This variant replaces new/delete with a simple bump-pointer pool for faster allocation and predictable memory usage—useful under tight time limits. Resetting the pool between test cases clears all nodes at once. The logic is the same as the basic point-update sum tree.

Time: O(log C) per update/querySpace: O(min(POOL_SIZE, q log C)) nodes from the pool