πŸ—‚οΈData StructureIntermediate

Segment Tree Basics

Key Points

  • β€’
    A segment tree is a complete binary tree that stores information about array intervals to answer range queries and support point updates in O(log n).
  • β€’
    It works for any associative operation (sum, min, max, gcd, etc.) and needs a neutral (identity) element for empty contributions.
  • β€’
    Building the tree from an array takes O(n) time; each query and update takes O(log n).
  • β€’
    Iterative implementations typically use about 2n memory and are cache-friendly; recursive ones use about 4n memory and are easy to reason about.
  • β€’
    Each node covers an interval, and the parent’s value is the merge (combine) of its two children.
  • β€’
    Queries split the asked range into O(log n) disjoint segments and combine their precomputed answers.
  • β€’
    Point updates modify one leaf and then recompute O(log n) ancestors up to the root.
  • β€’
    Choosing consistent interval conventions ([l, r) vs [l, r]) and a correct identity element is crucial to avoid bugs.

Prerequisites

  • β†’Arrays and indexing β€” Segment trees store and aggregate array elements by index.
  • β†’Binary tree structure β€” Understanding parents, children, and tree height helps grasp the layout.
  • β†’Time and space complexity β€” To analyze O(n) build and O(log n) operations.
  • β†’Associativity and identity elements β€” Correctness of range merging relies on these properties.
  • β†’Divide-and-conquer strategy β€” Segment trees recursively split intervals into halves.

Detailed Explanation

Tap terms for definitions

01Overview

Imagine wanting quick answers to questions like: What is the sum of elements from index 1000 to 5000? Or what is the minimum value in a subarray, while also allowing changes to single elements? A segment tree is a data structure designed exactly for such problems. It organizes an array into a binary tree where each node stores the result of a function (like sum, min, max, gcd) over a specific interval. This lets us reuse precomputed partial answers to respond quickly to new queries. The hook is speed: after a one-time O(n) build, each query or point update runs in O(log n), even for large arrays. Conceptually, we split the array into halves recursively until we reach single elements (leaves). Each internal node stores the merge (combine) of its two children, representing a larger interval. Because merging is associative, we can combine answers from multiple nodes safely and consistently. For example, to compute the sum on [l, r), we walk down the tree, selecting O(log n) nodes whose intervals exactly cover the query range and add their values. A point update changes a leaf, and we propagate the new value up to the root, fixing O(log n) nodes. This elegant structure balances speed, flexibility, and simplicity, making it a favorite in competitive programming and systems that need fast aggregate queries.

02Intuition & Analogies

Think of a bookshelf with many numbered books. If someone often asks, β€œWhat’s the total number of pages from book 20 to 70?” you could count each time, but that’s slow. Instead, you might precompute the total pages for every shelf section: each half of the shelf, each quarter, each eighth, and so on. Now, when asked for any subrange, you cover it using a few of those sections and add their totals. That’s the core idea of a segment tree. Another analogy: Consider a road split into segments, and you want the minimum speed limit over any stretch. If every junction knows the minimum of its left stretch and right stretch, then to find the minimum on a route, you only need to look at a few junctions whose spans cover the route exactly. Because β€œmin” is associative (min(min(a, b), c) = min(a, b, c)), you can combine in any grouping and still get the correct result. Visually, picture a pyramid. The bottom layer is the original array (leaves). The next layer up stores results over pairs of elements, the next over blocks of four, then eight, and so on, until the top stores the result over the whole array. Queries climb and descend along the edges to pick a handful of blocks that exactly tile the query interval. Updates tweak one bottom brick and then adjust the few bricks above it to keep the pyramid consistent. The magic is that, no matter where you look or update, you only touch O(log n) bricks.

03Formal Definition

A segment tree over an array A[0..n-1] with an associative binary operation \(\) and identity element e is a rooted binary tree whose leaves correspond to singleton intervals [i, i+1), storing A[i]. Each internal node corresponds to an interval [L, R) and stores the aggregate value \(V[L, R) = V[L, M) V[M, R)\) where M = (L + R)/2, i.e., the merge of its two children. The root covers [0, n). The operation must satisfy associativity: \((a b) (b c)\), and there exists an identity element e such that \(a a = a\). The height of the tree is \(h = n \). In a typical array-based implementation with padding to the next power of two N = \(2^{ n }\), the tree uses at most \(2N\) storage; in a compact iterative style without explicit padding, storage is often described as about 2n, while a recursive 1-indexed implementation allocates about 4n nodes to simplify indexing. Operations: Build computes all internal nodes bottom-up in O(n). A range query over [l, r) decomposes into O(log n) disjoint nodes and returns their merged result. A point update at position i changes the leaf and recomputes O(log n) ancestors via \(\). Correctness relies on associativity and the identity element handling of uncovered subranges.

04When to Use

Use a segment tree when you need fast range queries and point updates with an associative operation: sums of subarrays, range minimum/maximum, range gcd, counting frequencies, or tracking boolean predicates over ranges (e.g., any/All). They are ideal when both queries and updates are frequent and interleaved, and a static preprocessing structure like a prefix-sum array (which gives O(1) queries but O(n) updates) is too slow for updates. Choose the iterative variant when you want speed and tight memory (about 2n), especially in competitive programming; it’s usually more cache-friendly and shorter. Choose the recursive variant when readability and easier extension (like custom queries, advanced searches, or later adding lazy propagation) matter more; it’s about 4n memory but very clear. If your operation is associative and has an identity, a basic segment tree works. If you need range updates that also require merging (e.g., add a value to every element on [l, r)), consider a lazy-propagation segment tree. If you only need prefix sums, a Fenwick tree (Binary Indexed Tree) may be simpler and slightly lighter. If you need order-statistics by frequency, you can store counts and do a binary search over the tree to find the k-th element in O(log n).

⚠️Common Mistakes

β€’ Mixing interval conventions: Iterative trees often use half-open intervals [l, r), while recursive code often uses closed intervals [l, r]. Using the wrong boundaries leads to off-by-one errors. Decide once and be consistent in build, query, and update. β€’ Wrong identity element: For sum use 0, for min use +∞, for max use βˆ’βˆž, for gcd use 0. Using an incorrect identity breaks queries that partially miss the range (contributing the identity instead of real values). β€’ Forgetting associativity: The combine operation must be associative. Non-associative operations (like subtraction or division) produce incorrect results when the query decomposes into multiple nodes. β€’ Mis-sizing arrays: Iterative implementations that place leaves at indices n..2n-1 assume storage for 2n values; recursive ones often allocate 4n. Under-allocating leads to out-of-bounds writes. If you pad to a power of two, adjust how you place elements and identity-fill the extra leaves. β€’ Not updating ancestors: After a point update on a leaf, you must recompute all ancestors on the path to the root. Skipping any ancestor will return stale results. β€’ Overflow and types: For sums, prefer 64-bit (long long) if values or n are large. For min/max, beware of sentinel values and comparisons with identities. β€’ Performance pitfalls: Excessive recursion depth for very large n can hit recursion limits (less common in C++ but still relevant). Iterative methods are often faster due to better cache locality and no function call overhead.

Key Formulas

Tree Height

Explanation: The height of a segment tree over n elements is the ceiling of log base 2 of n. This bounds how many levels you traverse for queries and updates.

Node Count Upper Bound

Explanation: A binary tree padded to the next power of two has at most this many nodes. This explains the common 2N or 4n memory rules of thumb.

Build Time

Explanation: Building computes each internal node once from its two children, resulting in linear time in the number of elements.

Query Time

Explanation: A range query touches at most a constant number of nodes per level, and there are O(log n) levels.

Update Time

Explanation: A point update changes one leaf and at most one node per level up to the root, yielding logarithmic time.

Associativity Requirement

Explanation: The combine operation must be associative so that merging partial answers in any grouping yields the same result.

Identity Element

Explanation: An identity element is needed so that nodes outside the query contribute a neutral value when combined.

Full Binary Tree Size

Explanation: The number of nodes in a complete binary tree of height h follows this geometric sum, relating height and storage.

Query Decomposition Recurrence (Idealized)

Explanation: An idealized view of splitting a query by halves shows that at each level we consider at most two nodes. Solving this gives O(log n).

Complexity Analysis

Let n be the number of elements. The tree height is h = ⌈logβ‚‚ nβŒ‰. Building the tree bottom-up initializes all leaves and computes each internal node exactly once by combining two children, so the total build time is O(n). For memory, iterative implementations that place leaves at indices [n, 2n) (or [N, 2N) with N as a power-of-two pad) use about 2n (or 2N) storage cells. Recursive, array-based implementations commonly allocate about 4n entries to simplify indexing and avoid bounds checks during recursion. A range query decomposes the interval into O(log n) disjoint nodes. Intuitively, at each level of the tree, you will visit at most two nodes that contribute to the final answer, so the total number of visited nodes is bounded by O(log n). Each visit performs O(1) work (one combine), yielding O(log n) time per query. Similarly, a point update modifies one leaf and then recomputes values on the path to the root, which involves at most one node per level, also O(log n) total operations. Space complexity is O(n) for storing the tree values plus the original array if you keep it. The constant factors differ: iterative trees are often more cache-friendly and have smaller constants due to contiguous memory and no recursion overhead; recursive trees trade a bit more memory for clarity, with a negligible auxiliary stack of O(log n) depth during queries/updates. Overall, segment trees provide an excellent balance for dynamic range queries with predictable logarithmic performance.

Code Examples

Iterative Segment Tree for Range Sum (point set, [l, r) queries)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct SegTreeIterSum {
5 int n; // number of elements
6 vector<long long> t; // tree array of size 2*n (leaves at [n,2n))
7
8 explicit SegTreeIterSum(int n_) : n(n_), t(2 * n_, 0LL) {}
9
10 // Build from initial array a (size must be n)
11 void build(const vector<long long>& a) {
12 for (int i = 0; i < n; ++i) t[n + i] = a[i]; // place leaves
13 for (int i = n - 1; i > 0; --i) t[i] = t[i << 1] + t[i << 1 | 1]; // combine
14 }
15
16 // Set A[pos] = val
17 void point_set(int pos, long long val) {
18 int p = pos + n; // move to leaf
19 t[p] = val; // set value
20 for (p >>= 1; p >= 1; p >>= 1) { // climb and recompute parents
21 t[p] = t[p << 1] + t[p << 1 | 1];
22 }
23 }
24
25 // Range sum on [l, r) (half-open)
26 long long range_sum(int l, int r) const {
27 long long resL = 0, resR = 0; // identity for sum is 0
28 // Standard iterative query: move l and r upwards while collecting contributions
29 for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
30 if (l & 1) resL += t[l++]; // l is right child -> include and move right
31 if (r & 1) resR += t[--r]; // r about to become right boundary -> include left sibling
32 }
33 return resL + resR; // combine left and right accumulators
34 }
35};
36
37int main() {
38 ios::sync_with_stdio(false);
39 cin.tie(nullptr);
40
41 vector<long long> a = {1, 3, 5, 7, 9, 11};
42 int n = (int)a.size();
43 SegTreeIterSum st(n);
44 st.build(a);
45
46 // Query sum on [1, 5): 3 + 5 + 7 + 9 = 24
47 cout << st.range_sum(1, 5) << "\n"; // 24
48
49 // Set A[3] = 10 (was 7)
50 st.point_set(3, 10);
51
52 // New sum on [1, 5): 3 + 5 + 10 + 9 = 27
53 cout << st.range_sum(1, 5) << "\n"; // 27
54
55 return 0;
56}
57

This iterative segment tree stores leaves at positions [n, 2n). Internal nodes at indices i store the sum of children i<<1 and i<<1|1. range_sum uses the classic [l, r) half-open convention and collects O(log n) nodes. point_set updates a leaf and recomputes ancestors.

Time: Build O(n), query O(log n), update O(log n)Space: O(n) extra memory (~2n cells for the tree)
Recursive Segment Tree for Range Minimum ([l, r] inclusive)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct SegTreeRecMin {
5 int n;
6 vector<long long> st; // 1-indexed tree, allocate ~4n
7 const long long INF = (1LL<<60);
8
9 explicit SegTreeRecMin(int n_) : n(n_), st(4 * n_, INF) {}
10
11 void build(const vector<long long>& a, int p, int l, int r) {
12 if (l == r) {
13 st[p] = a[l];
14 return;
15 }
16 int m = (l + r) >> 1;
17 build(a, p << 1, l, m);
18 build(a, p << 1 | 1, m + 1, r);
19 st[p] = min(st[p << 1], st[p << 1 | 1]);
20 }
21
22 // point set: A[idx] = val
23 void update(int p, int l, int r, int idx, long long val) {
24 if (l == r) {
25 st[p] = val;
26 return;
27 }
28 int m = (l + r) >> 1;
29 if (idx <= m) update(p << 1, l, m, idx, val);
30 else update(p << 1 | 1, m + 1, r, idx, val);
31 st[p] = min(st[p << 1], st[p << 1 | 1]);
32 }
33
34 // query min on [ql, qr]
35 long long query(int p, int l, int r, int ql, int qr) const {
36 if (qr < l || r < ql) return INF; // disjoint: return identity for min
37 if (ql <= l && r <= qr) return st[p]; // fully covered
38 int m = (l + r) >> 1;
39 return min(query(p << 1, l, m, ql, qr), query(p << 1 | 1, m + 1, r, ql, qr));
40 }
41};
42
43int main() {
44 ios::sync_with_stdio(false);
45 cin.tie(nullptr);
46
47 vector<long long> a = {5, 2, 6, 3, 7, 1, 4};
48 int n = (int)a.size();
49 SegTreeRecMin st(n);
50 st.build(a, 1, 0, n - 1);
51
52 // Min on [1, 4] = min(2,6,3,7) = 2
53 cout << st.query(1, 0, n - 1, 1, 4) << "\n"; // 2
54
55 // Set A[5] = 10 (was 1), new min on [0, 6]
56 st.update(1, 0, n - 1, 5, 10);
57 cout << st.query(1, 0, n - 1, 0, 6) << "\n"; // 2
58
59 return 0;
60}
61

A classic recursive tree with 4n storage. Intervals are inclusive [l, r]. The identity for min is +∞, ensuring disjoint segments contribute neutral values.

Time: Build O(n), query O(log n), update O(log n)Space: O(n) extra memory (~4n cells for the tree)
Generic Segment Tree Template (custom op and identity)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Op must provide: T operator()(const T&, const T&) const; and static T id();
5template <class T, class Op>
6struct SegTreeGeneric {
7 int n; // original number of elements
8 int N; // power-of-two padded size
9 vector<T> t; // tree size 2*N
10 Op op;
11
12 explicit SegTreeGeneric(int n_) : n(n_) {
13 N = 1; while (N < n) N <<= 1;
14 t.assign(2 * N, Op::id());
15 }
16
17 void build(const vector<T>& a) {
18 for (int i = 0; i < n; ++i) t[N + i] = a[i];
19 for (int i = N - 1; i > 0; --i) t[i] = op(t[i << 1], t[i << 1 | 1]);
20 }
21
22 // set A[pos] = val
23 void point_set(int pos, T val) {
24 int p = pos + N;
25 t[p] = val;
26 for (p >>= 1; p >= 1; p >>= 1) t[p] = op(t[p << 1], t[p << 1 | 1]);
27 }
28
29 // query on [l, r) half-open
30 T query(int l, int r) const {
31 T left = Op::id(), right = Op::id();
32 for (l += N, r += N; l < r; l >>= 1, r >>= 1) {
33 if (l & 1) left = op(left, t[l++]);
34 if (r & 1) right = op(t[--r], right);
35 }
36 return op(left, right);
37 }
38};
39
40struct OpSumLL {
41 static long long id() { return 0LL; }
42 long long operator()(long long a, long long b) const { return a + b; }
43};
44
45struct OpGcdLL {
46 static long long id() { return 0LL; } // gcd(x,0)=x
47 long long operator()(long long a, long long b) const { return std::gcd(a, b); }
48};
49
50int main() {
51 ios::sync_with_stdio(false);
52 cin.tie(nullptr);
53
54 vector<long long> a = {12, 18, 24, 30, 6};
55
56 // Sum tree
57 SegTreeGeneric<long long, OpSumLL> stSum((int)a.size());
58 stSum.build(a);
59 cout << stSum.query(1, 4) << "\n"; // sum [1,4): 18+24+30=72
60
61 // GCD tree
62 SegTreeGeneric<long long, OpGcdLL> stGcd((int)a.size());
63 stGcd.build(a);
64 cout << stGcd.query(0, 5) << "\n"; // gcd of all: 6
65
66 // Update and re-query
67 stGcd.point_set(2, 14); // a[2]=14
68 cout << stGcd.query(0, 3) << "\n"; // gcd(12,18,14) = 2
69
70 return 0;
71}
72

A reusable template with a customizable operation and identity. We demonstrate both sum and gcd. Padding to a power of two simplifies a later descent-based search and keeps the code uniform.

Time: Build O(n), query O(log n), update O(log n)Space: O(n) for tree storage (~2N with N=power of two β‰₯ n)
Find the First Index with Prefix Sum β‰₯ target (binary descent on tree)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct SegTreePrefix {
5 int n; // original size
6 int N; // power-of-two padded size
7 vector<long long> t; // sums
8
9 explicit SegTreePrefix(int n_) : n(n_) {
10 N = 1; while (N < n) N <<= 1;
11 t.assign(2 * N, 0LL);
12 }
13
14 void build(const vector<long long>& a) {
15 for (int i = 0; i < n; ++i) t[N + i] = a[i];
16 for (int i = N - 1; i > 0; --i) t[i] = t[i << 1] + t[i << 1 | 1];
17 }
18
19 void point_add(int pos, long long delta) { // A[pos] += delta
20 int p = pos + N;
21 t[p] += delta;
22 for (p >>= 1; p >= 1; p >>= 1) t[p] = t[p << 1] + t[p << 1 | 1];
23 }
24
25 // Returns smallest idx in [0, n) such that prefix sum >= target.
26 // If total sum < target, returns -1.
27 int first_prefix_ge(long long target) const {
28 if (t[1] < target) return -1; // whole sum too small
29 int p = 1;
30 while (p < N) { // descend until we reach a leaf
31 if (t[p << 1] >= target) {
32 p = p << 1; // go left
33 } else {
34 target -= t[p << 1];
35 p = p << 1 | 1; // go right
36 }
37 }
38 int idx = p - N;
39 return (idx < n ? idx : n - 1); // ensure within original range
40 }
41};
42
43int main() {
44 ios::sync_with_stdio(false);
45 cin.tie(nullptr);
46
47 vector<long long> a = {2, 1, 3, 2, 5}; // prefix sums: 2,3,6,8,13
48 SegTreePrefix st((int)a.size());
49 st.build(a);
50
51 cout << st.first_prefix_ge(6) << "\n"; // 2 (2+1+3)
52 cout << st.first_prefix_ge(7) << "\n"; // 3 (2+1+3+2)
53 cout << st.first_prefix_ge(14) << "\n"; // -1 (total=13)
54
55 // Update: add 2 to index 1 (now a[1]=3), new prefixes: 2,5,8,10,15
56 st.point_add(1, 2);
57 cout << st.first_prefix_ge(7) << "\n"; // 2 (2+3+3)
58
59 return 0;
60}
61

By storing sums, we can descend the tree from the root: if the left child's sum already reaches the target, go left; otherwise, subtract it and go right. This yields the earliest index whose prefix sum reaches target in O(log n).

Time: Build O(n), update O(log n), search O(log n)Space: O(n) tree storage (~2N cells)