Segment Tree Basics
Key Points
- β’A segment tree is a complete binary tree that stores information about array intervals to answer range queries and support point updates in O(log n).
- β’It works for any associative operation (sum, min, max, gcd, etc.) and needs a neutral (identity) element for empty contributions.
- β’Building the tree from an array takes O(n) time; each query and update takes O(log n).
- β’Iterative implementations typically use about 2n memory and are cache-friendly; recursive ones use about 4n memory and are easy to reason about.
- β’Each node covers an interval, and the parentβs value is the merge (combine) of its two children.
- β’Queries split the asked range into O(log n) disjoint segments and combine their precomputed answers.
- β’Point updates modify one leaf and then recompute O(log n) ancestors up to the root.
- β’Choosing consistent interval conventions ([l, r) vs [l, r]) and a correct identity element is crucial to avoid bugs.
Prerequisites
- βArrays and indexing β Segment trees store and aggregate array elements by index.
- βBinary tree structure β Understanding parents, children, and tree height helps grasp the layout.
- βTime and space complexity β To analyze O(n) build and O(log n) operations.
- βAssociativity and identity elements β Correctness of range merging relies on these properties.
- βDivide-and-conquer strategy β Segment trees recursively split intervals into halves.
Detailed Explanation
Tap terms for definitions01Overview
Imagine wanting quick answers to questions like: What is the sum of elements from index 1000 to 5000? Or what is the minimum value in a subarray, while also allowing changes to single elements? A segment tree is a data structure designed exactly for such problems. It organizes an array into a binary tree where each node stores the result of a function (like sum, min, max, gcd) over a specific interval. This lets us reuse precomputed partial answers to respond quickly to new queries. The hook is speed: after a one-time O(n) build, each query or point update runs in O(log n), even for large arrays. Conceptually, we split the array into halves recursively until we reach single elements (leaves). Each internal node stores the merge (combine) of its two children, representing a larger interval. Because merging is associative, we can combine answers from multiple nodes safely and consistently. For example, to compute the sum on [l, r), we walk down the tree, selecting O(log n) nodes whose intervals exactly cover the query range and add their values. A point update changes a leaf, and we propagate the new value up to the root, fixing O(log n) nodes. This elegant structure balances speed, flexibility, and simplicity, making it a favorite in competitive programming and systems that need fast aggregate queries.
02Intuition & Analogies
Think of a bookshelf with many numbered books. If someone often asks, βWhatβs the total number of pages from book 20 to 70?β you could count each time, but thatβs slow. Instead, you might precompute the total pages for every shelf section: each half of the shelf, each quarter, each eighth, and so on. Now, when asked for any subrange, you cover it using a few of those sections and add their totals. Thatβs the core idea of a segment tree. Another analogy: Consider a road split into segments, and you want the minimum speed limit over any stretch. If every junction knows the minimum of its left stretch and right stretch, then to find the minimum on a route, you only need to look at a few junctions whose spans cover the route exactly. Because βminβ is associative (min(min(a, b), c) = min(a, b, c)), you can combine in any grouping and still get the correct result. Visually, picture a pyramid. The bottom layer is the original array (leaves). The next layer up stores results over pairs of elements, the next over blocks of four, then eight, and so on, until the top stores the result over the whole array. Queries climb and descend along the edges to pick a handful of blocks that exactly tile the query interval. Updates tweak one bottom brick and then adjust the few bricks above it to keep the pyramid consistent. The magic is that, no matter where you look or update, you only touch O(log n) bricks.
03Formal Definition
04When to Use
Use a segment tree when you need fast range queries and point updates with an associative operation: sums of subarrays, range minimum/maximum, range gcd, counting frequencies, or tracking boolean predicates over ranges (e.g., any/All). They are ideal when both queries and updates are frequent and interleaved, and a static preprocessing structure like a prefix-sum array (which gives O(1) queries but O(n) updates) is too slow for updates. Choose the iterative variant when you want speed and tight memory (about 2n), especially in competitive programming; itβs usually more cache-friendly and shorter. Choose the recursive variant when readability and easier extension (like custom queries, advanced searches, or later adding lazy propagation) matter more; itβs about 4n memory but very clear. If your operation is associative and has an identity, a basic segment tree works. If you need range updates that also require merging (e.g., add a value to every element on [l, r)), consider a lazy-propagation segment tree. If you only need prefix sums, a Fenwick tree (Binary Indexed Tree) may be simpler and slightly lighter. If you need order-statistics by frequency, you can store counts and do a binary search over the tree to find the k-th element in O(log n).
β οΈCommon Mistakes
β’ Mixing interval conventions: Iterative trees often use half-open intervals [l, r), while recursive code often uses closed intervals [l, r]. Using the wrong boundaries leads to off-by-one errors. Decide once and be consistent in build, query, and update. β’ Wrong identity element: For sum use 0, for min use +β, for max use ββ, for gcd use 0. Using an incorrect identity breaks queries that partially miss the range (contributing the identity instead of real values). β’ Forgetting associativity: The combine operation must be associative. Non-associative operations (like subtraction or division) produce incorrect results when the query decomposes into multiple nodes. β’ Mis-sizing arrays: Iterative implementations that place leaves at indices n..2n-1 assume storage for 2n values; recursive ones often allocate 4n. Under-allocating leads to out-of-bounds writes. If you pad to a power of two, adjust how you place elements and identity-fill the extra leaves. β’ Not updating ancestors: After a point update on a leaf, you must recompute all ancestors on the path to the root. Skipping any ancestor will return stale results. β’ Overflow and types: For sums, prefer 64-bit (long long) if values or n are large. For min/max, beware of sentinel values and comparisons with identities. β’ Performance pitfalls: Excessive recursion depth for very large n can hit recursion limits (less common in C++ but still relevant). Iterative methods are often faster due to better cache locality and no function call overhead.
Key Formulas
Tree Height
Explanation: The height of a segment tree over n elements is the ceiling of log base 2 of n. This bounds how many levels you traverse for queries and updates.
Node Count Upper Bound
Explanation: A binary tree padded to the next power of two has at most this many nodes. This explains the common 2N or 4n memory rules of thumb.
Build Time
Explanation: Building computes each internal node once from its two children, resulting in linear time in the number of elements.
Query Time
Explanation: A range query touches at most a constant number of nodes per level, and there are O(log n) levels.
Update Time
Explanation: A point update changes one leaf and at most one node per level up to the root, yielding logarithmic time.
Associativity Requirement
Explanation: The combine operation must be associative so that merging partial answers in any grouping yields the same result.
Identity Element
Explanation: An identity element is needed so that nodes outside the query contribute a neutral value when combined.
Full Binary Tree Size
Explanation: The number of nodes in a complete binary tree of height h follows this geometric sum, relating height and storage.
Query Decomposition Recurrence (Idealized)
Explanation: An idealized view of splitting a query by halves shows that at each level we consider at most two nodes. Solving this gives O(log n).
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct SegTreeIterSum { 5 int n; // number of elements 6 vector<long long> t; // tree array of size 2*n (leaves at [n,2n)) 7 8 explicit SegTreeIterSum(int n_) : n(n_), t(2 * n_, 0LL) {} 9 10 // Build from initial array a (size must be n) 11 void build(const vector<long long>& a) { 12 for (int i = 0; i < n; ++i) t[n + i] = a[i]; // place leaves 13 for (int i = n - 1; i > 0; --i) t[i] = t[i << 1] + t[i << 1 | 1]; // combine 14 } 15 16 // Set A[pos] = val 17 void point_set(int pos, long long val) { 18 int p = pos + n; // move to leaf 19 t[p] = val; // set value 20 for (p >>= 1; p >= 1; p >>= 1) { // climb and recompute parents 21 t[p] = t[p << 1] + t[p << 1 | 1]; 22 } 23 } 24 25 // Range sum on [l, r) (half-open) 26 long long range_sum(int l, int r) const { 27 long long resL = 0, resR = 0; // identity for sum is 0 28 // Standard iterative query: move l and r upwards while collecting contributions 29 for (l += n, r += n; l < r; l >>= 1, r >>= 1) { 30 if (l & 1) resL += t[l++]; // l is right child -> include and move right 31 if (r & 1) resR += t[--r]; // r about to become right boundary -> include left sibling 32 } 33 return resL + resR; // combine left and right accumulators 34 } 35 }; 36 37 int main() { 38 ios::sync_with_stdio(false); 39 cin.tie(nullptr); 40 41 vector<long long> a = {1, 3, 5, 7, 9, 11}; 42 int n = (int)a.size(); 43 SegTreeIterSum st(n); 44 st.build(a); 45 46 // Query sum on [1, 5): 3 + 5 + 7 + 9 = 24 47 cout << st.range_sum(1, 5) << "\n"; // 24 48 49 // Set A[3] = 10 (was 7) 50 st.point_set(3, 10); 51 52 // New sum on [1, 5): 3 + 5 + 10 + 9 = 27 53 cout << st.range_sum(1, 5) << "\n"; // 27 54 55 return 0; 56 } 57
This iterative segment tree stores leaves at positions [n, 2n). Internal nodes at indices i store the sum of children i<<1 and i<<1|1. range_sum uses the classic [l, r) half-open convention and collects O(log n) nodes. point_set updates a leaf and recomputes ancestors.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct SegTreeRecMin { 5 int n; 6 vector<long long> st; // 1-indexed tree, allocate ~4n 7 const long long INF = (1LL<<60); 8 9 explicit SegTreeRecMin(int n_) : n(n_), st(4 * n_, INF) {} 10 11 void build(const vector<long long>& a, int p, int l, int r) { 12 if (l == r) { 13 st[p] = a[l]; 14 return; 15 } 16 int m = (l + r) >> 1; 17 build(a, p << 1, l, m); 18 build(a, p << 1 | 1, m + 1, r); 19 st[p] = min(st[p << 1], st[p << 1 | 1]); 20 } 21 22 // point set: A[idx] = val 23 void update(int p, int l, int r, int idx, long long val) { 24 if (l == r) { 25 st[p] = val; 26 return; 27 } 28 int m = (l + r) >> 1; 29 if (idx <= m) update(p << 1, l, m, idx, val); 30 else update(p << 1 | 1, m + 1, r, idx, val); 31 st[p] = min(st[p << 1], st[p << 1 | 1]); 32 } 33 34 // query min on [ql, qr] 35 long long query(int p, int l, int r, int ql, int qr) const { 36 if (qr < l || r < ql) return INF; // disjoint: return identity for min 37 if (ql <= l && r <= qr) return st[p]; // fully covered 38 int m = (l + r) >> 1; 39 return min(query(p << 1, l, m, ql, qr), query(p << 1 | 1, m + 1, r, ql, qr)); 40 } 41 }; 42 43 int main() { 44 ios::sync_with_stdio(false); 45 cin.tie(nullptr); 46 47 vector<long long> a = {5, 2, 6, 3, 7, 1, 4}; 48 int n = (int)a.size(); 49 SegTreeRecMin st(n); 50 st.build(a, 1, 0, n - 1); 51 52 // Min on [1, 4] = min(2,6,3,7) = 2 53 cout << st.query(1, 0, n - 1, 1, 4) << "\n"; // 2 54 55 // Set A[5] = 10 (was 1), new min on [0, 6] 56 st.update(1, 0, n - 1, 5, 10); 57 cout << st.query(1, 0, n - 1, 0, 6) << "\n"; // 2 58 59 return 0; 60 } 61
A classic recursive tree with 4n storage. Intervals are inclusive [l, r]. The identity for min is +β, ensuring disjoint segments contribute neutral values.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Op must provide: T operator()(const T&, const T&) const; and static T id(); 5 template <class T, class Op> 6 struct SegTreeGeneric { 7 int n; // original number of elements 8 int N; // power-of-two padded size 9 vector<T> t; // tree size 2*N 10 Op op; 11 12 explicit SegTreeGeneric(int n_) : n(n_) { 13 N = 1; while (N < n) N <<= 1; 14 t.assign(2 * N, Op::id()); 15 } 16 17 void build(const vector<T>& a) { 18 for (int i = 0; i < n; ++i) t[N + i] = a[i]; 19 for (int i = N - 1; i > 0; --i) t[i] = op(t[i << 1], t[i << 1 | 1]); 20 } 21 22 // set A[pos] = val 23 void point_set(int pos, T val) { 24 int p = pos + N; 25 t[p] = val; 26 for (p >>= 1; p >= 1; p >>= 1) t[p] = op(t[p << 1], t[p << 1 | 1]); 27 } 28 29 // query on [l, r) half-open 30 T query(int l, int r) const { 31 T left = Op::id(), right = Op::id(); 32 for (l += N, r += N; l < r; l >>= 1, r >>= 1) { 33 if (l & 1) left = op(left, t[l++]); 34 if (r & 1) right = op(t[--r], right); 35 } 36 return op(left, right); 37 } 38 }; 39 40 struct OpSumLL { 41 static long long id() { return 0LL; } 42 long long operator()(long long a, long long b) const { return a + b; } 43 }; 44 45 struct OpGcdLL { 46 static long long id() { return 0LL; } // gcd(x,0)=x 47 long long operator()(long long a, long long b) const { return std::gcd(a, b); } 48 }; 49 50 int main() { 51 ios::sync_with_stdio(false); 52 cin.tie(nullptr); 53 54 vector<long long> a = {12, 18, 24, 30, 6}; 55 56 // Sum tree 57 SegTreeGeneric<long long, OpSumLL> stSum((int)a.size()); 58 stSum.build(a); 59 cout << stSum.query(1, 4) << "\n"; // sum [1,4): 18+24+30=72 60 61 // GCD tree 62 SegTreeGeneric<long long, OpGcdLL> stGcd((int)a.size()); 63 stGcd.build(a); 64 cout << stGcd.query(0, 5) << "\n"; // gcd of all: 6 65 66 // Update and re-query 67 stGcd.point_set(2, 14); // a[2]=14 68 cout << stGcd.query(0, 3) << "\n"; // gcd(12,18,14) = 2 69 70 return 0; 71 } 72
A reusable template with a customizable operation and identity. We demonstrate both sum and gcd. Padding to a power of two simplifies a later descent-based search and keeps the code uniform.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 struct SegTreePrefix { 5 int n; // original size 6 int N; // power-of-two padded size 7 vector<long long> t; // sums 8 9 explicit SegTreePrefix(int n_) : n(n_) { 10 N = 1; while (N < n) N <<= 1; 11 t.assign(2 * N, 0LL); 12 } 13 14 void build(const vector<long long>& a) { 15 for (int i = 0; i < n; ++i) t[N + i] = a[i]; 16 for (int i = N - 1; i > 0; --i) t[i] = t[i << 1] + t[i << 1 | 1]; 17 } 18 19 void point_add(int pos, long long delta) { // A[pos] += delta 20 int p = pos + N; 21 t[p] += delta; 22 for (p >>= 1; p >= 1; p >>= 1) t[p] = t[p << 1] + t[p << 1 | 1]; 23 } 24 25 // Returns smallest idx in [0, n) such that prefix sum >= target. 26 // If total sum < target, returns -1. 27 int first_prefix_ge(long long target) const { 28 if (t[1] < target) return -1; // whole sum too small 29 int p = 1; 30 while (p < N) { // descend until we reach a leaf 31 if (t[p << 1] >= target) { 32 p = p << 1; // go left 33 } else { 34 target -= t[p << 1]; 35 p = p << 1 | 1; // go right 36 } 37 } 38 int idx = p - N; 39 return (idx < n ? idx : n - 1); // ensure within original range 40 } 41 }; 42 43 int main() { 44 ios::sync_with_stdio(false); 45 cin.tie(nullptr); 46 47 vector<long long> a = {2, 1, 3, 2, 5}; // prefix sums: 2,3,6,8,13 48 SegTreePrefix st((int)a.size()); 49 st.build(a); 50 51 cout << st.first_prefix_ge(6) << "\n"; // 2 (2+1+3) 52 cout << st.first_prefix_ge(7) << "\n"; // 3 (2+1+3+2) 53 cout << st.first_prefix_ge(14) << "\n"; // -1 (total=13) 54 55 // Update: add 2 to index 1 (now a[1]=3), new prefixes: 2,5,8,10,15 56 st.point_add(1, 2); 57 cout << st.first_prefix_ge(7) << "\n"; // 2 (2+3+3) 58 59 return 0; 60 } 61
By storing sums, we can descend the tree from the root: if the left child's sum already reaches the target, go left; otherwise, subtract it and go right. This yields the earliest index whose prefix sum reaches target in O(log n).