🗂️ Data Structure · Intermediate

Sparse Table

Key Points

  • A Sparse Table is a static range-query data structure that preprocesses an array in O(n log n) time and answers many queries in O(1) time.
  • It works perfectly for idempotent and associative operations like min, max, gcd, bitwise AND, and bitwise OR.
  • The core idea is to precompute answers for all ranges of lengths that are powers of two and then answer any query by combining two overlapping ranges.
  • The recurrence is st[i][j] = f(st[i][j-1], st[i + 2^{j-1}][j-1]) where f is the operation (e.g., min or gcd).
  • For a query [L, R], pick k = ⌊log2(R − L + 1)⌋ and combine st[L][k] with st[R - 2^k + 1][k].
  • Sparse Tables are read-only: they do not support updates efficiently and use O(n log n) memory.
  • They are widely used to solve Range Minimum Query (RMQ) and to compute Lowest Common Ancestors (LCA) via Euler tour + RMQ.
  • They are not suitable for non-idempotent operations like sum if you need O(1) queries, though O(log n) queries via disjoint decomposition are possible.

Prerequisites

  • Arrays and indexing: Sparse Tables are built over arrays and use careful inclusive range indexing.
  • Logarithms (base 2): Choosing block sizes and building the lg[] table relies on floor(log2).
  • Associativity and Idempotency: These properties ensure correctness of the build and of the O(1) query that combines overlapping intervals.
  • Divide and Conquer intuition: The table is built by combining solutions for halves of power-of-two intervals.
  • DFS and Trees (for LCA): Euler tour + RMQ reduces LCA to a Sparse Table query.
  • GCD and Bitwise operations: Common idempotent operations supported by Sparse Tables.
  • Segment Tree / Fenwick Tree (contrast): Helps decide when a Sparse Table is preferable for static queries.

Detailed Explanation


01 Overview

A Sparse Table is a data structure designed for fast range queries on an array when the array does not change. The main goal is to answer queries like “What is the minimum in A[L..R]?” extremely quickly after a one-time preprocessing step. Sparse Tables achieve this by precomputing results for intervals whose lengths are powers of two: 1, 2, 4, 8, and so on. These precomputed answers can then be combined to answer any query in constant time for a broad class of operations. Typical operations include minimum, maximum, greatest common divisor (gcd), and bitwise AND/OR. These operations share two important properties: associativity (grouping does not matter) and idempotency (combining the same value with itself does not change it). With these, even if we combine two overlapping precomputed intervals, we still get the correct answer.
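
For a concrete (illustrative) picture, take the small array A = {5, 2, 7, 3} with min as the operation. The precomputed power-of-two levels are:

  length 1: 5 2 7 3   (the array itself)
  length 2: 2 2 3     (minimum of each adjacent pair)
  length 4: 2         (minimum of the whole array)

A query such as the minimum over A[1..3] then overlaps the length-2 block starting at index 1 (value 2) with the length-2 block ending at index 3 (value 3), giving min(2, 3) = 2.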

The power of the Sparse Table is its simplicity and speed on static data: preprocessing takes O(n log n) time and O(n log n) memory, while each query is answered in O(1) time when the operation is idempotent. This makes it ideal in competitive programming scenarios with many queries and no updates. It is also a key building block for solving Lowest Common Ancestor (LCA) queries on trees using an Euler tour and RMQ (Range Minimum Query).

02 Intuition & Analogies

Imagine you run a warehouse where items are stored in a single long row (your array). Customers ask, “What is the smallest item between shelves L and R?” Instead of scanning the whole stretch each time, you prepare boxes in advance. First, you make tiny boxes of length 1 for every shelf, then bigger boxes that cover every pair of shelves (length 2), then boxes covering 4 shelves, then 8, and so on—always doubling. Each box stores the answer to the question for that exact stretch (like the smallest item inside that box).

Now when a customer asks for the minimum over shelves L..R, you choose the largest box size that still fits inside the range—say size 2^k. You can place one such box at the left end (starting at L) and another identical-sized box at the right end (ending at R). These two boxes will overlap in the middle, but that’s fine if your operation is idempotent. For example, with “min,” double-counting items in the overlap doesn’t change the minimum. So you just take the minimum of the two box answers and you’re done—no need to open all shelves.

This “power-of-two boxing” is the essence of Sparse Tables. The toolbox works whenever combining overlapping coverage is safe (idempotent). If overlapping would distort the result (like for sum), the trick doesn’t give O(1) queries, though you can still piece together several non-overlapping boxes to get O(log n) time, or use a different structure like a segment tree.

03 Formal Definition

Let A[0..n-1] be an array and let f be a binary operation on the array’s elements. We require f to be associative: f(f(x, y), z) = f(x, f(y, z)). For O(1) queries with the standard Sparse Table method, we also need idempotency: f(x, x) = x (typical examples: min, max, gcd, bitwise AND/OR). Define the table st where st[i][j] stores the value of f over the interval A[i..i + 2^j - 1]. The base level is st[i][0] = A[i]. For j ≥ 1, we compute using the recurrence st[i][j] = f(st[i][j-1], st[i + 2^{j-1}][j-1]). This describes combining two adjacent intervals of length 2^{j-1} to cover length 2^j. For a query over [L, R], let k = ⌊log2(R − L + 1)⌋. Then the standard O(1) answer for idempotent f is f(st[L][k], st[R - 2^k + 1][k]). These two precomputed blocks of length 2^k cover the entire range [L, R] with overlap; idempotency ensures correctness despite the overlap. Preprocessing st takes O(n log n) time and O(n log n) memory. Query time is O(1) under idempotency; otherwise, combining O(log n) disjoint blocks is possible but not typically the motivation for Sparse Tables.
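
As a quick worked example with illustrative numbers: for a query on [2, 6], the length is 5, so k = ⌊log2 5⌋ = 2, and the answer is f(st[2][2], st[3][2]). The first block covers A[2..5], the second covers A[3..6], and their overlap on A[3..5] is exactly what idempotency makes harmless.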

04 When to Use

Use a Sparse Table when: (1) the array is static (no updates), (2) you have many range queries, and (3) the operation is associative and idempotent. Classic cases include Range Minimum Query (RMQ), Range Maximum Query, Range GCD, and bitwise AND/OR queries. It’s also excellent for Lowest Common Ancestor (LCA) in trees: perform an Euler tour that records the first occurrence of each node and its depth along the tour, then answer LCA queries as RMQs on depths.

Prefer a Sparse Table over a Segment Tree when you care about constant-time queries and don’t need updates. Segment Trees give O(log n) queries and support updates; Sparse Tables use more memory but answer faster per query and are simpler to implement. For sums or other non-idempotent operations, a standard Sparse Table does not give O(1) queries, though you can either do O(log n) queries via a disjoint power-of-two decomposition, use prefix sums for sum specifically, or consider a Disjoint Sparse Table variant.
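
To make the O(log n) fallback concrete, here is a minimal illustrative sketch (the SumSparseTable name and sample values are made up for this example): the build uses the same adjacent-block recurrence as before, but the query walks the range greedily with disjoint power-of-two blocks instead of two overlapping ones.

#include <bits/stdc++.h>
using namespace std;

// Illustrative sketch: range sums with a sparse table in O(log n) per query.
// The build is the usual adjacent-block recurrence; only the query differs,
// because overlapping blocks would double-count a non-idempotent operation.
struct SumSparseTable {
    int n, K;
    vector<int> lg;
    vector<vector<long long>> st;  // st[j][i] = sum of A[i .. i + 2^j - 1]

    SumSparseTable(const vector<long long>& a) {
        n = (int)a.size();
        K = 0;
        while ((1 << K) <= n) K++;
        lg.assign(n + 1, 0);
        for (int i = 2; i <= n; ++i) lg[i] = lg[i / 2] + 1;
        st.assign(K, vector<long long>(n));
        for (int i = 0; i < n; ++i) st[0][i] = a[i];
        for (int j = 1; j < K; ++j)
            for (int i = 0; i + (1 << j) - 1 < n; ++i)
                st[j][i] = st[j - 1][i] + st[j - 1][i + (1 << (j - 1))];
    }

    // Inclusive [L, R]: greedily take the largest block that starts at L.
    long long query(int L, int R) const {
        long long res = 0;
        while (L <= R) {
            int k = lg[R - L + 1];
            res += st[k][L];
            L += 1 << k;
        }
        return res;
    }
};

int main() {
    vector<long long> A = {3, 1, 4, 1, 5, 9, 2, 6};
    SumSparseTable st(A);
    cout << st.query(2, 5) << "\n";  // 4 + 1 + 5 + 9 = 19
    return 0;
}

For sums specifically, a plain prefix-sum array is simpler and faster; the sketch mainly shows that idempotency is only needed for the overlapping two-block query, not for the build.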

Avoid Sparse Tables if memory is tight (O(n log n)) or if you need to handle point/range updates; use Fenwick Trees or Segment Trees instead. When n is small or the number of queries is tiny, simple scanning or prefix techniques might be enough.

⚠️Common Mistakes

  • Forgetting idempotency: Using sum or XOR and expecting O(1) queries with the “two overlapping blocks” trick is incorrect. For non-idempotent operations, either accept O(log n) by combining disjoint blocks or use another data structure.
  • Off-by-one errors: Queries are usually inclusive ranges [L, R]. Ensure the second block starts at R - 2^k + 1; missing the +1 is a common bug.
  • Wrong log table: Precompute an integer log table lg[] with lg[1] = 0 and lg[i] = lg[i/2] + 1 for i ≥ 2. Using floating logs or forgetting that lg[0] is undefined leads to crashes or wrong k values.
  • Memory overuse: st has about n(⌊log2 n⌋ + 1) entries. For large n (e.g., 2e5), ensure memory fits. Consider using a vector of vectors or a flat array and the smallest suitable type.
  • Tie-breaking with argmin/argmax: If you store values only, you might lose which index achieved the min/max. To return positions, store pairs (value, index) and define a comparator that breaks ties deterministically (see the sketch after this list).
  • Mixing 0-based and 1-based indexing: Decide on one convention for both array and queries. Converting carelessly between them is a common source of bugs.
  • Not handling n = 1 or short ranges: Ensure R ≥ L and your log computation and table building handle tiny arrays without accessing negative indices.
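
A minimal illustrative sketch of the pair-based tie-breaking (0-based, inclusive queries; std::pair’s default comparison already breaks ties toward the smaller index):

#include <bits/stdc++.h>
using namespace std;

// Illustrative argmin sketch: store (value, index) pairs so a query returns
// the position of the minimum; pair comparison breaks ties toward the
// smaller index.
int main() {
    vector<int> A = {5, 2, 7, 2, 6};
    int n = (int)A.size();
    int K = 0;
    while ((1 << K) <= n) K++;
    vector<int> lg(n + 1, 0);
    for (int i = 2; i <= n; ++i) lg[i] = lg[i / 2] + 1;

    // st[j][i] = (min value, its index) over A[i .. i + 2^j - 1]
    vector<vector<pair<int,int>>> st(K, vector<pair<int,int>>(n));
    for (int i = 0; i < n; ++i) st[0][i] = {A[i], i};
    for (int j = 1; j < K; ++j)
        for (int i = 0; i + (1 << j) - 1 < n; ++i)
            st[j][i] = min(st[j - 1][i], st[j - 1][i + (1 << (j - 1))]);

    // Two overlapping blocks, as usual for the idempotent min
    auto query = [&](int L, int R) {
        int k = lg[R - L + 1];
        return min(st[k][L], st[k][R - (1 << k) + 1]);
    };

    auto [val, idx] = query(1, 4);
    cout << "min value " << val << " at index " << idx << "\n";  // 2 at index 1
    return 0;
}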

Key Formulas

Base Level

st[i][0] = A[i]

Explanation: Each interval of length 1 just stores the element itself. This initializes the sparse table.

Sparse Table Recurrence

st[i][j] = f(st[i][j-1], st[i + 2^{j-1}][j-1])

Explanation: To build answers for length 2^j, combine two adjacent intervals of length 2^{j-1}. This uses associativity of f.

Query Block Size

k = ⌊log2(R − L + 1)⌋

Explanation: Pick the largest power-of-two block length that fits entirely within the query range. This minimizes the number of blocks used.

Two-Block Query

answer(L, R) = f(st[L][k], st[R − 2^k + 1][k])

Explanation: Answer a range query by combining two overlapping blocks that together cover [L, R]. Idempotency ensures correctness despite overlap.

Space Complexity

Space = O(n log n)

Explanation: We store one table for each power-of-two length up to the largest that fits in n, leading to O(n log n) space.

Build Complexity

Build time = O(n log n)

Explanation: Each level j computes roughly n entries using O(1) combinations, and there are ⌊log2 n⌋ + 1 levels.

Query Complexity

Query time = O(1) (for idempotent f)

Explanation: With idempotency, combining two precomputed blocks gives the answer immediately in constant time.

Idempotent Property

f(x, x) = x

Explanation: Combining overlapping coverage does not change the result because duplicates have no effect, critical for the O(1) query trick.

RMQ Definition

RMQ(L, R) = min{ A[i] : L ≤ i ≤ R }

Explanation: The Range Minimum Query returns the smallest array value within indices L through R inclusive.

LCA via RMQ

LCA(u, v) = the node of minimum depth among euler[first[u] .. first[v]] (swap the endpoints if first[u] > first[v])

Explanation: After an Euler tour recording (depth, node), the LCA is the node with minimum depth between the first occurrences of u and v in the tour.

Complexity Analysis

Let n be the array size and K = ⌊log2 n⌋. The sparse table has K + 1 levels, one for each power-of-two length 2^j for j = 0..K. At level j, we store entries for starting indices 0..n - 2^j, totaling n - 2^j + 1 ≈ O(n) entries per level. Summed over all j, the number of stored items is ∑_{j=0}^{K} (n - 2^j + 1) = O(nK) = O(n log n). Thus, space complexity is O(n log n). Preprocessing time is also O(n log n). The base level copies the array in O(n). Each higher level j computes every st[i][j] from two already-built values st[i][j-1] and st[i+2^{j-1}][j-1] using one O(1) combine. With about n entries per level and K levels, the total is O(n log n). For idempotent operations (min, max, gcd, AND, OR), queries run in O(1): choose k = ⌊log2 (R − L + 1)⌋ and return f(st[L][k], st[R − 2^k + 1][k]). This works because the two blocks overlap safely under idempotency. For non-idempotent but associative operations (like sum or XOR), the two-block method is invalid due to double-counting; however, you can decompose [L, R] into O(log n) disjoint power-of-two blocks, yielding O(log n) per query. In practice, for sums, prefix sums or Fenwick/Segment Trees are preferable. The Sparse Table is static: updates would require rebuilding affected levels, which in the worst case is O(n) to O(n log n), so it is generally unsuitable for dynamic scenarios.

Code Examples

Sparse Table for Range Minimum Query (RMQ) with O(1) Queries
#include <bits/stdc++.h>
using namespace std;

// Generic Sparse Table for an idempotent, associative operation.
// Here we use it for RMQ (min), but it works for max, gcd, AND, OR as well.

template <class T, class F>
struct SparseTable {
    int n, K;              // array size and number of levels (powers of two)
    vector<int> lg;        // floor(log2) table for 1..n
    vector<vector<T>> st;  // st[j][i] stores f over [i, i + 2^j - 1]
    F op;                  // combining function (e.g., min)

    SparseTable(const vector<T>& a, F comb) : op(comb) { build(a); }

    void build(const vector<T>& a) {
        n = (int)a.size();
        K = 0;
        while ((1 << K) <= n) K++;
        lg.assign(n + 1, 0);
        for (int i = 2; i <= n; ++i) lg[i] = lg[i / 2] + 1;
        st.assign(K, vector<T>(n));
        // level 0 is the array itself
        for (int i = 0; i < n; ++i) st[0][i] = a[i];
        // build higher levels from two adjacent half-length intervals
        for (int j = 1; j < K; ++j) {
            int len = 1 << j;
            int half = len >> 1;
            for (int i = 0; i + len - 1 < n; ++i) {
                st[j][i] = op(st[j - 1][i], st[j - 1][i + half]);
            }
        }
    }

    // Query on inclusive range [L, R]
    T query(int L, int R) const {
        if (L > R) swap(L, R);
        int k = lg[R - L + 1];
        // Combine two overlapping blocks of size 2^k
        return op(st[k][L], st[k][R - (1 << k) + 1]);
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Example array
    vector<int> A = {5, 2, 7, 3, 6, 1, 4, 9};

    // Build Sparse Table for minimum
    auto mn = [](int a, int b) { return min(a, b); };
    SparseTable<int, decltype(mn)> st(A, mn);

    // Sample queries on inclusive ranges
    vector<pair<int,int>> queries = {{0, 3}, {2, 5}, {0, 7}, {5, 5}};
    for (auto [L, R] : queries) {
        cout << "min(" << L << ", " << R << ") = " << st.query(L, R) << "\n";
    }

    return 0;
}

We implement a generic Sparse Table that stores answers for intervals whose lengths are powers of two. For RMQ, the operation is min. During a query [L, R], we pick k = floor(log2(R-L+1)) and combine st[k][L] with st[k][R-2^k+1]. Since min is idempotent and associative, overlapping is safe and the result is correct in O(1) time.

Time: Preprocessing O(n log n), Query O(1). Space: O(n log n)

Range GCD Queries with Sparse Table (O(1) per query)
#include <bits/stdc++.h>
using namespace std;

// Sparse Table for GCD. gcd is associative and idempotent,
// so the same two-overlapping-block query works.

struct GCDSparseTable {
    int n, K;
    vector<int> lg;
    vector<vector<int>> st;  // st[j][i] = gcd of A[i .. i + 2^j - 1]

    GCDSparseTable() {}
    GCDSparseTable(const vector<int>& a) { build(a); }

    void build(const vector<int>& a) {
        n = (int)a.size();
        K = 0;
        while ((1 << K) <= n) K++;
        lg.assign(n + 1, 0);
        for (int i = 2; i <= n; ++i) lg[i] = lg[i / 2] + 1;
        st.assign(K, vector<int>(n));
        for (int i = 0; i < n; ++i) st[0][i] = a[i];
        for (int j = 1; j < K; ++j) {
            int half = 1 << (j - 1);
            for (int i = 0; i + (1 << j) - 1 < n; ++i) {
                st[j][i] = std::gcd(st[j - 1][i], st[j - 1][i + half]);
            }
        }
    }

    // Query on inclusive range [L, R]
    int query(int L, int R) const {
        if (L > R) swap(L, R);
        int k = lg[R - L + 1];
        return std::gcd(st[k][L], st[k][R - (1 << k) + 1]);
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    vector<int> A = {24, 18, 30, 12, 6, 9, 3, 15};
    GCDSparseTable st(A);

    vector<pair<int,int>> queries = {{0, 1}, {1, 4}, {0, 7}, {3, 6}};
    for (auto [L, R] : queries) {
        cout << "gcd(" << L << ", " << R << ") = " << st.query(L, R) << "\n";
    }
    return 0;
}

We specialize a sparse table for gcd using std::gcd. Because gcd is associative and idempotent, the same two-block query technique applies, giving O(1) time per query after O(n log n) preprocessing.

Time: Preprocessing O(n log n), Query O(1). Space: O(n log n)

LCA via Euler Tour + RMQ (Sparse Table on Depths)
#include <bits/stdc++.h>
using namespace std;

// Sparse Table that stores pairs (depth, node) and returns the pair with smaller depth.
struct MinDepthOp {
    pair<int,int> operator()(const pair<int,int>& a, const pair<int,int>& b) const {
        if (a.first != b.first) return a.first < b.first ? a : b;
        // Tie-break by node id for determinism (optional)
        return a.second < b.second ? a : b;
    }
};

struct SparseTable {
    int n, K;
    vector<int> lg;
    vector<vector<pair<int,int>>> st;  // st[j][i] = (depth, node) with minimum depth on [i, i + 2^j - 1]
    MinDepthOp op;

    SparseTable() {}

    void build(const vector<pair<int,int>>& a) {
        n = (int)a.size();
        K = 0;
        while ((1 << K) <= n) K++;
        lg.assign(n + 1, 0);
        for (int i = 2; i <= n; ++i) lg[i] = lg[i / 2] + 1;
        st.assign(K, vector<pair<int,int>>(n));
        for (int i = 0; i < n; ++i) st[0][i] = a[i];
        for (int j = 1; j < K; ++j) {
            int half = 1 << (j - 1);
            for (int i = 0; i + (1 << j) - 1 < n; ++i) {
                st[j][i] = op(st[j - 1][i], st[j - 1][i + half]);
            }
        }
    }

    // Inclusive range [L, R]
    pair<int,int> query(int L, int R) const {
        if (L > R) swap(L, R);
        int k = lg[R - L + 1];
        return op(st[k][L], st[k][R - (1 << k) + 1]);
    }
};

struct LCA {
    int n;
    vector<vector<int>> g;
    vector<int> depth;
    vector<int> first;            // first occurrence of each node in euler
    vector<pair<int,int>> euler;  // (depth, node)
    SparseTable st;               // RMQ over euler by depth

    LCA(int n) : n(n), g(n), depth(n, 0), first(n, -1) {}

    void add_edge(int u, int v) {
        g[u].push_back(v);
        g[v].push_back(u);
    }

    void dfs(int u, int p, int d) {
        depth[u] = d;
        first[u] = (int)euler.size();
        euler.push_back({d, u});
        for (int v : g[u]) if (v != p) {
            dfs(v, u, d + 1);
            // After returning from each child, record (d, u) again (Euler tour property)
            euler.push_back({d, u});
        }
    }

    void build(int root = 0) {
        euler.clear();
        dfs(root, -1, 0);
        st.build(euler);
    }

    int lca(int u, int v) const {
        int L = first[u];
        int R = first[v];
        return st.query(L, R).second;  // node id from the (depth, node) pair
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Build a sample tree
    int n = 9;
    LCA solver(n);
    // edges (undirected)
    solver.add_edge(0, 1);
    solver.add_edge(0, 2);
    solver.add_edge(1, 3);
    solver.add_edge(1, 4);
    solver.add_edge(2, 5);
    solver.add_edge(2, 6);
    solver.add_edge(4, 7);
    solver.add_edge(4, 8);

    solver.build(0);  // root at 0

    vector<pair<int,int>> queries = {{3, 7}, {7, 8}, {5, 6}, {3, 5}, {3, 4}};
    for (auto [u, v] : queries) {
        cout << "LCA(" << u << ", " << v << ") = " << solver.lca(u, v) << "\n";
    }
    return 0;
}

We compute LCA using an Euler tour: during DFS we push (depth, node) on entry and after each return. The first occurrence of each node in this Euler list bounds the RMQ range. The LCA of u and v is the node with minimum depth between first[u] and first[v] in the Euler sequence, which we answer in O(1) using a Sparse Table over pairs (depth, node).

Time: Preprocessing O(n log n) after an O(n) DFS, Query O(1). Space: O(n log n) for the Sparse Table plus O(n) for the tree and arrays