🗂️Data StructureIntermediate

2D Fenwick Tree

Key Points

  • A 2D Fenwick Tree (Binary Indexed Tree) supports point updates and rectangle sum queries in O(log n × log m) time.
  • It generalizes the 1D BIT by nesting a BIT over y inside each x-node’s BIT trajectory.
  • Rectangle sum queries use inclusion–exclusion on 2D prefix sums: four calls to the 2D prefix function.
  • Memory can be O(nm) if fully dense, but can be reduced with per-node y-coordinate compression to about O(K log n) where K is the number of points updated.
  • You can also support rectangle updates with point queries by treating the structure as a 2D difference array and updating four corners.
  • When updates/queries are offline, a sweep line across x with a 1D BIT over y is a powerful alternative.
  • Careful 1-based indexing and inclusive bounds are essential to avoid off-by-one errors.
  • Use long long for sums to avoid overflow when many updates accumulate.

Prerequisites

  • 1D Fenwick Tree (Binary Indexed Tree)The 2D version nests 1D BITs; understanding LSB traversal and prefix sums in 1D is essential.
  • Prefix sums and inclusion–exclusionRectangle queries are reduced to four prefix sums using inclusion–exclusion identities.
  • Bitwise operationsThe operations i += i & −i and i −= i & −i rely on understanding least significant bits.
  • Coordinate compressionNecessary to reduce memory when coordinates are large or the grid is sparse.
  • Offline algorithms and sweep lineProvides an alternative solution strategy when all queries are known beforehand.
  • Inclusive interval notationPrevents off-by-one errors in rectangle boundaries and updates.

Detailed Explanation

Tap terms for definitions

01Overview

A 2D Fenwick Tree (also called a 2D Binary Indexed Tree) is a data structure that efficiently maintains cumulative information (typically sums) over a 2D grid. It extends the classic 1D Fenwick Tree by introducing a second dimension: for each Fenwick node along the x-axis, we keep another Fenwick Tree that accumulates values along the y-axis. This nested approach preserves the logarithmic efficiency in both dimensions. The most common operations are point updates—adding a value to a single cell—and rectangle sum queries—asking for the sum of all values within a sub-rectangle. Both operations run in O(log n × log m), where n and m are the grid’s dimensions.

In competitive programming and algorithmic problems, 2D Fenwick Trees are particularly useful when the dataset is dynamic: elements change over time, and queries interleave with updates. While a naive 2D prefix-sum array can answer rectangle sums in O(1), it suffers from O(nm) updates, which is too slow for large grids. The 2D Fenwick Tree balances both operations efficiently. Although a straightforward implementation uses O(nm) memory, practical variants reduce memory significantly with coordinate compression or by building only where needed. When queries are known in advance (offline), a sweep line using a 1D Fenwick Tree can be even simpler and faster in practice.

02Intuition & Analogies

Imagine a city laid out on a grid of streets (x) and avenues (y). At some intersections, coins are dropped, and you want to quickly count how many coins lie inside any rectangular neighborhood. A brute force count would walk through every intersection inside the rectangle—too slow for big cities. A 2D prefix-sum map helps you count fast but is expensive to update when new coins are dropped because you must rewrite many cells.

The 2D Fenwick Tree is like having an organized set of coin counters stationed at strategic intersections. Along each row cluster (grouped by powers of two on the x-axis), you place a sub-network of counters along the y-axis, also grouped by powers of two. When you drop a coin at (x, y), you don’t shout to the entire city—only to the subset of counters who are responsible for that area. Thanks to the binary grouping, the number of counters you notify is logarithmic in each dimension.

When you need the total coins in a rectangle, you ask for four cumulative tallies (top-right cumulative areas) and combine them using inclusion–exclusion, like taking a big count, then subtracting the overlapping parts, and adding back the double-subtracted corner. This way, both adding coins and querying rectangles are fast because they involve contacting only O(log n × log m) counters. If the city is sparse (few intersections ever get coins), you don’t need counters at every single spot—you can compress the avenues tracked by each x-cluster to only those that matter, saving memory while keeping the same logic.

03Formal Definition

Let A be an n × m array with entries A[x][y]. Define the 2D prefix sum function S as S(x, y) = A[i][j]. A 2D Fenwick Tree stores partial sums enabling two primitives: pointUpdate(x, y, ), which sets + , and prefixSum(x, y), which returns S(x, y). Rectangle sums over [x1..x2] × [y1..y2] are computed by inclusion–exclusion: , y2) − S(x1−1, y2) − S(x2, y1−1) + S(x1−1, y1−1). The structure is implemented as a 2D array BIT where BIT[i][j] stores the sum over a sub-rectangle defined by least significant set bits of i and j. Specifically, BIT[i][j] aggregates A[p][q] where p ∈ (i − (i & −i) + 1 .. i) and q ∈ (j − (j & −j) + 1 .. j). The update operation iterates i along i += i & −i and for each i, iterates j along j += j & −j, adding to BIT[i][j]. The prefix sum iterates i along i -= i & −i and within iterates j along j -= j & −j, accumulating BIT[i][j]. Time complexity per update or query is O( n m) because each dimension progresses by clearing or setting the least significant bit, which takes O() steps. Space is O(nm) in the dense variant, with compressed variants reducing space based on the number of active coordinates.

04When to Use

Use a 2D Fenwick Tree when you need to mix many point updates with many rectangle sum queries on a grid. It shines when both operations are frequent and must be near-logarithmic. Typical applications include: counting points in axis-aligned rectangles, dynamic heatmaps or populations on grids, scoring submatrices, and some 2D inversion or dominance counting problems.

If updates are only point-based and queries are rectangle sums, the standard 2D BIT is ideal. If you instead need rectangle updates and point queries (e.g., painting a rectangle many times and asking the color intensity at a specific cell), the 2D BIT can be used as a 2D difference array with four corner updates.

When the grid is extremely large but only K cells are ever touched, prefer a compressed 2D BIT (Fenwick tree of compressed y-lists) to keep memory roughly proportional to K log n rather than n × m. If all queries are known beforehand (offline), consider a sweep line on x with a 1D BIT on y to answer many rectangle queries without building a full 2D structure. Finally, if you require range updates and range queries both in 2D online, consider a 2D segment tree or using multiple 2D BITs with more complex formulas—but be mindful of implementation overhead.

⚠️Common Mistakes

• Mixing 0-based and 1-based indexing: Fenwick Trees are simplest with 1-based indices. If your input is 0-based, shift by +1 everywhere, and shift back in I/O. • Incorrect rectangle bounds: Remember rectangles are inclusive. The formula S(x2, y2) − S(x1−1, y2) − S(x2, y1−1) + S(x1−1, y1−1) assumes x1 ≤ x2 and y1 ≤ y2. Guard against x1 = 1 or y1 = 1 by returning 0 for out-of-range prefixes. • Wrong corner signs for rectangle updates in the difference-array variant: To add \Delta to [x1..x2] × [y1..y2], apply +\Delta at (x1, y1), −\Delta at (x1, y2+1), −\Delta at (x2+1, y1), +\Delta at (x2+1, y2+1), checking boundaries. • Integer overflow: Sums can exceed 32-bit int when many updates accumulate. Use long long (64-bit) for BIT storage and return values. • Memory blow-up: A naive 2D array of size n × m may be impossible for large constraints. Use coordinate compression or a compressed 2D BIT (Fenwick tree of vectors) when only K points are active. • Forgetting to prebuild compressed structures: Per-node y-lists must be collected from all points that could ever be updated before processing online operations; otherwise, later updates may target missing coordinates. • Off-by-one in loops: Update loops use i += i & −i and query loops use i -= i & −i; using the wrong direction yields infinite loops or wrong results. Similarly for j.

Key Formulas

2D Prefix Sum

Explanation: This defines the cumulative sum in the top-left rectangle from (1,1) to (x,y). It is the fundamental quantity the 2D BIT can retrieve quickly.

Rectangle Sum via Inclusion–Exclusion

Explanation: A rectangle’s sum is computed from four prefix sums: subtract the two overlapping strips and add back the double-subtracted corner. This reduces rectangle queries to four prefix queries.

Fenwick Traversal

Explanation: Updates move forward along i += LSB, while queries move backward along i -= LSB. The same holds for the y-index j.

Per-Operation Time

Explanation: Both update and prefix sum touch O(log n) x-nodes and O(log m) y-nodes, yielding multiplicative logarithmic time.

Space Complexity

Explanation: A full 2D array of BIT nodes is nm. With coordinate compression per x-node and K active points, each point contributes to O(log n) x-nodes, giving about K log n stored y-entries.

2D Difference for Rectangle Add

Explanation: Adding to a rectangle in the original grid corresponds to four point updates in the 2D difference array. The value at (x,y) is then the 2D prefix sum over D.

Complexity Analysis

Let n and m denote the grid dimensions. In a dense 2D Fenwick Tree, update(x, y, Δ) iterates over O(log n) x-indices via i += i & −i. For each i, it iterates over O(log m) y-indices via j += j & −j to add Δ into BIT[i][j]. Therefore, each update is O(log n · log m). Similarly, prefixSum(x, y) runs in O(log n · log m) by traversing i -= i & −i and j -= j & −j, accumulating partial sums. A rectangle sum query requires four prefix sums, so it is O(log n · log m) with a constant factor of 4. Space for the dense structure is O(nm) because each of the n Fenwick rows has m Fenwick columns. This may be acceptable for n, m up to about 2000–4000 depending on memory limits, but not for very large grids. To reduce memory for sparse updates, a compressed 2D Fenwick stores, for each x-node i, only the y-coordinates that ever appear in operations. Each update at (x, y) affects O(log n) x-nodes; if a total of K points are updated over time, the overall number of stored y-entries is about O(K log n). Each per-node y-structure is itself a 1D Fenwick Tree, so updates and queries still take O(log n · log ), where is the size of the compressed y-list at node i; in the worst case . If all operations are offline, you can transform Q rectangle queries into O(Q) signed prefix queries at x2 and x1−1 and process them with a 1D BIT over y using a sweep line over x. This yields O((K + Q) log M) time, where M is the number of distinct y’s after compression, and O(M) space—often faster and lighter than a full 2D BIT.

Code Examples

Dense 2D Fenwick Tree: Point Update + Rectangle Sum Query
1#include <bits/stdc++.h>
2using namespace std;
3
4// 2D Fenwick Tree (dense): 1-based indices
5// Supports: add(x,y,delta), prefixSum(x,y), rectSum(x1,y1,x2,y2)
6
7struct Fenwick2D {
8 int n, m;
9 vector<vector<long long>> bit; // bit[i][j] stores partial sums
10
11 Fenwick2D(int n_, int m_) : n(n_), m(m_), bit(n_ + 1, vector<long long>(m_ + 1, 0)) {}
12
13 // Add delta to A[x][y]
14 void add(int x, int y, long long delta) {
15 for (int i = x; i <= n; i += i & -i) {
16 for (int j = y; j <= m; j += j & -j) {
17 bit[i][j] += delta;
18 }
19 }
20 }
21
22 // Sum over rectangle (1..x, 1..y)
23 long long prefixSum(int x, int y) const {
24 long long res = 0;
25 for (int i = x; i > 0; i -= i & -i) {
26 for (int j = y; j > 0; j -= j & -j) {
27 res += bit[i][j];
28 }
29 }
30 return res;
31 }
32
33 // Sum over rectangle (x1..x2, y1..y2), inclusive
34 long long rectSum(int x1, int y1, int x2, int y2) const {
35 if (x1 > x2) swap(x1, x2);
36 if (y1 > y2) swap(y1, y2);
37 x1 = max(x1, 1); y1 = max(y1, 1);
38 x2 = min(x2, n); y2 = min(y2, m);
39 if (x1 > x2 || y1 > y2) return 0;
40 long long A = prefixSum(x2, y2);
41 long long B = prefixSum(x1 - 1, y2);
42 long long C = prefixSum(x2, y1 - 1);
43 long long D = prefixSum(x1 - 1, y1 - 1);
44 return A - B - C + D;
45 }
46};
47
48int main() {
49 ios::sync_with_stdio(false);
50 cin.tie(nullptr);
51
52 int n = 5, m = 6;
53 Fenwick2D fw(n, m);
54
55 // Example operations
56 fw.add(2, 3, 5); // A[2][3] += 5
57 fw.add(4, 5, 7); // A[4][5] += 7
58 fw.add(1, 1, 2); // A[1][1] += 2
59
60 cout << fw.rectSum(1, 1, 2, 3) << "\n"; // sum of a 2x3 top-left block: 2 + 5 = 7
61 cout << fw.rectSum(3, 3, 5, 6) << "\n"; // includes A[4][5] = 7
62 cout << fw.rectSum(1, 1, 5, 6) << "\n"; // total sum: 2 + 5 + 7 = 14
63
64 return 0;
65}
66

This program builds a dense 2D Fenwick Tree over an n × m grid. The add function performs a point update in O(log n log m). The rectSum function computes the sum over any inclusive rectangle using four prefix sums and inclusion–exclusion. The example demonstrates three queries after three point updates.

Time: O(log n log m) per update/querySpace: O(n m)
2D Difference Variant: Rectangle Update + Point Query
1#include <bits/stdc++.h>
2using namespace std;
3
4// Use a 2D Fenwick Tree to store a 2D difference array D.
5// addRect(x1,y1,x2,y2,delta) performs four point updates in D.
6// pointValue(x,y) returns the original A[x][y] by 2D prefix sum over D.
7
8struct Fenwick2D {
9 int n, m;
10 vector<vector<long long>> bit;
11 Fenwick2D(int n_, int m_) : n(n_), m(m_), bit(n_ + 1, vector<long long>(m_ + 1, 0)) {}
12 void add(int x, int y, long long delta) {
13 if (x <= 0 || y <= 0) return; // ignore out-of-range
14 for (int i = x; i <= n; i += i & -i)
15 for (int j = y; j <= m; j += j & -j)
16 bit[i][j] += delta;
17 }
18 long long prefixSum(int x, int y) const {
19 x = min(x, n); y = min(y, m);
20 long long res = 0;
21 for (int i = x; i > 0; i -= i & -i)
22 for (int j = y; j > 0; j -= j & -j)
23 res += bit[i][j];
24 return res;
25 }
26};
27
28struct RectUpdatePointQuery2D {
29 int n, m;
30 Fenwick2D fw;
31 RectUpdatePointQuery2D(int n_, int m_) : n(n_), m(m_), fw(n_, m_) {}
32
33 // Add delta to all cells in rectangle [x1..x2] x [y1..y2]
34 void addRect(int x1, int y1, int x2, int y2, long long delta) {
35 if (x1 > x2) swap(x1, x2);
36 if (y1 > y2) swap(y1, y2);
37 // Apply 2D difference updates (with boundary checks)
38 fw.add(x1, y1, delta);
39 if (y2 + 1 <= m) fw.add(x1, y2 + 1, -delta);
40 if (x2 + 1 <= n) fw.add(x2 + 1, y1, -delta);
41 if (x2 + 1 <= n && y2 + 1 <= m) fw.add(x2 + 1, y2 + 1, delta);
42 }
43
44 // Value at a single point after all rectangle updates
45 long long pointValue(int x, int y) const {
46 return fw.prefixSum(x, y);
47 }
48};
49
50int main() {
51 ios::sync_with_stdio(false);
52 cin.tie(nullptr);
53
54 int n = 5, m = 5;
55 RectUpdatePointQuery2D ds(n, m);
56
57 ds.addRect(2, 2, 4, 4, 3); // Add 3 to a 3x3 centered block
58 ds.addRect(1, 1, 2, 3, 2); // Add 2 to top-left 2x3 rectangle
59
60 cout << ds.pointValue(1, 1) << "\n"; // 2
61 cout << ds.pointValue(2, 2) << "\n"; // 3 + 2 = 5
62 cout << ds.pointValue(4, 4) << "\n"; // 3
63 cout << ds.pointValue(5, 5) << "\n"; // 0
64
65 return 0;
66}
67

This code uses the 2D BIT as a 2D difference array. Rectangle updates are decomposed into four corner point updates with signs +, −, −, +. The pointValue query is a 2D prefix sum over the difference array, yielding the final value at (x, y).

Time: O(log n log m) per update/querySpace: O(n m)
Compressed 2D Fenwick Tree (Fenwick-of-Compressed-Y)
1#include <bits/stdc++.h>
2using namespace std;
3
4// Compressed 2D Fenwick Tree
5// Build with known points (x,y) that may ever be updated/queried.
6// For each x-node along Fenwick path, store a compressed y-list and a 1D BIT.
7
8struct Fenwick2DCompressed {
9 int n; // x dimension size (max coordinate after compression or bound)
10 vector<vector<int>> ys; // ys[i]: sorted unique y-coordinates attached to x-node i
11 vector<vector<long long>> bit; // bit[i]: Fenwick tree over ys[i]
12
13 Fenwick2DCompressed(int n_) : n(n_), ys(n_ + 1), bit(n_ + 1) {}
14
15 // Register a point (x,y) so that data structures can be allocated.
16 // Call for all potential points BEFORE init().
17 void register_point(int x, int y) {
18 for (int i = x; i <= n; i += i & -i) {
19 ys[i].push_back(y);
20 }
21 }
22
23 // Must be called after all register_point calls, before updates/queries.
24 void init() {
25 for (int i = 1; i <= n; ++i) {
26 auto &v = ys[i];
27 sort(v.begin(), v.end());
28 v.erase(unique(v.begin(), v.end()), v.end());
29 bit[i].assign(v.size() + 1, 0LL); // 1-based Fenwick array over compressed y
30 }
31 }
32
33 // Internal: find 1-based index of y in ys[i] for update (lower_bound)
34 static int idx_update(const vector<int> &v, int y) {
35 return int(lower_bound(v.begin(), v.end(), y) - v.begin()) + 1;
36 }
37
38 // Internal: find 1-based index for query prefix (upper_bound)
39 static int idx_query(const vector<int> &v, int y) {
40 return int(upper_bound(v.begin(), v.end(), y) - v.begin());
41 }
42
43 // Add delta to point (x,y)
44 void add(int x, int y, long long delta) {
45 for (int i = x; i <= n; i += i & -i) {
46 int j = idx_update(ys[i], y);
47 // If y was never registered for this node, skip (defensive)
48 if (j <= 0 || j > (int)ys[i].size()) continue;
49 for (int k = j; k <= (int)ys[i].size(); k += k & -k) {
50 bit[i][k] += delta;
51 }
52 }
53 }
54
55 // Sum over rectangle (1..x, 1..y)
56 long long prefixSum(int x, int y) const {
57 long long res = 0;
58 for (int i = x; i > 0; i -= i & -i) {
59 int j = idx_query(ys[i], y);
60 for (int k = j; k > 0; k -= k & -k) {
61 res += bit[i][k];
62 }
63 }
64 return res;
65 }
66
67 long long rectSum(int x1, int y1, int x2, int y2) const {
68 if (x1 > x2) swap(x1, x2);
69 if (y1 > y2) swap(y1, y2);
70 long long A = prefixSum(x2, y2);
71 long long B = prefixSum(x1 - 1, y2);
72 long long C = prefixSum(x2, y1 - 1);
73 long long D = prefixSum(x1 - 1, y1 - 1);
74 return A - B - C + D;
75 }
76};
77
78int main() {
79 ios::sync_with_stdio(false);
80 cin.tie(nullptr);
81
82 int n = 10; // maximum x coordinate
83
84 // Suppose we will update only K points (x,y). We know them beforehand (or collect from input).
85 vector<pair<int,int>> points = {{2,5},{7,3},{4,9},{2,1}};
86
87 Fenwick2DCompressed fw(n);
88
89 // Register points for all Fenwick x-nodes
90 for (auto [x,y] : points) fw.register_point(x, y);
91
92 // If you plan to query rectangles up to certain y, also register those y's as needed.
93 // For typical point-update + rect-sum, registering update points is enough.
94
95 fw.init();
96
97 // Perform updates only on registered points
98 fw.add(2, 5, 10);
99 fw.add(7, 3, 4);
100 fw.add(4, 9, -2);
101 fw.add(2, 1, 7);
102
103 cout << fw.rectSum(1,1,2,5) << "\n"; // 10 + 7 = 17
104 cout << fw.rectSum(1,1,7,3) << "\n"; // includes (7,3)=4 and (2,1)=7 => 11
105 cout << fw.rectSum(4,9,4,9) << "\n"; // point (4,9) = -2
106
107 return 0;
108}
109

This implementation compresses y-coordinates per x-node. For each potential update point (x,y), we register y along the x Fenwick path, then build a Fenwick tree over the unique y’s at each node. Updates and queries proceed with lower_bound/upper_bound to map y to local indices. Memory is proportional to the number of distinct y’s attached to nodes, roughly O(K log n) for K updated points.

Time: O(log n log M) per operation, where M is the per-node y-list sizeSpace: O(K log n)
Offline Sweep Line with 1D Fenwick: Point Updates + Rectangle Sums
1#include <bits/stdc++.h>
2using namespace std;
3
4// Offline solution for point updates and rectangle sum queries.
5// Transform each rectangle query into two prefix queries at x2 (sign +1) and x1-1 (sign -1).
6// Process all events sorted by x; maintain a 1D BIT over compressed y.
7
8struct BIT1D {
9 int n; vector<long long> bit;
10 BIT1D(int n_=0): n(n_), bit(n_+1,0) {}
11 void add(int i, long long v){ for(; i<=n; i+= i&-i) bit[i]+=v; }
12 long long sum(int i){ long long r=0; for(; i>0; i-= i&-i) r+=bit[i]; return r; }
13};
14
15struct Query { int x1,y1,x2,y2; };
16
17int main(){
18 ios::sync_with_stdio(false);
19 cin.tie(nullptr);
20
21 // Example data: points and rectangle queries
22 vector<pair<int,int>> points = {{2,3},{4,5},{5,2},{7,4}}; // each point has weight 1 (can generalize)
23 vector<int> weight(points.size(), 1);
24 vector<Query> qs = { {1,1,3,4}, {4,2,7,5}, {5,1,7,3} };
25
26 // Coordinate compress y
27 vector<int> ys; ys.reserve(points.size()*2 + qs.size()*2);
28 for (auto &p: points) ys.push_back(p.second);
29 for (auto &q: qs){ ys.push_back(q.y1); ys.push_back(q.y2); }
30 sort(ys.begin(), ys.end()); ys.erase(unique(ys.begin(), ys.end()), ys.end());
31 auto getY = [&](int y){ return int(lower_bound(ys.begin(), ys.end(), y) - ys.begin()) + 1; };
32
33 struct Event{ int x; int type; int y; long long val; int qi; int y1c,y2c; }; // type: 0=add point, 1=query strip
34 vector<Event> evs; evs.reserve(points.size() + qs.size()*2);
35
36 // Add point events
37 for (int i=0;i<(int)points.size();++i){
38 evs.push_back({points[i].first, 0, getY(points[i].second), weight[i], -1, 0,0});
39 }
40
41 // Add query strip events at x2 (+1) and x1-1 (-1)
42 for (int i=0;i<(int)qs.size();++i){
43 int x1 = qs[i].x1, x2 = qs[i].x2;
44 int y1c = getY(qs[i].y1), y2c = getY(qs[i].y2);
45 evs.push_back({x2, 1, 0, +1, i, y1c, y2c});
46 evs.push_back({x1-1, 1, 0, -1, i, y1c, y2c});
47 }
48
49 // Sort events by x, tie-breaking so point adds (type 0) occur before queries at same x
50 sort(evs.begin(), evs.end(), [](const Event&a, const Event&b){
51 if (a.x != b.x) return a.x < b.x;
52 return a.type < b.type;
53 });
54
55 BIT1D bit((int)ys.size());
56 vector<long long> ans(qs.size(), 0);
57
58 for (auto &e: evs){
59 if (e.type == 0){
60 bit.add(e.y, e.val);
61 } else {
62 long long s = bit.sum(e.y2c) - bit.sum(e.y1c - 1);
63 ans[e.qi] += e.val * s;
64 }
65 }
66
67 // Output answers
68 for (auto v: ans) cout << v << "\n";
69
70 return 0;
71}
72

This offline approach avoids building a 2D BIT. Each rectangle query is decomposed into two prefix-on-x queries with signs +1 and −1. As we sweep x from left to right, we insert points into a 1D BIT over y and answer y-interval sums for queries whose x-boundary has been reached. The final rectangle answer is the signed combination of these two prefix results.

Time: O((K + Q) log M) after sorting, where K is number of points, Q is number of queries, M is distinct y’sSpace: O(M)