2D Prefix Sum
Key Points
- •A 2D prefix sum (also called an integral image) lets you compute the sum of any axis-aligned sub-rectangle in constant time after O(nm) preprocessing.
- •The key idea is inclusion–exclusion: sums up to the top-left are added twice, so you subtract one overlap to correct it.
- •Use 1-based indexing with a zero-padded top row and left column to eliminate boundary checks and off-by-one errors.
- •To query the sum of [r1..r2] × [c1..c2], combine four prefix values to get the answer in O(1).
- •For many rectangle updates (adding a value to all cells in a rectangle), use a 2D difference array (2D Imos) and reconstruct with a 2D prefix at the end.
- •Always use 64-bit integers (long long) if values or dimensions can be large to avoid overflow.
- •Building the prefix is O(nm) time and O(nm) space; each sum query is O(1), making it ideal for lots of queries.
- •This technique is widely used in competitive programming and image processing to speed up repeated area-sum computations.
Prerequisites
- →Arrays and 2D indexing — You need to understand how to access and iterate over matrix elements.
- →1D prefix sums — 2D prefix sums generalize the 1D idea; familiarity with 1D helps grasp the extension.
- →Inclusion–exclusion principle — Rectangle queries rely on adding and subtracting overlapping areas correctly.
- →Big-O notation — To evaluate the trade-off between preprocessing time and query speed.
- →Integer overflow and data types — Prefix sums can exceed 32-bit limits; using 64-bit integers is often necessary.
- →Input normalization and bounds checking — Ensures query coordinates are valid and prevents out-of-bounds access.
Detailed Explanation
Tap terms for definitions01Overview
A 2D prefix sum is a preprocessing technique for matrices or grids that allows you to compute the sum of any rectangular subarray in constant time. The method constructs an auxiliary matrix P where P[i][j] stores the sum of all elements in the original matrix A from the top-left corner (1,1) to (i,j), inclusive. Once P is built, the sum of any rectangle can be obtained using a small combination of these stored sums through inclusion–exclusion. The overall strategy is to trade one-time preprocessing work for fast queries: you spend O(nm) time to build P for an n-by-m grid, and each subsequent rectangle-sum query costs only O(1). This yields significant speed-ups when there are many queries. The technique is known under various names: 2D prefix sums, cumulative sums, summed-area tables, or integral images. It is especially useful in competitive programming, image processing (e.g., fast box blur), heatmap analysis, and any problem requiring repeated rectangular aggregations. Variants include max/min prefix structures and extensions that support range updates via the 2D difference array (also known as the 2D Imos method), where updates are applied lazily and finalized with another 2D prefix pass.
02Intuition & Analogies
Imagine a city grid of blocks. At each intersection, there’s a small donation box with some amount of money. If you routinely need to know how much money has been collected in any rectangular neighborhood, counting coin by coin each time is slow. Instead, you keep a running ledger that tells you, for every intersection, the total money in the whole rectangle from the city’s origin (top-left) to that point. With this ledger, to find the money in a particular neighborhood (a rectangle), you look up the total up to its bottom-right corner, then subtract the totals of the regions that lie above and to the left (because they were included too), and finally add back the overlapping top-left corner region that was subtracted twice. This is the same mental math you do when figuring out areas of overlapping shapes: add the big chunk, subtract the extras, and fix the double subtraction. The 2D difference array (update trick) has a similar intuition. Think of placing four signs at a rectangle’s corners: two pluses and two minuses. If you later walk the grid and keep a running cumulative tally first across rows and then down columns, the effects of these corner signs expand to fill the whole rectangle with the intended added value. This way you don’t have to add to every cell immediately—you just leave corner markers and let a final sweep propagate the updates everywhere they belong.
03Formal Definition
04When to Use
Use 2D prefix sums when you have many queries asking for the sum of values in axis-aligned rectangles of a static grid. Typical scenarios include competitive programming problems with large n, m (up to 10^3–10^4) and many queries (10^4–10^5), heatmap or density analysis where quick area sums are required, and image processing tasks like box blurs or integral image features where filters correspond to rectangular kernels. If you also need to perform many rectangle increment updates before answering queries, use the 2D difference (Imos) method: log all updates in O(1) each and finalize with a single O(nm) sweep, then answer queries via a prefix table. If both updates and queries interleave online (updates must be reflected immediately), 2D prefix sums alone are insufficient—you may need 2D Binary Indexed Trees (Fenwick) or 2D Segment Trees, which handle updates and queries in O(log^2 n). If memory is tight or values can overflow 32-bit integers, plan for 64-bit types and possibly coordinate compression. Choose 1-based indexing with padding to simplify boundaries and reduce bugs.
⚠️Common Mistakes
Common pitfalls include off-by-one indexing errors: mixing 0-based arrays with formulas that assume 1-based indexing leads to incorrect sums. A robust pattern is to store P with size (n+1)×(m+1) and keep row 0 and column 0 as zeros, then use 1-based indices for A when filling P. Another frequent error is forgetting the inclusion–exclusion sign pattern in queries: the formula must be + − − + in that exact arrangement; any deviation yields wrong answers. Many solutions also neglect integer overflow—prefix sums can be significantly larger than individual elements, so use 64-bit types (long long) for A, P, and intermediate results. In 2D difference arrays, placing the four corner updates incorrectly (mixing c2 with c2+1, or missing one corner) silently corrupts the final grid; always follow the +, −, −, + corner rule. Developers sometimes rebuild the prefix after every single update, losing the performance benefit; batch updates with a difference array and rebuild once. Finally, not normalizing query coordinates (ensuring r1 ≤ r2 and c1 ≤ c2) or not clamping them to valid ranges can cause out-of-bounds access or logic errors.
Key Formulas
2D Prefix Recurrence
Explanation: Each prefix sum equals the current cell plus the prefixes above and to the left, minus the double-counted overlap. This is the core relation used to build the 2D prefix table.
Rectangle Sum Query
Explanation: The sum of a submatrix is obtained by inclusion–exclusion using four prefix values. The signs are +, −, −, + in that order.
Time Complexity
Explanation: Building the prefix table touches each cell once, while each query uses a constant number of arithmetic operations regardless of n and m.
2D Difference Update Rule
Explanation: To add v to a rectangle, adjust four corners of the difference array. A subsequent 2D prefix over D spreads these values to exactly the intended rectangle.
Reconstruction After Updates
Explanation: After all updates are logged in D, compute its 2D prefix to get the total added value at each cell, then add it to the original A to obtain the final grid A'.
Space Complexity
Explanation: The prefix table requires storage proportional to the number of matrix cells. Using 1-based padding adds only O(n + m) extra, which is asymptotically the same.
k-by-k Block Sum
Explanation: The sum of the k×k submatrix with top-left at (i,j) is computed with four prefix lookups, enabling enumeration of all such blocks in O(nm).
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Build a 2D prefix sum with 1-based indexing and zero padding 5 // P[i][j] = sum of A[1..i][1..j] 6 7 int main() { 8 ios::sync_with_stdio(false); 9 cin.tie(nullptr); 10 11 int n, m, q; 12 // Input format: 13 // n m 14 // n lines of m integers: the grid A 15 // q (number of queries) 16 // q lines: r1 c1 r2 c2 (1-based, inclusive) 17 if (!(cin >> n >> m)) return 0; 18 vector<vector<long long>> A(n + 1, vector<long long>(m + 1, 0)); 19 for (int i = 1; i <= n; ++i) { 20 for (int j = 1; j <= m; ++j) { 21 cin >> A[i][j]; 22 } 23 } 24 cin >> q; 25 26 // Build prefix table P with zero-padded row 0 and column 0 27 vector<vector<long long>> P(n + 1, vector<long long>(m + 1, 0)); 28 for (int i = 1; i <= n; ++i) { 29 for (int j = 1; j <= m; ++j) { 30 P[i][j] = A[i][j] + P[i - 1][j] + P[i][j - 1] - P[i - 1][j - 1]; 31 } 32 } 33 34 auto rect_sum = [&](int r1, int c1, int r2, int c2) -> long long { 35 // Normalize coordinates (ensure r1 <= r2 and c1 <= c2) 36 if (r1 > r2) swap(r1, r2); 37 if (c1 > c2) swap(c1, c2); 38 // Clamp to valid range if needed (optional safety) 39 r1 = max(1, min(r1, n)); r2 = max(1, min(r2, n)); 40 c1 = max(1, min(c1, m)); c2 = max(1, min(c2, m)); 41 // Inclusion–exclusion using the prefix table 42 return P[r2][c2] - P[r1 - 1][c2] - P[r2][c1 - 1] + P[r1 - 1][c1 - 1]; 43 }; 44 45 while (q--) { 46 int r1, c1, r2, c2; 47 cin >> r1 >> c1 >> r2 >> c2; 48 cout << rect_sum(r1, c1, r2, c2) << '\n'; 49 } 50 return 0; 51 } 52
This program reads an n×m grid, constructs a zero-padded 2D prefix sum table P in O(nm), and answers each rectangle sum query in O(1) time using the inclusion–exclusion formula. Using 1-based indexing and padding avoids boundary checks for r1−1 and c1−1.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Apply many rectangle additions in O(1) each using a 2D difference array. 5 // After all updates, reconstruct the final grid and build a 2D prefix for queries. 6 7 int main() { 8 ios::sync_with_stdio(false); 9 cin.tie(nullptr); 10 11 int n, m; 12 if (!(cin >> n >> m)) return 0; 13 14 // Read initial grid A (can be all zeros if desired) 15 vector<vector<long long>> A(n + 1, vector<long long>(m + 1, 0)); 16 for (int i = 1; i <= n; ++i) { 17 for (int j = 1; j <= m; ++j) { 18 cin >> A[i][j]; 19 } 20 } 21 22 int u; // number of updates 23 cin >> u; 24 25 // Difference array with extra padding to allow r2+1, c2+1 updates safely 26 vector<vector<long long>> D(n + 2, vector<long long>(m + 2, 0)); 27 28 // Each update: r1 c1 r2 c2 v (add v to all A[r1..r2][c1..c2]) 29 for (int k = 0; k < u; ++k) { 30 int r1, c1, r2, c2; long long v; 31 cin >> r1 >> c1 >> r2 >> c2 >> v; 32 if (r1 > r2) swap(r1, r2); 33 if (c1 > c2) swap(c1, c2); 34 // Corner updates: + - - + 35 D[r1][c1] += v; 36 D[r1][c2 + 1] -= v; 37 D[r2 + 1][c1] -= v; 38 D[r2 + 1][c2 + 1] += v; 39 } 40 41 // Reconstruct the total added value at each cell by 2D prefix over D 42 for (int i = 1; i <= n; ++i) { 43 for (int j = 1; j <= m; ++j) { 44 D[i][j] = D[i][j] + D[i - 1][j] + D[i][j - 1] - D[i - 1][j - 1]; 45 } 46 } 47 48 // Final grid after updates: A' = A + D (D now holds cumulative additions) 49 vector<vector<long long>> B(n + 1, vector<long long>(m + 1, 0)); 50 for (int i = 1; i <= n; ++i) { 51 for (int j = 1; j <= m; ++j) { 52 B[i][j] = A[i][j] + D[i][j]; 53 } 54 } 55 56 // Build prefix table on the updated grid B 57 vector<vector<long long>> P(n + 1, vector<long long>(m + 1, 0)); 58 for (int i = 1; i <= n; ++i) { 59 for (int j = 1; j <= m; ++j) { 60 P[i][j] = B[i][j] + P[i - 1][j] + P[i][j - 1] - P[i - 1][j - 1]; 61 } 62 } 63 64 int q; // number of queries on the updated grid 65 cin >> q; 66 67 auto rect_sum = [&](int r1, int c1, int r2, int c2) -> long long { 68 if (r1 > r2) swap(r1, r2); 69 if (c1 > c2) swap(c1, c2); 70 r1 = max(1, min(r1, n)); r2 = max(1, min(r2, n)); 71 c1 = max(1, min(c1, m)); c2 = max(1, min(c2, m)); 72 return P[r2][c2] - P[r1 - 1][c2] - P[r2][c1 - 1] + P[r1 - 1][c1 - 1]; 73 }; 74 75 while (q--) { 76 int r1, c1, r2, c2; 77 cin >> r1 >> c1 >> r2 >> c2; 78 cout << rect_sum(r1, c1, r2, c2) << '\n'; 79 } 80 81 return 0; 82 } 83
We batch-apply U rectangle updates using the 2D difference array D in O(1) per update. A single 2D prefix over D reconstructs the cumulative additions. We add this to A to get B and build a prefix table P over B to answer subsequent queries in O(1).
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Given an n x m grid and a block size k, compute the sum of every k x k submatrix. 5 // Outputs an (n-k+1) x (m-k+1) matrix of sums. 6 7 int main() { 8 ios::sync_with_stdio(false); 9 cin.tie(nullptr); 10 11 int n, m, k; 12 if (!(cin >> n >> m >> k)) return 0; 13 vector<vector<long long>> A(n + 1, vector<long long>(m + 1, 0)); 14 for (int i = 1; i <= n; ++i) { 15 for (int j = 1; j <= m; ++j) cin >> A[i][j]; 16 } 17 18 // Build 2D prefix 19 vector<vector<long long>> P(n + 1, vector<long long>(m + 1, 0)); 20 for (int i = 1; i <= n; ++i) { 21 for (int j = 1; j <= m; ++j) { 22 P[i][j] = A[i][j] + P[i - 1][j] + P[i][j - 1] - P[i - 1][j - 1]; 23 } 24 } 25 26 int out_n = n - k + 1, out_m = m - k + 1; 27 if (out_n <= 0 || out_m <= 0) { 28 // No k x k submatrix fits 29 cout << 0 << '\n'; 30 return 0; 31 } 32 33 vector<vector<long long>> S(out_n + 1, vector<long long>(out_m + 1, 0)); 34 35 auto rect_sum = [&](int r1, int c1, int r2, int c2) -> long long { 36 return P[r2][c2] - P[r1 - 1][c2] - P[r2][c1 - 1] + P[r1 - 1][c1 - 1]; 37 }; 38 39 for (int i = 1; i <= out_n; ++i) { 40 for (int j = 1; j <= out_m; ++j) { 41 int r1 = i, c1 = j; 42 int r2 = i + k - 1, c2 = j + k - 1; 43 S[i][j] = rect_sum(r1, c1, r2, c2); 44 } 45 } 46 47 // Output the (n-k+1) x (m-k+1) matrix of kxk sums 48 for (int i = 1; i <= out_n; ++i) { 49 for (int j = 1; j <= out_m; ++j) { 50 if (j > 1) cout << ' '; 51 cout << S[i][j]; 52 } 53 cout << '\n'; 54 } 55 56 return 0; 57 } 58
After building the 2D prefix table, each k×k submatrix sum is a constant-time four-corner lookup. Enumerating all top-left positions yields O(nm) total time, which is optimal for producing all outputs.