Prefix Sum and Difference Array
Key Points
- â˘Prefix sums precompute running totals so any range sum [l, r] can be answered in O(1) time as prefix[r] - prefix[l-1].
- â˘Difference arrays turn many range add updates into two O(1) operations per update and then rebuild the final array with a single prefix pass.
- â˘The 2D prefix sum extends the idea to grids, answering rectangle sums in O(1) using inclusionâexclusion.
- â˘Always be clear about indexing (0-based vs 1-based) and pad boundaries to avoid off-by-one bugs.
- â˘Use long long for sums to prevent integer overflow when n and values are large.
- â˘You can combine difference arrays (to apply many updates) and prefix sums (to answer many queries) efficiently in offline scenarios.
- â˘The approach generalizes to 3D and higher with 2^d inclusionâexclusion terms for hyper-rectangles.
- â˘Time is typically O(n) to build and O(1) per query, and O(m + n) to apply m range adds and rebuild the final array.
Prerequisites
- âArrays and Indexing â Understanding how to access and iterate arrays is essential for building prefix/difference arrays.
- âBig-O Notation â To reason about why preprocessing helps when answering many queries.
- âSummations â Prefix sums are defined using cumulative sums and summation notation.
- âInteger Overflow and Data Types â Sums can exceed 32-bit limits; choosing long long avoids silent overflow.
- âInclusionâExclusion Principle (basic) â 2D and higher prefix queries rely on combining corners with plus/minus signs.
Detailed Explanation
Tap terms for definitions01Overview
Hook: Imagine you keep a daily log of savings and repeatedly get asked, âHow much did you save from day 120 to day 450?â Re-adding the amounts each time is slow and error-prone. Concept: Prefix sums store cumulative totals so that any range total is just the difference of two stored values. Difference arrays flip the story: instead of adding to every element in a range (slow), you mark only the boundaries and later reconstruct the final values in one sweep. Example: Build a prefix array S where S[i] is the sum of the first i elements. Then the sum of a[l..r] is S[r] - S[l-1]. For many range adds, keep a diff array D with D[l] += d and D[r+1] -= d; a single prefix pass over D reconstructs all adds. Hook â Concept â Example: The hook is âanswer many range questions fastâ or âapply many range updates fast.â The concept is to precompute cumulative information (prefix) or to defer work to reconstruction (difference). The example is answering queries in O(1) after O(n) preprocessing or applying millions of range increments with just two array writes per update. These ideas extend naturally to 2D and 3D, letting you sum submatrices or subcubes instantly using inclusionâexclusion.
02Intuition & Analogies
Think of prefix sums like mile markers on a highway. If you want to know the distance from marker l to r, you just do total at r minus total at lâ1. You do not re-walk the road each time; the markers encode all the walking youâve done so far. Similarly, a bank passbook shows your cumulative balance after each day; the money spent between days l and r is the difference of the balances. Difference arrays are like placing signs at the start and end of a construction zone. Instead of repainting every lane in the zone one by one, you put a âstart painting hereâ sign at l and an âend painting hereâ sign after r. When a maintenance truck drives along the road once, it toggles painting on at l and off after r, effectively painting the whole interval in one pass. In arrays, D[l] += d signals âstart adding d,â and D[r+1] -= d signals âstop adding d.â A single left-to-right sweep accumulates these starts and stops into the final painted (updated) array. In 2D, imagine rainfall over a map. The 2D prefix sum at (i, j) is the total rainfall in the rectangle from the top-left corner to (i, j). The rain in any city-block rectangle is then four markers combined by inclusionâexclusion. Extending to 3D is like stacking layers: the rain in a 3D box involves 8 corners combined with plus/minus signs. The unifying intuition: precompute cumulative effects once, then answer many questions instantly; or record only boundaries of changes and consolidate later with one cumulative sweep.
03Formal Definition
04When to Use
Use prefix sums when you have many range sum queries on a static array or matrix and can afford one O(n) (or O(HW)) preprocessing pass. Examples: answering thousands of subarray-sum queries, counting elements within ranges in a frequency array, or computing cumulative costs over time. Use difference arrays when you have many range increment/decrement operations but only need the final array (or final queries) after all updates. Examples: adding values to intervals (e.g., increasing traffic counts or brightness over frame ranges), the Imos method in competitive programming for overlapping intervals, or updating coverage counts in sweep-line style problems. Combine both when you have two phases: first, apply many range updates efficiently with a difference array; second, answer many range sum queries efficiently with a prefix sum built from the updated array. This is especially effective in offline settings where all updates and queries are known beforehand. Use 2D prefix sums for fast rectangle queries in images, heatmaps, or any grid-like data. Extend to 3D for voxel volumes or time-varying grids when you need box sums. If you need online updates combined with online range queries, consider moving to Fenwick Trees or Segment Trees; but for purely offline or static scenarios, prefix/difference arrays are simpler and faster.
â ď¸Common Mistakes
⢠Off-by-one errors: Mixing 0-based vectors with 1-based formulas often causes S[lâ1] to read invalid memory. Fix: Pick one convention (commonly 1-based for formulas) and pad arrays with an index 0 guard. ⢠Forgetting the guard element: For difference arrays, updates at r+1 require D to have size n+2 to avoid out-of-bounds. Always allocate with extra space and set guards to zero. ⢠Incorrect reconstruction: After using difference updates, some forget to prefix-sum D to rebuild a. Without this step, the array is not updated. Fix: a[i] = a[iâ1] + D[i] in a forward pass. ⢠Overflow: Range sums can exceed 32-bit int when n or values are large. Use long long (64-bit) for prefix sums and while accumulating. ⢠Wrong inclusivity: Ensure intervals are inclusive [l, r]. If you treat r as exclusive inadvertently, your D[r+1] adjustment or submatrix inclusionâexclusion signs will be wrong. ⢠2D borders: Omitting P[0][] and P[][0] zero rows/columns complicates code. Fix: build P with one-based indexing and zero-padding so formulas work uniformly. ⢠Mixing online and offline: Difference arrays are not suited for interleaved queries and updates. If you must answer queries between updates, use a Binary Indexed Tree or Segment Tree instead. ⢠Uninitialized memory: Not setting arrays to zero (especially D and P) yields garbage results. Initialize with zeros before accumulation.
Key Formulas
Prefix Sum Definition
Explanation: S[i] stores the total of the first i elements. Having this precomputed lets you answer range sums with constant time differences.
Range Sum via Prefix
Explanation: The sum over an interval is the difference of two prefix totals. This is the key O(1) query formula after O(n) preprocessing.
Difference Array Definition
Explanation: D captures the change between consecutive elements. A guard at n+1 simplifies range updates at r+1.
Reconstruction from Difference
Explanation: A single prefix pass over D reconstructs the final array after all boundary updates are applied.
Range Add Update on D
Explanation: Adding d to every a[i] for i in [l, r] is encoded by starting d at l and stopping it after r. This makes each update O(1).
2D Prefix Sum Definition
Explanation: P[i][j] accumulates totals over the top-left rectangle. Zero-padding boundaries simplifies queries.
2D Rectangle Query
Explanation: Use inclusionâexclusion of four corners to get any axis-aligned rectangle sum in O(1).
3D Prefix Sum Definition
Explanation: Extends prefix sums to volumes. Box queries use 8-term inclusionâexclusion over the corners.
Complexity for 1D Prefix
Explanation: Building the prefix takes linear time; each query is constant time. This is ideal when there are many queries.
Complexity for Difference Array
Explanation: Each of the m range adds is O(1), and one linear pass reconstructs the final array.
Complexity Analysis
Code Examples
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Reads an array a[1..n], builds prefix sum S[0..n], 5 // then answers q queries of the form l r, printing sum a[l..r]. 6 // Indexing is 1-based with S[0] = 0 to avoid edge-case branches. 7 8 int main() { 9 ios::sync_with_stdio(false); 10 cin.tie(nullptr); 11 12 int n, q; 13 if (!(cin >> n >> q)) return 0; 14 vector<long long> a(n + 1, 0); // a[1..n] 15 for (int i = 1; i <= n; ++i) cin >> a[i]; 16 17 vector<long long> S(n + 1, 0); // S[0..n], S[0] = 0 18 for (int i = 1; i <= n; ++i) { 19 S[i] = S[i - 1] + a[i]; // cumulative sum 20 } 21 22 while (q--) { 23 int l, r; cin >> l >> r; // assume 1 <= l <= r <= n 24 long long ans = S[r] - S[l - 1]; // O(1) range sum 25 cout << ans << '\n'; 26 } 27 return 0; 28 } 29
We store cumulative totals in S so any interval [l, r] is answered by subtracting two prefix values. Using 1-based indexing with S[0] = 0 keeps l = 1 valid without extra checks.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Applies m updates of the form: add d to a[l..r], using a difference array D. 5 // Finally reconstructs the updated array and prints it. 6 // This demonstrates O(1) per update and O(n) reconstruction. 7 8 int main() { 9 ios::sync_with_stdio(false); 10 cin.tie(nullptr); 11 12 int n, m; 13 if (!(cin >> n >> m)) return 0; 14 15 vector<long long> a(n + 1, 0); // initial array a[1..n], can be zero or read in 16 for (int i = 1; i <= n; ++i) cin >> a[i]; 17 18 vector<long long> D(n + 2, 0); // D[1..n+1], guard at n+1 19 // Build initial difference from a 20 D[1] = a[1]; 21 for (int i = 2; i <= n; ++i) D[i] = a[i] - a[i - 1]; 22 D[n + 1] = 0; 23 24 // Apply m range-add updates 25 while (m--) { 26 int l, r; long long d; cin >> l >> r >> d; // add d to [l, r] 27 D[l] += d; 28 D[r + 1] -= d; // safe due to guard 29 } 30 31 // Reconstruct final array by prefixing D 32 vector<long long> res(n + 1, 0); 33 long long cur = 0; 34 for (int i = 1; i <= n; ++i) { 35 cur += D[i]; 36 res[i] = cur; 37 } 38 39 // Output final array 40 for (int i = 1; i <= n; ++i) { 41 if (i > 1) cout << ' '; 42 cout << res[i]; 43 } 44 cout << '\n'; 45 return 0; 46 } 47
Each range add writes only two positions in D. A single linear pass over D accumulates all increments into the final array. Building D from an initial a supports starting from nonzero values.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Builds a 2D prefix sum P for an HxW matrix A (1-based with zero padding), 5 // then answers Q rectangle sum queries (x1 y1 x2 y2) in O(1). 6 7 int main() { 8 ios::sync_with_stdio(false); 9 cin.tie(nullptr); 10 11 int H, W, Q; 12 if (!(cin >> H >> W >> Q)) return 0; 13 14 vector<vector<long long>> A(H + 1, vector<long long>(W + 1, 0)); 15 for (int i = 1; i <= H; ++i) { 16 for (int j = 1; j <= W; ++j) cin >> A[i][j]; 17 } 18 19 vector<vector<long long>> P(H + 1, vector<long long>(W + 1, 0)); 20 for (int i = 1; i <= H; ++i) { 21 long long rowSum = 0; // optional micro-optimization 22 for (int j = 1; j <= W; ++j) { 23 rowSum += A[i][j]; 24 P[i][j] = P[i - 1][j] + rowSum; // equivalent to full 2D prefix 25 } 26 } 27 28 while (Q--) { 29 int x1, y1, x2, y2; cin >> x1 >> y1 >> x2 >> y2; // 1-based, x1<=x2, y1<=y2 30 long long ans = P[x2][y2] - P[x1 - 1][y2] - P[x2][y1 - 1] + P[x1 - 1][y1 - 1]; 31 cout << ans << '\n'; 32 } 33 return 0; 34 } 35
P[i][j] holds the sum over the top-left rectangle to (i, j). Inclusionâexclusion of four corner prefixes yields any rectangle sum in constant time. Zero padding at row 0 and column 0 removes boundary branches.
1 #include <bits/stdc++.h> 2 using namespace std; 3 4 // Given n, initial array a, m range-add updates, and q queries [l,r], 5 // apply all updates efficiently with a difference array, rebuild final a, 6 // then answer all queries using a prefix sum. 7 8 int main() { 9 ios::sync_with_stdio(false); 10 cin.tie(nullptr); 11 12 int n, m, q; 13 if (!(cin >> n >> m >> q)) return 0; 14 15 vector<long long> a(n + 1, 0); 16 for (int i = 1; i <= n; ++i) cin >> a[i]; 17 18 // Difference array for range adds 19 vector<long long> D(n + 2, 0); 20 D[1] = a[1]; 21 for (int i = 2; i <= n; ++i) D[i] = a[i] - a[i - 1]; 22 23 while (m--) { 24 int l, r; long long d; cin >> l >> r >> d; 25 D[l] += d; 26 D[r + 1] -= d; 27 } 28 29 // Reconstruct final array 30 vector<long long> finalA(n + 1, 0); 31 long long cur = 0; 32 for (int i = 1; i <= n; ++i) { 33 cur += D[i]; 34 finalA[i] = cur; 35 } 36 37 // Build prefix for final array 38 vector<long long> S(n + 1, 0); 39 for (int i = 1; i <= n; ++i) S[i] = S[i - 1] + finalA[i]; 40 41 // Answer q range sum queries 42 while (q--) { 43 int l, r; cin >> l >> r; 44 long long ans = S[r] - S[l - 1]; 45 cout << ans << '\n'; 46 } 47 48 return 0; 49 } 50
This combines both techniques: apply all range updates in O(m) using a difference array, reconstruct the final array in O(n), then answer each range sum in O(1) using a prefix sum. It is optimal for offline batches of updates and queries.