⚙️AlgorithmIntermediate

Mo's Algorithm

Key Points

  • Mo's algorithm answers many range queries offline by reordering them to minimize pointer movement along the array.
  • It keeps a sliding window [L, R] and supports O(1) or O(log n) add/remove operations at the boundaries to update the current answer.
  • Sorting queries by blocks of size about \(\) yields total movement and time about \(O((n+q))\).
  • It shines when the query answer can be updated incrementally (e.g., distinct count, equal pairs, frequency-based sums).
  • Coordinate compression is often required so frequency arrays are small and cache-friendly.
  • Variants include Mo with updates (time dimension) and Hilbert order for speed in practice.
  • Common pitfalls are incorrect add/remove symmetry, 1-based vs 0-based indices, and integer overflow in pair counts.
  • Mo is offline; if you need real-time answers or arbitrary updates, consider segment trees/Fenwick trees instead.

Prerequisites

  • Big-O notation and asymptotic analysisTo understand the O((n+q)√n) bound and how block size affects runtime.
  • Two-pointer/sliding window techniqueMo’s window updates are sliding one step at a time at both ends.
  • Sorting and custom comparators in C++Queries must be sorted by blocks, and a correct comparator ensures deterministic order.
  • Coordinate compressionAllows using compact frequency arrays for large value ranges.
  • Discrete mathematics: combinationsPair-count queries use k choose 2 = k(k−1)/2.
  • Hashing and frequency tablesCore to implementing O(1) add/remove via counts.
  • Segment tree and Fenwick tree (contrast)Helps decide when not to use Mo and what alternatives exist for online or update-heavy tasks.

Detailed Explanation

Tap terms for definitions

01Overview

Mo's algorithm is a technique to answer many range queries efficiently by reordering them. Imagine you have an array and many questions of the form: “What is the answer for subarray [l, r]?” If you answer each question from scratch, you repeat a lot of work. Mo’s algorithm avoids this by sorting the queries in a special order so that moving from one query’s interval to the next only changes the window’s edges slightly. Then, with fast add/remove operations at the edges, you can update the current answer incrementally. The key idea is block decomposition: split the index range into blocks of size about sqrt(n), and sort queries by the block of l, then by r. This ensures that across the entire sorted list, the window [L, R] doesn’t jump wildly too often, keeping the total pointer movement near O((n+q)√n). Because Mo is offline, you can’t process queries as they arrive—you must read them all first, sort them, and then compute answers. It’s commonly used for problems where the answer depends on element frequencies in the current range, like counting distinct values, counting equal pairs, or frequency-weighted sums. The algorithm is simple to implement and very effective in practice when add/remove are O(1).

02Intuition & Analogies

Think of a librarian answering questions about pages in a book. Each question asks: “From page l to page r, how many distinct words appear?” If the librarian opens to l and scans to r from scratch every time, it’s slow. A smarter approach is to order the questions so that the next question is close to the current one—maybe it starts on a nearby page and ends a little further. Then the librarian only flips a few pages forward or backward to adjust, and reuses previous counts. Mo’s algorithm is this strategy for arrays. We walk a window [L, R] across the array. Between consecutive queries, we slide the left edge L and the right edge R in small steps. Each one-step move adds or removes exactly one element at the border, so if we can update the current answer fast when a single element enters or leaves, we can keep the total cost low. Why block sorting? If we group l by blocks of size B ≈ sqrt(n), we can guarantee that l doesn’t change block too often—only O(n/B) times—and within each block, we sort by r, minimizing large jumps of the right edge. The total edge movement then becomes bounded by a near-linear factor times the block size. This is like visiting houses in a neighborhood by streets: first finish all houses on street 1 in increasing house numbers, then street 2, and so on—you walk far less than if you visit houses in random order.

03Formal Definition

Given an array A of length n and a set of q range queries Q = {(, )}, Mo’s algorithm answers all queries offline by: (1) choosing a block size B (typically B = ⌈√n⌉), (2) sorting queries by the pair (⌊l/B⌋, r) with an optional alternating order on r per block to reduce movement, and (3) processing them sequentially while maintaining a current interval [L, R] and a current answer Ans. We define two operations: add(x) which updates Ans when a single element x enters the interval, and remove(x) which updates Ans when x leaves. While moving from one query’s interval [l, r] to the next [l', r'], we adjust L and R using unit steps, invoking add/remove accordingly. If add and remove are O(1) or O(log n), the total cost is dominated by the total number of unit boundary moves. With block size B, the sorting guarantees at most O((n+q)·B + q·n/B) total moves. Choosing B = Θ(√n) balances the two terms, yielding O((n+q)√n) time plus O(q log q) for sorting. Space is O(n + U) where U is the universe size used by frequency bookkeeping, often reduced via coordinate compression.

04When to Use

Use Mo’s algorithm when: (1) you have many offline range queries over a static array; (2) the answer can be incrementally maintained by adding/removing a single boundary element in O(1) or O(log n); and (3) exact online order isn’t required. Typical applications include: counting distinct elements; counting pairs with equal values; computing sums like Σ f(freq[x]) over the current range; or maintaining bitwise OR/AND if add/remove can be updated quickly with counters. It also extends to Mo with updates (a third dimension “time” for point updates between queries) and Mo on trees (via Euler tour to turn path queries into intervals). Avoid Mo when queries are few (brute force may suffice), when add/remove are expensive (e.g., complex data structures per element), or when online/real-time answers are mandatory. For problems where answers compose associatively and support fast range aggregation (e.g., sum, min, max), segment trees or Fenwick trees may be simpler and faster. For dynamic range updates and queries, consider Fenwick/segment trees with lazy propagation or sqrt decomposition instead.

⚠️Common Mistakes

  • Incorrect add/remove symmetry: add(x) and remove(x) must be exact inverses on the maintained state. A tiny mismatch yields subtle wrong answers. Test with small arrays and random queries. - Indexing errors: Input is often 1-based; implementations are 0-based internally. Convert consistently and define whether ranges are inclusive [l, r]. - Block size choice: Using B = n or B = 1 destroys performance. Use B ≈ √n (or n^{2/3} for Mo with updates) and consider even–odd ordering of r to reduce pointer oscillation. - Missing coordinate compression: If values are large (e.g., up to 1e9), using them as frequency indices causes memory blowups and cache misses. Compress first. - Overflow in aggregated answers: Pair counts use combinations like cnt·(cnt−1)/2; store answers in 64-bit (long long) to avoid overflow. - Forgetting sorting stability conditions: The comparator for queries must define a total order; beware of ties and ensure deterministic ordering (e.g., by r and then id). - Slow I/O and unnecessary clearing: Don’t clear frequency arrays per query; reuse the sliding window. Use fast I/O for large inputs. - Assuming Mo is online: It’s offline. You must read all queries first and cannot output in the input order without storing answers and reordering back by id.

Key Formulas

Movement Cost Trade-off

Explanation: Total time is proportional to left-edge movement ((n+q)·B) plus right-edge movement (q·n/B). Choosing B balances these terms. This captures why B ≈ √n is optimal for standard Mo.

Optimal Block Size

Explanation: Minimizing the trade-off yields B on the order of √n when q≈n. In practice we use B = ⌈√n⌉, which is robust when q and n are same order.

Standard Mo Complexity

Explanation: With B ≈ √n and O(1) add/remove, total time is roughly (n+q)√n, plus O(q log q) for sorting. This is the common rule of thumb.

Pairs From Frequency

Explanation: If a value appears k times in a range, it contributes k choose 2 equal pairs. Summing over all values gives the total number of equal pairs.

Mo With Updates Tuning

Explanation: When adding a time dimension with m updates, sorting by (l-block, r-block, time) and using B ≈ balances movements in three axes to get the typical bound.

Frequency Functional

Explanation: Many Mo-compatible answers depend only on frequencies in the current window. The function f must support O(1) updates when a single frequency changes by ±1.

Space Usage

Explanation: We store the array and auxiliary frequency structures over a universe of size U (often the number of distinct values after compression).

Complexity Analysis

The performance of Mo’s algorithm hinges on bounding the total number of unit pointer moves when transitioning between consecutively processed queries. Partition the index set into blocks of size B. Sorting queries primarily by l-block ensures that the left pointer crosses block boundaries at most O times, and within each block, the left pointer varies by at most B between queries. Across q queries, this yields O((n+q)·B) adjustments attributable to managing left boundaries and ordering overhead. The right pointer, sorted secondarily by r, still may traverse across the entire array per l-block segment, accumulating O(q·n/B) movements in the worst case. Summing both effects gives T(B) = O((n+q)·B + q·n/B) if add/remove operations are O(1). Minimizing T(B) by setting B ≈ √n (for q and n of the same order) yields )√n). Sorting the queries contributes O(q log q), which is typically dominated for large n. Space complexity is O(n + U), where U is the number of distinct values after coordinate compression (for maintaining frequency arrays and any auxiliary structures like freq-of-freq when computing modes). In practice, cache behavior and constants matter: alternating r order within each l-block (the parity trick) and using Hilbert order can reduce constant factors substantially. For Mo with updates, movements extend to a third axis (time). Balancing l, r, and t steps with B ≈ yields an empirical and theoretical bound ) under O(1) add/remove/apply/rollback operations. If add/remove are O(log n), multiply the bounds by log n. Finally, note that extremely skewed q versus n may slightly change the optimal B, but √n works robustly in most contest settings.

Code Examples

Mo's Algorithm: Count distinct elements in range [l, r]
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Query {
5 int l, r, id;
6 int block_l; // precomputed block of l for faster sort
7};
8
9int main() {
10 ios::sync_with_stdio(false);
11 cin.tie(nullptr);
12
13 // Input format:
14 // n q
15 // a[0..n-1]
16 // q lines: l r (1-based, inclusive)
17 int n, q;
18 if (!(cin >> n >> q)) return 0;
19 vector<long long> a(n);
20 for (int i = 0; i < n; ++i) cin >> a[i];
21
22 // Coordinate compression so we can index frequency arrays compactly
23 vector<long long> vals = a;
24 sort(vals.begin(), vals.end());
25 vals.erase(unique(vals.begin(), vals.end()), vals.end());
26 vector<int> arr(n);
27 for (int i = 0; i < n; ++i) {
28 arr[i] = int(lower_bound(vals.begin(), vals.end(), a[i]) - vals.begin());
29 }
30 int U = (int)vals.size(); // universe size after compression
31
32 // Read queries and convert to 0-based inclusive indices
33 vector<Query> queries(q);
34 for (int i = 0; i < q; ++i) {
35 int l, r; cin >> l >> r; --l; --r;
36 queries[i] = {l, r, i, 0};
37 }
38
39 // Choose block size ~ sqrt(n)
40 int B = max(1, (int)sqrt((double)n));
41 for (auto &qr : queries) qr.block_l = qr.l / B;
42
43 // Sort by (block_l, r) with parity trick on r to reduce movement
44 sort(queries.begin(), queries.end(), [&](const Query &A, const Query &Bq){
45 if (A.block_l != Bq.block_l) return A.block_l < Bq.block_l;
46 if (A.block_l & 1) return A.r > Bq.r; // odd blocks: sort r descending
47 return A.r < Bq.r; // even blocks: sort r ascending
48 });
49
50 vector<int> cnt(U, 0); // frequency of each value in current window
51 vector<int> ans(q, 0);
52
53 auto add = [&](int x, int &cur){
54 // x is the compressed value
55 if (cnt[x] == 0) ++cur; // new distinct value enters the window
56 ++cnt[x];
57 };
58 auto remove_one = [&](int x, int &cur){
59 // remove one occurrence of x from the window
60 --cnt[x];
61 if (cnt[x] == 0) --cur; // value no longer present
62 };
63
64 int cur = 0; // current number of distinct values
65 int L = 0, R = -1; // current window is empty
66
67 for (const auto &qr : queries) {
68 int l = qr.l, r = qr.r;
69 // Expand / shrink to [l, r]
70 while (L > l) add(arr[--L], cur);
71 while (R < r) add(arr[++R], cur);
72 while (L < l) remove_one(arr[L++], cur);
73 while (R > r) remove_one(arr[R--], cur);
74 ans[qr.id] = cur;
75 }
76
77 // Output in input order
78 for (int i = 0; i < q; ++i) {
79 cout << ans[i] << '\n';
80 }
81 return 0;
82}
83

We compress values to 0..U-1 and maintain a frequency array cnt. The current answer is the number of values with nonzero frequency. As we reorder queries by blocks of l and then by r (with alternating r order), we slide the [L, R] window and update the answer via add/remove in O(1) per step. This yields about O((n+q)√n) time overall.

Time: O((n+q)√n) for processing plus O(q log q) for sortingSpace: O(n + U + q)
Mo's Algorithm: Count equal pairs inside [l, r]
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Query { int l, r, id, block_l; };
5
6int main(){
7 ios::sync_with_stdio(false);
8 cin.tie(nullptr);
9
10 int n, q; if(!(cin >> n >> q)) return 0;
11 vector<long long> a(n);
12 for(int i=0;i<n;++i) cin >> a[i];
13
14 // Coordinate compression
15 vector<long long> vals = a; sort(vals.begin(), vals.end());
16 vals.erase(unique(vals.begin(), vals.end()), vals.end());
17 vector<int> arr(n);
18 for(int i=0;i<n;++i) arr[i] = int(lower_bound(vals.begin(), vals.end(), a[i]) - vals.begin());
19 int U = (int)vals.size();
20
21 vector<Query> qs(q);
22 for(int i=0;i<q;++i){
23 int l,r; cin>>l>>r; --l; --r;
24 qs[i] = {l,r,i,0};
25 }
26
27 int B = max(1, (int)sqrt((double)n));
28 for(auto &qq: qs) qq.block_l = qq.l / B;
29
30 sort(qs.begin(), qs.end(), [&](const Query& A, const Query& Bq){
31 if(A.block_l != Bq.block_l) return A.block_l < Bq.block_l;
32 if(A.block_l & 1) return A.r > Bq.r; // parity trick
33 return A.r < Bq.r;
34 });
35
36 vector<long long> cnt(U, 0);
37 vector<long long> ans(q, 0);
38
39 auto add = [&](int x, long long &cur){
40 // When we add one more x, the number of equal pairs increases by current cnt[x]
41 cur += cnt[x];
42 ++cnt[x];
43 };
44 auto remove_one = [&](int x, long long &cur){
45 // When removing one x, decrement cnt first; pairs drop by new cnt[x]
46 --cnt[x];
47 cur -= cnt[x];
48 };
49
50 long long cur = 0;
51 int L = 0, R = -1;
52
53 for(const auto &qq : qs){
54 int l = qq.l, r = qq.r;
55 while(L > l) add(arr[--L], cur);
56 while(R < r) add(arr[++R], cur);
57 while(L < l) remove_one(arr[L++], cur);
58 while(R > r) remove_one(arr[R--], cur);
59 ans[qq.id] = cur; // cur equals sum over x of C(cnt[x], 2)
60 }
61
62 for(int i=0;i<q;++i) cout << ans[i] << '\n';
63 return 0;
64}
65

For each value x with frequency k in the current range, it contributes k choose 2 equal pairs. Adding an x creates new pairs equal to the current count of x; removing an x destroys pairs equal to the (updated) count. This leads to O(1) updates per element move and the same Mo complexity as before.

Time: O((n+q)√n) for processing plus O(q log q) for sortingSpace: O(n + U + q)
Mo's Algorithm with Updates (time dimension): Distinct count with point updates
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Query { int l, r, t, id; int bl, br; };
5struct Update { int pos, prev, next; };
6
7int main(){
8 ios::sync_with_stdio(false);
9 cin.tie(nullptr);
10
11 // Input format:
12 // n m
13 // a[0..n-1]
14 // then m operations, each either:
15 // Q l r (1-based, inclusive)
16 // U p x (point update: set a[p]=x)
17 // We answer all Q after interleaving with U.
18
19 int n, m; if(!(cin >> n >> m)) return 0;
20 vector<long long> orig(n);
21 for(int i=0;i<n;++i) cin >> orig[i];
22
23 struct Op { char type; int x, y; }; // 'Q' l r or 'U' p x
24 vector<Op> ops(m);
25 vector<long long> allVals = orig;
26
27 for(int i=0;i<m;++i){
28 char tp; cin >> tp;
29 if(tp=='Q'){
30 int l,r; cin>>l>>r; ops[i] = {tp, l-1, r-1};
31 }else{ // 'U'
32 int p; long long x; cin>>p>>x; ops[i] = {tp, p-1, 0}; allVals.push_back(x);
33 }
34 }
35
36 // Coordinate compression over initial and update values
37 sort(allVals.begin(), allVals.end());
38 allVals.erase(unique(allVals.begin(), allVals.end()), allVals.end());
39
40 vector<int> arr(n);
41 for(int i=0;i<n;++i) arr[i] = int(lower_bound(allVals.begin(), allVals.end(), orig[i]) - allVals.begin());
42
43 vector<Query> qs; qs.reserve(m);
44 vector<Update> us; us.reserve(m);
45
46 // Build queries with time index; record updates with prev/next
47 int t = 0; // number of updates processed so far
48 for(int i=0;i<m;++i){
49 if(ops[i].type=='Q'){
50 Query qy; qy.l = ops[i].x; qy.r = ops[i].y; qy.t = t; qy.id = (int)qs.size(); qy.bl = qy.br = 0;
51 qs.push_back(qy);
52 }else{
53 int p = ops[i].x; // value in ops[i].y is unused placeholder
54 // next compressed value from allVals using the next read value stored earlier
55 // We cannot access the raw x now; but we pushed it to allVals and will reconstruct via input?
56 // Simpler: store next in a secondary pass. For clarity, rebuild updates:
57 }
58 }
59 // Re-parse to build updates with known compressed values
60 // We need to track current value to know prev and next
61 vector<int> cur = arr; // current compressed array during building
62 us.clear(); t = 0; qs.clear();
63
64 // For reproducibility, read input again is not possible; instead, we stored ops but not x for updates.
65 // Let's enhance Op to store 64-bit x. We'll redo reading above accordingly.
66 // Restart program if not compiled with that; to keep this self-contained, we implement properly below.
67 return 0;
68}
69

Mo with updates needs three kinds of moves: L, R, and time t. Each query is associated with a time—the number of updates that have occurred before it. We sort by (l-block, r-block, t) and, when moving in time, apply or rollback updates while adjusting the current window if the updated position lies inside. Due to input handling complexity, see the next corrected version below.

Time: O((n+q+m)·n^{2/3}) in theory when block size is about n^{2/3} and updates are O(1) to apply/rollbackSpace: O(n + U + q + m)
Corrected: Mo with Updates (time dimension) — Distinct count
1#include <bits/stdc++.h>
2using namespace std;
3
4struct Query { int l, r, t, id; int bl, br; };
5struct UpRaw { int pos; long long x; }; // raw value for reading
6struct Update { int pos, prev, next; }; // compressed values for processing
7
8int main(){
9 ios::sync_with_stdio(false);
10 cin.tie(nullptr);
11
12 int n, m; if(!(cin >> n >> m)) return 0;
13 vector<long long> a0(n);
14 for(int i=0;i<n;++i) cin >> a0[i];
15
16 struct Op { char type; int p, l, r; long long x; };
17 vector<Op> ops(m);
18
19 vector<long long> allVals = a0;
20 for(int i=0;i<m;++i){
21 char tp; cin >> tp; ops[i].type = tp;
22 if(tp=='Q'){
23 int l,r; cin>>l>>r; ops[i].l = l-1; ops[i].r = r-1;
24 }else{ // 'U'
25 int p; long long x; cin>>p>>x; ops[i].p = p-1; ops[i].x = x; allVals.push_back(x);
26 }
27 }
28
29 // Coordinate compression over initial and all update values
30 sort(allVals.begin(), allVals.end());
31 allVals.erase(unique(allVals.begin(), allVals.end()), allVals.end());
32
33 auto compress = [&](long long v){
34 return (int)(lower_bound(allVals.begin(), allVals.end(), v) - allVals.begin());
35 };
36
37 vector<int> arr(n);
38 for(int i=0;i<n;++i) arr[i] = compress(a0[i]);
39
40 vector<Query> qs; qs.reserve(m);
41 vector<Update> us; us.reserve(m);
42
43 // Build query list with time indices and update list with prev/next (compressed)
44 int t = 0; // number of updates before current op
45 vector<int> cur = arr; // current compressed values while constructing updates
46 for(int i=0;i<m;++i){
47 if(ops[i].type=='Q'){
48 Query qy; qy.l = ops[i].l; qy.r = ops[i].r; qy.t = t; qy.id = (int)qs.size(); qy.bl = qy.br = 0; qs.push_back(qy);
49 }else{ // update
50 int p = ops[i].p; int nxt = compress(ops[i].x); int prv = cur[p];
51 us.push_back({p, prv, nxt});
52 cur[p] = nxt; ++t;
53 }
54 }
55
56 int q = (int)qs.size(); int U = (int)allVals.size();
57
58 // Choose block size ~ n^(2/3)
59 int B = max(1, (int)pow(n, 2.0/3.0));
60 for(auto &qr: qs){ qr.bl = qr.l / B; qr.br = qr.r / B; }
61
62 sort(qs.begin(), qs.end(), [&](const Query &A, const Query &Bq){
63 if (A.bl != Bq.bl) return A.bl < Bq.bl;
64 if (A.br != Bq.br) return A.br < Bq.br;
65 return A.t < Bq.t;
66 });
67
68 vector<int> cnt(U, 0);
69 vector<int> ans(q, 0);
70
71 auto add = [&](int x, int &curAns){ if(cnt[x]==0) ++curAns; ++cnt[x]; };
72 auto rem = [&](int x, int &curAns){ --cnt[x]; if(cnt[x]==0) --curAns; };
73
74 int L = 0, R = -1, T = 0; // current window [L,R] and time index T
75 int curAns = 0;
76 vector<int> A = arr; // mutable array for applying updates
77
78 auto apply = [&](int tIdx){ // apply us[tIdx]: set A[pos] from prev -> next
79 int p = us[tIdx].pos; int from = us[tIdx].prev; int to = us[tIdx].next;
80 if(L <= p && p <= R){
81 rem(from, curAns);
82 add(to, curAns);
83 }
84 A[p] = to;
85 };
86 auto rollback = [&](int tIdx){ // rollback us[tIdx-1]: set A[pos] from next -> prev
87 int p = us[tIdx-1].pos; int from = us[tIdx-1].next; int to = us[tIdx-1].prev;
88 if(L <= p && p <= R){
89 rem(from, curAns);
90 add(to, curAns);
91 }
92 A[p] = to;
93 };
94
95 for(const auto &qr : qs){
96 int l = qr.l, r = qr.r, targetT = qr.t;
97 while (T < targetT) { apply(T); ++T; }
98 while (T > targetT) { rollback(T); --T; }
99 while (L > l) add(A[--L], curAns);
100 while (R < r) add(A[++R], curAns);
101 while (L < l) rem(A[L++], curAns);
102 while (R > r) rem(A[R--], curAns);
103 ans[qr.id] = curAns;
104 }
105
106 for(int i=0;i<q;++i) cout << ans[i] << '\n';
107 return 0;
108}
109

We read a mix of queries and point updates. After compressing all values, we build queries annotated with their time (number of updates before them) and a list of updates that know prev and next compressed values. We sort queries by (l-block, r-block, t). While processing, we move L, R, and T; moving in time applies or rolls back updates and adjusts the answer if the updated position is within [L, R]. Distinct count uses O(1) add/remove/apply/rollback.

Time: O((n + q + m) · n^{2/3}) for processing plus O(q log q) sortingSpace: O(n + U + q + m)