🗂️Data StructureAdvanced

Suffix Array

Key Points

•
A suffix array stores the starting indices of all suffixes of a string in lexicographic order.
•
The classic O(n $lo g$ n) construction uses the doubling method and radix sort on pairs of ranks (rank[i], rank[i+2^k]).
•
The LCP array stores the longest common prefix lengths between adjacent suffixes in the suffix array and can be built in O(n) with Kasai's algorithm.
•
Pattern matching can be done in O(m $lo g$ n) by binary searching over the suffix array, comparing the pattern with selected suffixes.
•
The number of distinct substrings equals n(n+1)/2 minus the sum of all LCP values.
•
You can answer LCP(i, j) of any two suffixes using an RMQ (e.g., Sparse Table) over the LCP array in O(1) time after O(n $lo g$ n) preprocessing.
•
Suffix arrays are memory-light alternatives to suffix trees and are competitive for many string problems on large inputs.
•
Careful handling of ranks, stability in counting sort, and off-by-one indices are crucial for correctness and performance.

Prerequisites

→String basics and indexing — Suffixes, substrings, and lexicographic comparisons rely on understanding how strings are indexed and compared.
→Asymptotic analysis (Big-O) — To analyze O(n log n) construction, O(n) LCP, and O(m log n) queries.
→Sorting algorithms — Suffix array construction depends on stable sorting by integer keys; understanding stability is critical.
→Counting sort and radix sort — Efficient O(n) per round sorting of rank pairs requires integer sorting with stability.
→Binary search — Pattern matching on SA uses lower_bound/upper_bound logic on sorted suffixes.
→Range Minimum Query (RMQ) / Sparse Table — To compute LCP of arbitrary suffix pairs from the LCP array in O(1) queries.
→Basic discrete math — Counting total substrings n(n+1)/2 and reasoning about sums like Σ LCP.
→Memory management with vectors/arrays in C++ — Implementations use multiple linear-size arrays and temporary buffers efficiently.

Detailed Explanation

Tap terms for definitions

01Overview

A suffix array is a compact data structure for strings that lists the starting positions of all suffixes in lexicographic (dictionary) order. Imagine all endings of a word, sorted alphabetically: this ordering encodes powerful information about repeated patterns and how substrings relate. The suffix array alone enables binary search over suffixes, which lets us find patterns in O(m log n). When paired with the LCP (Longest Common Prefix) array, it supports deeper analyses such as counting distinct substrings, identifying repeating patterns, and computing common prefixes quickly. Construction can be done via the doubling algorithm in O(n log n) time by sorting indices based on rank pairs (current 2^k prefix and the next 2^k block) using radix/counting sort. After building the suffix array, Kasai’s algorithm computes the LCP array in O(n) time by reusing information from previous comparisons. Together, the suffix array and LCP array offer a practical and efficient alternative to heavier structures like suffix trees, with simpler implementation and lower memory usage, making them popular in competitive programming and text processing applications.

02Intuition & Analogies

Think of a book index that lists words alphabetically and tells you where they appear. A suffix array plays a similar role for all endings of a string: for a string s, it sorts s[0..], s[1..], s[2..], …, s[n-1..]. Why suffixes? Any substring is a prefix of some suffix, so if we can binary search over suffixes, we can locate any substring. For example, in the word "banana", the suffixes are banana, anana, nana, ana, na, a. Sorting them gives a, ana, anana, banana, na, nana, and the suffix array is their starting positions [5, 3, 1, 0, 4, 2]. Once sorted, if you want to find "ana", you compare it with the middle suffix, decide whether to go left or right, and keep narrowing, just like looking up a word in a dictionary. The LCP array is like the shared prefix between neighboring dictionary entries—if you know that two adjacent entries share many starting letters, you can avoid re-comparing those letters repeatedly. The doubling construction builds this ordering layer by layer: first by 1-character ranks, then by 2-character ranks, then 4, 8, and so on. At each step, suffixes that are equal on the first 2^k characters get the same rank; then we refine with the next 2^k block. Using counting sort to sort by these small integer ranks makes each layer linear, and since there are O(log n) layers, the total time is O(n log n).

03Formal Definition

Given a string s of length n over an ordered alphabet, its suffix array SA is a permutation of [0, 1, ..., n-1] such that s[SA[0]..n-1] < s[SA[1]..n-1] < ... < s[SA[n-1]..n-1] in lexicographic order. The LCP array is an array LCP of length n-1 where LCP[i] = lcp(s[SA[i]..], s[SA[i+1]..]) is the length of the longest common prefix between adjacent suffixes in SA. In the doubling algorithm, we maintain an array

r an k^{(k)}

such that

r an k^{(k)}

[i] is the equivalence class of the prefix s[i..i+2^k-1], where equality means identical substrings. We sort suffix indices by pairs (

r an k^{(k)}

[i],

r an k^{(k)}

[i+2^k]); stability ensures lexicographic tie-breaking by the second key. After sorting, we assign new classes

r an k^{(k + 1)}

. The process stops when all ranks are unique (i.e., the number of classes equals n) or when 2^

k \geq n

. Kasai’s algorithm builds LCP in linear time by leveraging the inverse permutation rank where rank[SA[i]] = i: it maintains a sliding window length

h = L CP [i - 1] - 1

and extends matches forward, ensuring each character is compared at most twice overall.

04When to Use

Use a suffix array when you need fast, memory-efficient substring operations on a static string. Typical use cases include: exact pattern matching across a large text with many queries (O(m log n) per query); counting or enumerating distinct substrings (using the relation with the LCP sum); finding the longest repeated substring (max of LCP); computing similarity between regions via LCP queries; lexicographic tasks such as finding the k-th lexicographic suffix or substring; and serving as a base for text indexing structures like the Burrows–Wheeler Transform and FM-index. If you have many online updates to the string, suffix arrays are not ideal because rebuilding is costly; consider dynamic structures instead. If you need the absolute fastest single-query matching and can afford more preprocessing and memory, suffix trees or suffix automata may be more suitable. If queries require LCP(i, j) across arbitrary positions, augment the LCP array with RMQ (e.g., Sparse Table) to get O(1) query time after O(n log n) preprocessing.

⚠️Common Mistakes

Common pitfalls include: 1) Incorrect stability in the radix/counting sort passes when sorting by pairs—always sort by the second key first, then by the first, both stably; otherwise the order is wrong. 2) Mishandling ranks for out-of-bounds offsets during doubling—use a sentinel rank like -1 and shift by +1 when indexing counts so that out-of-range compares as smaller than any valid rank. 3) Off-by-one errors in LCP: note that LCP has length n-1, and LCP[i] compares SA[i] with SA[i+1]. 4) Forgetting to decrement the running match length h in Kasai’s algorithm; this breaks the linear-time guarantee. 5) Using std::string comparisons or substrings directly inside binary search, which can introduce extra O(log n) factors or hidden copies—compare character-by-character without constructing substrings. 6) Mixing 0-based and 1-based indices in outputs, especially when translating from textbook definitions using a sentinel; stay consistent. 7) Overflow when counting substrings—use 64-bit integers because n(n+1)/2 can exceed 32-bit range. 8) Assuming Unicode code-point order equals lexicographic order for natural language; for competitive programming, stick to byte/ASCII order unless specified.

Key Formulas

Suffix Array Ordering

SA = perm (0, 1, \dots, n - 1) such that s [SA [0] ..] < s [SA [1] ..] < \dots < s [SA [n - 1] ..]

Explanation: The suffix array is a permutation of all starting positions that sorts the corresponding suffixes lexicographically. This formalizes the definition of SA.

LCP Definition

LCP [i] = max {k ∣ s [SA [i] .. SA [i] + k - 1] = s [SA [i + 1] .. SA [i + 1] + k - 1]}

Explanation: LCP[i] is the length of the longest common prefix between adjacent suffixes in the suffix array. It measures how similar neighboring suffixes are at the start.

SA Doubling Time

T (n) = O (n lo g n)

Explanation: The doubling construction does O( $lo g$ n) rounds, each taking O(n) with radix sort on integer ranks. Hence the total time is O(n log n).

Kasai's Time

L CP_Kasai (n) = O (n)

Explanation: Kasai's algorithm computes the entire LCP array in linear time by amortizing character comparisons across the string.

Total Substrings

# substr (s) = i = 1 \sum n i = \frac{n ( n + 1 )}{2}

Explanation: There are n choices for the end and up to n for the start, which simplifies to n(n+1)/2 total substrings counting duplicates.

Distinct Substrings via LCP

# distinct (s) = \frac{n ( n + 1 )}{2} - i = 0 \sum n - 2 LCP [i]

Explanation: The overlaps captured by LCP subtract out duplicated prefixes when counting unique substrings. Summing LCP removes exactly the shared prefixes of adjacent suffixes.

LCP of Two Suffixes via RMQ

LCP (i, j) = t \in [m i n (r_{i}, r_{j}), m a x (r_{i}, r_{j}) - 1] min LCP [t], r_{x} = rank [x]

Explanation: The LCP between any two suffixes equals the minimum LCP across the interval between their positions in SA. This reduces to a range minimum query on the LCP array.

Pattern Match Time

Q (m, n) = O (m lo g n)

Explanation: Binary searching over n suffixes makes O(log n) comparisons, each comparing up to m characters, giving O(m log n) per query.

Space for SA+LCP

S (n) = O (n)

Explanation: Both SA and LCP are arrays of length n (LCP is n-1), and ranks are also O(n). Additional buffers for counting sort are linear.

Complexity Analysis

The doubling construction builds the suffix array in O(n log n) time. Each round k sorts indices by rank pairs (rank[i], rank[i+2^k]). Since ranks are integers in [0, n), a pairwise radix sort can be done with two stable counting sorts in O(n) time per round. The number of rounds is at most ⌈log2 n⌉ until 2^

k \geq n

or all ranks become unique, yielding total time O(n log n). The counting sort uses O(n +

σ)

space where σ is the alphabet size (usually σ ≤ 256), so memory is O(n). After SA is built, Kasai’s algorithm computes the LCP array in O(n) time and O(n) space, amortizing character comparisons by reusing the previous LCP value minus one when moving to the next suffix. Pattern matching with SA uses binary search: each comparison of a pattern of length m to a suffix compares up to m characters, and the search does O(log n) comparisons, hence O(m log n) time per query with O(1) extra space (besides SA and LCP). For arbitrary LCP(i, j) queries, augmenting with a Sparse Table over the LCP array requires O(n log n) preprocessing time and space, answering each query in O(1). Counting distinct substrings uses the formula n(n+1)/2 − Σ LCP[i] and runs in O(n) once SA and LCP are available. Overall, the dominant costs are SA construction (O(n log n)) and optional RMQ preprocessing (O(n log n)); memory remains linear or near-linear, making suffix arrays highly practical for large strings in competitive programming.

Code Examples

Build Suffix Array (O(n log n) via Doubling + Radix Sort) and LCP (Kasai O(n))

1 #include <bits/stdc++.h>
2 using namespace std;
3 
4 // Stable counting sort helper: sorts sa by key rnk[sa[i]+k] (with -1 mapped to 0)
5 static void counting_sort_by_key(vector<int>& sa, const vector<int>& rnk, int k) {
6     int n = (int)sa.size();
7     int maxKey = max(256, n) + 1; // +1 for the sentinel bucket
8     vector<int> cnt(maxKey, 0), sa2(n);
9     // Count occurrences
10     for (int i = 0; i < n; ++i) {
11         int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0; // -1 -> 0
12         ++cnt[key];
13     }
14     // Prefix sums for stable placement
15     for (int i = 1; i < maxKey; ++i) cnt[i] += cnt[i - 1];
16     // Build output (iterate backward to be stable)
17     for (int i = n - 1; i >= 0; --i) {
18         int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0;
19         sa2[--cnt[key]] = sa[i];
20     }
21     sa.swap(sa2);
22 }
23 
24 // Build suffix array in O(n log n) using doubling and radix sort on pairs
25 vector<int> build_suffix_array(const string& s) {
26     int n = (int)s.size();
27     vector<int> sa(n), rnk(n);
28     if (n == 0) return sa;
29     // Initial ranks by single characters (byte values)
30     for (int i = 0; i < n; ++i) {
31         sa[i] = i;
32         rnk[i] = (unsigned char)s[i];
33     }
34     for (int k = 1; k < n; k <<= 1) {
35         // Sort by second key then first key (both stably)
36         counting_sort_by_key(sa, rnk, k);
37         counting_sort_by_key(sa, rnk, 0);
38         // Recompute ranks
39         vector<int> newr(n);
40         newr[sa[0]] = 0;
41         int classes = 1;
42         for (int i = 1; i < n; ++i) {
43             int a = sa[i - 1], b = sa[i];
44             int a1 = rnk[a], b1 = rnk[b];
45             int a2 = (a + k < n) ? rnk[a + k] : -1;
46             int b2 = (b + k < n) ? rnk[b + k] : -1;
47             if (a1 != b1 || a2 != b2) ++classes;
48             newr[b] = classes - 1;
49         }
50         rnk.swap(newr);
51         if (classes == n) break; // All ranks unique
52     }
53     return sa;
54 }
55 
56 // Kasai's algorithm: LCP[i] = lcp(s[sa[i]..], s[sa[i+1]..]) for i in [0..n-2]
57 vector<int> build_lcp(const string& s, const vector<int>& sa) {
58     int n = (int)s.size();
59     vector<int> lcp(max(0, n - 1)), rank(n);
60     if (n == 0) return lcp;
61     for (int i = 0; i < n; ++i) rank[sa[i]] = i;
62     int h = 0;
63     for (int i = 0; i < n; ++i) {
64         int r = rank[i];
65         if (r == n - 1) { // last suffix; no next neighbor
66             h = 0;
67             continue;
68         }
69         int j = sa[r + 1];
70         // Extend current LCP by reusing previous value h - 1
71         while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
72         lcp[r] = h;
73         if (h > 0) --h; // Prepare for next i
74     }
75     return lcp;
76 }
77 
78 int main() {
79     ios::sync_with_stdio(false);
80     cin.tie(nullptr);
81 
82     string s;
83     if (!(cin >> s)) return 0;
84 
85     auto sa = build_suffix_array(s);
86     auto lcp = build_lcp(s, sa);
87 
88     cout << "Suffix Array (0-based):\n";
89     for (int i = 0; i < (int)sa.size(); ++i) {
90         cout << i << ": " << sa[i] << " -> " << s.substr(sa[i]) << '\n';
91     }
92 
93     cout << "\nLCP array (between SA[i] and SA[i+1]):\n";
94     for (int i = 0; i < (int)lcp.size(); ++i) {
95         cout << "LCP[" << i << "] = " << lcp[i] << '\n';
96     }
97     return 0;
98 }
99

This program constructs the suffix array using the O(n log n) doubling algorithm with two-pass stable counting sorts (radix on pairs), and then computes the LCP array using Kasai’s O(n) algorithm. It prints each suffix in sorted order and the LCP values between adjacent suffixes.

Time: O(n log n) to build SA + O(n) to build LCPSpace: O(n) additional space for SA, ranks, temporary arrays, and LCP

Exact Pattern Matching with Suffix Array (O(m log n) per query)

1 #include <bits/stdc++.h>
2 using namespace std;
3 
4 static void counting_sort_by_key(vector<int>& sa, const vector<int>& rnk, int k) {
5     int n = (int)sa.size();
6     int maxKey = max(256, n) + 1;
7     vector<int> cnt(maxKey, 0), sa2(n);
8     for (int i = 0; i < n; ++i) {
9         int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0;
10         ++cnt[key];
11     }
12     for (int i = 1; i < maxKey; ++i) cnt[i] += cnt[i - 1];
13     for (int i = n - 1; i >= 0; --i) {
14         int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0;
15         sa2[--cnt[key]] = sa[i];
16     }
17     sa.swap(sa2);
18 }
19 
20 vector<int> build_suffix_array(const string& s) {
21     int n = (int)s.size();
22     vector<int> sa(n), rnk(n);
23     if (n == 0) return sa;
24     for (int i = 0; i < n; ++i) sa[i] = i, rnk[i] = (unsigned char)s[i];
25     for (int k = 1; k < n; k <<= 1) {
26         counting_sort_by_key(sa, rnk, k);
27         counting_sort_by_key(sa, rnk, 0);
28         vector<int> newr(n);
29         newr[sa[0]] = 0;
30         int classes = 1;
31         for (int i = 1; i < n; ++i) {
32             int a = sa[i - 1], b = sa[i];
33             int a1 = rnk[a], b1 = rnk[b];
34             int a2 = (a + k < n) ? rnk[a + k] : -1;
35             int b2 = (b + k < n) ? rnk[b + k] : -1;
36             if (a1 != b1 || a2 != b2) ++classes;
37             newr[b] = classes - 1;
38         }
39         rnk.swap(newr);
40         if (classes == n) break;
41     }
42     return sa;
43 }
44 
45 // Compare suffix s[pos..] with pattern p. Return -1 if suffix < p, 0 if suffix starts with p, +1 if suffix > p.
46 int compare_suffix_with_pattern(const string& s, int pos, const string& p) {
47     int n = (int)s.size(), m = (int)p.size();
48     for (int i = 0; i < m; ++i) {
49         if (pos + i >= n) return -1; // suffix exhausted first => suffix < pattern
50         unsigned char sc = (unsigned char)s[pos + i];
51         unsigned char pc = (unsigned char)p[i];
52         if (sc < pc) return -1;
53         if (sc > pc) return +1;
54     }
55     return 0; // pattern fully matched as a prefix of suffix
56 }
57 
58 // Find the range [L, R) in SA such that all suffixes start with pattern p.
59 pair<int,int> find_occurrence_range(const string& s, const vector<int>& sa, const string& p) {
60     int n = (int)sa.size();
61     int lo = 0, hi = n; // lower_bound for first suffix >= p
62     while (lo < hi) {
63         int mid = (lo + hi) / 2;
64         int cmp = compare_suffix_with_pattern(s, sa[mid], p);
65         if (cmp >= 0) hi = mid; else lo = mid + 1;
66     }
67     int L = lo;
68     lo = 0; hi = n; // upper_bound for first suffix > p as prefix (i.e., not starting with p)
69     while (lo < hi) {
70         int mid = (lo + hi) / 2;
71         int cmp = compare_suffix_with_pattern(s, sa[mid], p);
72         if (cmp > 0) hi = mid; else lo = mid + 1;
73     }
74     int R = lo;
75     // Check if any match exists
76     if (L >= R) return {-1, -1};
77     return {L, R};
78 }
79 
80 int main() {
81     ios::sync_with_stdio(false);
82     cin.tie(nullptr);
83 
84     string text, pattern;
85     if (!(cin >> text >> pattern)) return 0;
86     auto sa = build_suffix_array(text);
87     auto range = find_occurrence_range(text, sa, pattern);
88 
89     if (range.first == -1) {
90         cout << "No occurrences found\n";
91     } else {
92         cout << "Occurrences at positions (0-based):\n";
93         for (int i = range.first; i < range.second; ++i) {
94             cout << sa[i] << (i + 1 < range.second ? ' ' : '\n');
95         }
96     }
97     return 0;
98 }
99

This program builds the suffix array for a given text and finds all occurrences of a pattern by binary searching the lexicographic range of suffixes that have the pattern as a prefix. Comparisons are done character-by-character without creating substrings, giving O(m log n) time per query.

Time: O(n log n) preprocessing, O(m log n) per pattern querySpace: O(n)

Count Distinct Substrings using SA + LCP

1 #include <bits/stdc++.h>
2 using namespace std;
3 
4 static void counting_sort_by_key(vector<int>& sa, const vector<int>& rnk, int k) {
5     int n = (int)sa.size();
6     int maxKey = max(256, n) + 1;
7     vector<int> cnt(maxKey, 0), sa2(n);
8     for (int i = 0; i < n; ++i) {
9         int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0;
10         ++cnt[key];
11     }
12     for (int i = 1; i < maxKey; ++i) cnt[i] += cnt[i - 1];
13     for (int i = n - 1; i >= 0; --i) {
14         int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0;
15         sa2[--cnt[key]] = sa[i];
16     }
17     sa.swap(sa2);
18 }
19 
20 vector<int> build_suffix_array(const string& s) {
21     int n = (int)s.size();
22     vector<int> sa(n), rnk(n);
23     if (n == 0) return sa;
24     for (int i = 0; i < n; ++i) sa[i] = i, rnk[i] = (unsigned char)s[i];
25     for (int k = 1; k < n; k <<= 1) {
26         counting_sort_by_key(sa, rnk, k);
27         counting_sort_by_key(sa, rnk, 0);
28         vector<int> newr(n);
29         newr[sa[0]] = 0;
30         int classes = 1;
31         for (int i = 1; i < n; ++i) {
32             int a = sa[i - 1], b = sa[i];
33             int a1 = rnk[a], b1 = rnk[b];
34             int a2 = (a + k < n) ? rnk[a + k] : -1;
35             int b2 = (b + k < n) ? rnk[b + k] : -1;
36             if (a1 != b1 || a2 != b2) ++classes;
37             newr[b] = classes - 1;
38         }
39         rnk.swap(newr);
40         if (classes == n) break;
41     }
42     return sa;
43 }
44 
45 vector<int> build_lcp(const string& s, const vector<int>& sa) {
46     int n = (int)s.size();
47     vector<int> lcp(max(0, n - 1)), rank(n);
48     if (n == 0) return lcp;
49     for (int i = 0; i < n; ++i) rank[sa[i]] = i;
50     int h = 0;
51     for (int i = 0; i < n; ++i) {
52         int r = rank[i];
53         if (r == n - 1) { h = 0; continue; }
54         int j = sa[r + 1];
55         while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
56         lcp[r] = h;
57         if (h) --h;
58     }
59     return lcp;
60 }
61 
62 int main() {
63     ios::sync_with_stdio(false);
64     cin.tie(nullptr);
65 
66     string s; cin >> s;
67     auto sa = build_suffix_array(s);
68     auto lcp = build_lcp(s, sa);
69 
70     long long n = (long long)s.size();
71     long long total = n * (n + 1) / 2; // total substrings (with duplicates)
72     long long overlap = 0;
73     for (int x : lcp) overlap += x;
74     long long distinct = total - overlap;
75 
76     cout << distinct << '\n';
77     return 0;
78 }
79

The number of distinct substrings equals total substrings minus duplicated prefixes captured by LCP. After building SA and LCP, we compute n(n+1)/2 − sum(LCP) using 64-bit integers to avoid overflow.

Time: O(n log n) to build SA + O(n) for LCP and summationSpace: O(n)

LCP of Any Two Suffixes via RMQ (Sparse Table over LCP)

1 #include <bits/stdc++.h>
2 using namespace std;
3 
4 static void counting_sort_by_key(vector<int>& sa, const vector<int>& rnk, int k) {
5     int n = (int)sa.size();
6     int maxKey = max(256, n) + 1;
7     vector<int> cnt(maxKey, 0), sa2(n);
8     for (int i = 0; i < n; ++i) {
9         int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0;
10         ++cnt[key];
11     }
12     for (int i = 1; i < maxKey; ++i) cnt[i] += cnt[i - 1];
13     for (int i = n - 1; i >= 0; --i) {
14         int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0;
15         sa2[--cnt[key]] = sa[i];
16     }
17     sa.swap(sa2);
18 }
19 
20 vector<int> build_suffix_array(const string& s) {
21     int n = (int)s.size();
22     vector<int> sa(n), rnk(n);
23     if (n == 0) return sa;
24     for (int i = 0; i < n; ++i) sa[i] = i, rnk[i] = (unsigned char)s[i];
25     for (int k = 1; k < n; k <<= 1) {
26         counting_sort_by_key(sa, rnk, k);
27         counting_sort_by_key(sa, rnk, 0);
28         vector<int> newr(n);
29         newr[sa[0]] = 0;
30         int classes = 1;
31         for (int i = 1; i < n; ++i) {
32             int a = sa[i - 1], b = sa[i];
33             int a1 = rnk[a], b1 = rnk[b];
34             int a2 = (a + k < n) ? rnk[a + k] : -1;
35             int b2 = (b + k < n) ? rnk[b + k] : -1;
36             if (a1 != b1 || a2 != b2) ++classes;
37             newr[b] = classes - 1;
38         }
39         rnk.swap(newr);
40         if (classes == n) break;
41     }
42     return sa;
43 }
44 
45 vector<int> build_lcp(const string& s, const vector<int>& sa) {
46     int n = (int)s.size();
47     vector<int> lcp(max(0, n - 1)), rank(n);
48     if (n == 0) return lcp;
49     for (int i = 0; i < n; ++i) rank[sa[i]] = i;
50     int h = 0;
51     for (int i = 0; i < n; ++i) {
52         int r = rank[i];
53         if (r == n - 1) { h = 0; continue; }
54         int j = sa[r + 1];
55         while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
56         lcp[r] = h;
57         if (h) --h;
58     }
59     return lcp;
60 }
61 
62 struct SparseTable {
63     // RMQ for min on LCP array
64     int n, K;
65     vector<int> lg;
66     vector<vector<int>> st;
67     SparseTable() {}
68     explicit SparseTable(const vector<int>& a) { build(a); }
69 
70     void build(const vector<int>& a) {
71         n = (int)a.size();
72         if (n == 0) return;
73         K = 32 - __builtin_clz(n); // ceil(log2(n))
74         lg.assign(n + 1, 0);
75         for (int i = 2; i <= n; ++i) lg[i] = lg[i/2] + 1;
76         st.assign(K, vector<int>(n));
77         st[0] = a;
78         for (int k = 1; k < K; ++k) {
79             int len = 1 << k;
80             int half = len >> 1;
81             for (int i = 0; i + len <= n; ++i) {
82                 st[k][i] = min(st[k-1][i], st[k-1][i + half]);
83             }
84         }
85     }
86 
87     // Query min on [l, r] inclusive
88     int query(int l, int r) const {
89         if (l > r) swap(l, r);
90         int len = r - l + 1;
91         int k = 31 - __builtin_clz(len);
92         return min(st[k][l], st[k][r - (1 << k) + 1]);
93     }
94 };
95 
96 int main() {
97     ios::sync_with_stdio(false);
98     cin.tie(nullptr);
99 
100     string s; cin >> s;
101     auto sa = build_suffix_array(s);
102     auto lcp = build_lcp(s, sa);
103 
104     int n = (int)s.size();
105     vector<int> rank(n);
106     for (int i = 0; i < n; ++i) rank[sa[i]] = i;
107 
108     SparseTable st(lcp); // RMQ over LCP
109 
110     int q; cin >> q;
111     // Each query: positions i, j (0-based). Output LCP of suffixes s[i..] and s[j..].
112     while (q--) {
113         int i, j; cin >> i >> j;
114         if (i == j) {
115             cout << (n - i) << '\n';
116             continue;
117         }
118         int ri = rank[i], rj = rank[j];
119         int L = min(ri, rj), R = max(ri, rj) - 1; // LCP indices are between SA positions
120         if ((int)lcp.size() == 0) { cout << 0 << '\n'; continue; }
121         int ans = st.query(L, R);
122         cout << ans << '\n';
123     }
124     return 0;
125 }
126

This program augments SA + LCP with a Sparse Table over the LCP array to answer LCP(i, j) for any two suffixes in O(1). The RMQ is performed on the interval between the ranks of i and j in SA.

Time: O(n log n) to build SA and the Sparse Table; O(1) per LCP querySpace: O(n log n) for the Sparse Table, O(n) for SA/LCP

1	#include <bits/stdc++.h>
2	using namespace std;
3
4	// Stable counting sort helper: sorts sa by key rnk[sa[i]+k] (with -1 mapped to 0)
5	static void counting_sort_by_key(vector<int>& sa, const vector<int>& rnk, int k) {
6	int n = (int)sa.size();
7	int maxKey = max(256, n) + 1; // +1 for the sentinel bucket
8	vector<int> cnt(maxKey, 0), sa2(n);
9	// Count occurrences
10	for (int i = 0; i < n; ++i) {
11	int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0; // -1 -> 0
12	++cnt[key];
13	}
14	// Prefix sums for stable placement
15	for (int i = 1; i < maxKey; ++i) cnt[i] += cnt[i - 1];
16	// Build output (iterate backward to be stable)
17	for (int i = n - 1; i >= 0; --i) {
18	int key = (sa[i] + k < n) ? (rnk[sa[i] + k] + 1) : 0;
19	sa2[--cnt[key]] = sa[i];
20	}
21	sa.swap(sa2);
22	}
23
24	// Build suffix array in O(n log n) using doubling and radix sort on pairs
25	vector<int> build_suffix_array(const string& s) {
26	int n = (int)s.size();
27	vector<int> sa(n), rnk(n);
28	if (n == 0) return sa;
29	// Initial ranks by single characters (byte values)
30	for (int i = 0; i < n; ++i) {
31	sa[i] = i;
32	rnk[i] = (unsigned char)s[i];
33	}
34	for (int k = 1; k < n; k <<= 1) {
35	// Sort by second key then first key (both stably)
36	counting_sort_by_key(sa, rnk, k);
37	counting_sort_by_key(sa, rnk, 0);
38	// Recompute ranks
39	vector<int> newr(n);
40	newr[sa[0]] = 0;
41	int classes = 1;
42	for (int i = 1; i < n; ++i) {
43	int a = sa[i - 1], b = sa[i];
44	int a1 = rnk[a], b1 = rnk[b];
45	int a2 = (a + k < n) ? rnk[a + k] : -1;
46	int b2 = (b + k < n) ? rnk[b + k] : -1;
47	if (a1 != b1 \|\| a2 != b2) ++classes;
48	newr[b] = classes - 1;
49	}
50	rnk.swap(newr);
51	if (classes == n) break; // All ranks unique
52	}
53	return sa;
54	}
55
56	// Kasai's algorithm: LCP[i] = lcp(s[sa[i]..], s[sa[i+1]..]) for i in [0..n-2]
57	vector<int> build_lcp(const string& s, const vector<int>& sa) {
58	int n = (int)s.size();
59	vector<int> lcp(max(0, n - 1)), rank(n);
60	if (n == 0) return lcp;
61	for (int i = 0; i < n; ++i) rank[sa[i]] = i;
62	int h = 0;
63	for (int i = 0; i < n; ++i) {
64	int r = rank[i];
65	if (r == n - 1) { // last suffix; no next neighbor
66	h = 0;
67	continue;
68	}
69	int j = sa[r + 1];
70	// Extend current LCP by reusing previous value h - 1
71	while (i + h < n && j + h < n && s[i + h] == s[j + h]) ++h;
72	lcp[r] = h;
73	if (h > 0) --h; // Prepare for next i
74	}
75	return lcp;
76	}
77
78	int main() {
79	ios::sync_with_stdio(false);
80	cin.tie(nullptr);
81
82	string s;
83	if (!(cin >> s)) return 0;
84
85	auto sa = build_suffix_array(s);
86	auto lcp = build_lcp(s, sa);
87
88	cout << "Suffix Array (0-based):\n";
89	for (int i = 0; i < (int)sa.size(); ++i) {
90	cout << i << ": " << sa[i] << " -> " << s.substr(sa[i]) << '\n';
91	}
92
93	cout << "\nLCP array (between SA[i] and SA[i+1]):\n";
94	for (int i = 0; i < (int)lcp.size(); ++i) {
95	cout << "LCP[" << i << "] = " << lcp[i] << '\n';
96	}
97	return 0;
98	}
99