State Space Reduction
Key Points
- State space reduction shrinks the number of dynamic programming or search states by keeping only the information that truly affects future decisions.
- You can replace full values with summaries like parity, remainder, or a capped bound when larger values behave the same in transitions.
- Merging equivalent states and using symmetry (canonical forms) prevents exploring many duplicate configurations.
- Coordinate compression maps large or sparse numeric domains to a small contiguous range without changing order-based logic.
- Discretization turns continuous values into a small finite set of buckets that preserves decision boundaries.
- These tricks often transform infeasible O(n × MAX) solutions into O(n²) or O(n log n) ones.
- Correctness relies on defining an equivalence relation or projection that preserves transition outcomes and objective values.
- Common pitfalls include discarding information that affects transitions, incorrect canonicalization, and off-by-one errors in compression.
Prerequisites
- Dynamic Programming Basics — State space reduction modifies DP states; understanding recurrences and state definitions is essential.
- Equivalence Relations and Functions — Formal correctness relies on projecting states and preserving values under an equivalence relation.
- Hashing and Canonicalization — Merging symmetric states or memoizing reduced states requires robust hashing of canonical forms.
- Fenwick/Segment Trees — Coordinate compression is often paired with index-based data structures for O(log n) operations.
- Order Statistics and Sorting — Compression and discretization require sorting and ranking unique values.
- Big-O Complexity Analysis — Evaluating the benefit of reduction requires comparing asymptotic costs before and after.
Detailed Explanation
01 Overview
State space reduction is a set of techniques that make dynamic programming (DP) and search algorithms practical by decreasing the number of states you must consider. Instead of tracking every detail about the past, you keep only the parts that influence future choices. For example, if a transition depends only on a value modulo k, you record the remainder, not the entire number. If very large weights have the same effect once they exceed a threshold, you cap them at that threshold. If two states are indistinguishable for all future decisions, you merge them into a single representative. In geometric or order-based problems, you can map large numeric coordinates to a small set of ranks (coordinate compression) without changing comparisons. For continuous inputs, discretization approximates the domain by a small finite grid that preserves decision outcomes.

Imagine navigating a city with only the relevant landmarks highlighted on your map: route planning becomes much easier. Conceptually, this is what state space reduction does for algorithms; it hides irrelevant details while preserving everything that matters for optimal decisions. For example, the classic knapsack problem with a huge weight capacity W but modest total value can be solved by switching the DP dimension from weight to value, slashing the state count drastically. Similarly, a subset-sum divisible-by-k problem needs only k remainders, not all sums. Both are implemented in the code examples below.
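As a small, self-contained sketch of merging indistinguishable states (the parameters n = 10 and k = 3 are assumed for illustration): to count binary strings with no k consecutive 1s, every prefix with the same trailing run of 1s behaves identically in all future transitions, so exponentially many prefixes collapse into just k states.

#include <bits/stdc++.h>
using namespace std;

// Illustrative sketch: count binary strings of length n with no k consecutive
// 1s. The entire prefix is summarized by one number, the trailing run of 1s
// (0..k-1): all prefixes with the same trailing run are merged, because they
// behave identically in every future transition. n and k are assumed values.

int main() {
    int n = 10, k = 3;
    const long long MOD = 1000000007LL;

    vector<long long> dp(k, 0); // dp[r] = #prefixes whose trailing run of 1s is r
    dp[0] = 1;                  // the empty prefix

    for (int i = 0; i < n; ++i) {
        vector<long long> ndp(k, 0);
        for (int r = 0; r < k; ++r) {
            if (!dp[r]) continue;
            ndp[0] = (ndp[0] + dp[r]) % MOD;   // append '0': run resets
            if (r + 1 < k)                     // append '1' only while run < k
                ndp[r + 1] = (ndp[r + 1] + dp[r]) % MOD;
        }
        dp.swap(ndp);
    }

    long long total = 0;
    for (long long v : dp) total = (total + v) % MOD;
    cout << total << "\n"; // 504 for n=10, k=3
    return 0;
}

Running this prints 504, the number of 10-character binary strings with no run of three 1s; the whole prefix history has been projected down to a single trailing-run counter.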
02 Intuition & Analogies
Think of packing for a trip. You don’t bring your entire wardrobe—only what affects your trip: weather-appropriate clothing and essentials. Likewise, an algorithm doesn’t need the entire past history; it can keep a short “packing list” of facts that actually affect what happens next. This short list is called a sufficient summary or projection of the state.

Another analogy is a library catalog. You could uniquely describe a book by its entire text, but for finding it, a catalog number—title, author, and a classification code—is enough. Those fields are the information that matters for the task of retrieval. Similarly, in DP we keep only what matters for transitions and objective values: maybe a remainder modulo k, or whether something is already above a threshold.

Symmetry is like recognizing two chess positions that are mirror images. If the rules and goals don’t distinguish left from right, you can flip a position to a canonical direction and treat all mirror pairs as the same. This immediately halves the number of positions.

For huge numeric values, coordinate compression is like replacing street addresses spread over a vast city with small block numbers that preserve order. If you only compare which address is earlier or later, you don’t need the exact large numbers—only their relative order. Discretization is similar: you divide a continuous dial into a small number of meaningful ticks because decisions only change when you cross certain thresholds.

In all cases, the guiding question is: what minimal information predicts every future outcome and cost? Keep that, and safely discard the rest.
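To make the mirror-image idea concrete, here is a minimal sketch (the three-token configurations are hypothetical): before memoizing or enqueuing a configuration, map it to a canonical representative, here the lexicographically smaller of the string and its reverse, so each mirror pair is explored only once.

#include <bits/stdc++.h>
using namespace std;

// Illustrative sketch: canonicalization under reflection. If a puzzle's rules
// and goal are left-right symmetric, a configuration and its mirror image are
// interchangeable, so we key the visited set by one unique representative.
// The puzzle states below are hypothetical; the point is the canonical key.

string canonical(const string& s) {
    string r(s.rbegin(), s.rend());   // mirror image
    return min(s, r);                 // unique representative of {s, reverse(s)}
}

int main() {
    unordered_set<string> visited;    // memo keyed by canonical form

    vector<string> states = {"RBG", "GBR", "RGB", "BGR"};
    for (const string& s : states) {
        if (visited.insert(canonical(s)).second)
            cout << "explore " << s << "\n";   // first member of its mirror pair
        else
            cout << "skip    " << s << " (mirror already seen)\n";
    }
    // "GBR" is skipped as the mirror of "RBG"; "BGR" as the mirror of "RGB".
    return 0;
}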
03 Formal Definition
Let S be the original state space with value function V : S → ℝ. A reduction is a projection p : S → S' onto a smaller space S', typically the quotient of S by an equivalence relation ~, where s ~ t means that s and t admit the same future transitions and costs. The reduction is correct when there exists a value function V' on S' with V(s) = V'(p(s)) for every s ∈ S: equivalent states share transitions and objective values, so solving the DP over S' yields the same optimal answers as solving it over S.
04 When to Use
Use state space reduction when a direct DP or search is infeasible due to a large or unbounded dimension, and you can identify a smaller summary that fully determines transitions.
- Large numeric ranges with order-only logic: LIS, dominance counting, or DP that compares values; compress coordinates to 1..m.
- Problems with periodicity: if transitions depend on values modulo k (remainders, parity), track mod k states only.
- Threshold effects: if costs/transitions saturate beyond a bound B (e.g., counts above n behave the same), cap at B.
- Symmetric structures: puzzles, games, or configurations invariant under permutations, rotations, reflections; canonicalize to avoid duplicates.
- Value-based DP in knapsack: when capacity W is huge but sum of values V_sum is manageable, switch DP to the value axis.
- Discretization: continuous inputs where decisions change at known breakpoints (e.g., piecewise linear cost, event times, interval endpoints); collect all critical points and compress them.
- Pruning by dominance: when states are vectors (cost, benefit, …), maintain only Pareto-optimal states; dominated states cannot improve any solution (a minimal sketch follows this list).
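A minimal sketch of the dominance pruning mentioned in the last item (the (weight, value) states are illustrative): after sorting by one coordinate, a single sweep keeps exactly the Pareto frontier.

#include <bits/stdc++.h>
using namespace std;

// Illustrative sketch of dominance pruning: keep only (weight, value) states
// that are Pareto-optimal. State a dominates state b when a.weight <= b.weight
// and a.value >= b.value, strictly better in at least one coordinate.

struct State { long long weight, value; };

vector<State> paretoFilter(vector<State> states) {
    // Sort by weight ascending, breaking ties by value descending.
    sort(states.begin(), states.end(), [](const State& a, const State& b) {
        return a.weight != b.weight ? a.weight < b.weight : a.value > b.value;
    });
    vector<State> frontier;
    long long bestValue = LLONG_MIN;
    for (const State& s : states) {
        if (s.value > bestValue) {   // not dominated by any lighter state
            frontier.push_back(s);
            bestValue = s.value;
        }
    }
    return frontier;
}

int main() {
    vector<State> states = {{3, 5}, {4, 4}, {2, 5}, {6, 9}, {5, 9}};
    for (const State& s : paretoFilter(states))
        cout << "(w=" << s.weight << ", v=" << s.value << ")\n";
    // Prints (2,5) and (5,9): (3,5) and (4,4) are dominated by (2,5),
    // and (6,9) is dominated by (5,9).
    return 0;
}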
⚠️ Common Mistakes
- Merging non-equivalent states: projecting away information that actually affects future transitions changes optimal values. Always justify that transitions and costs depend only on the projected state.
- Incorrect bounds/capping: choosing B too small when behavior for x > B is not identical. Prove monotonic saturation or equivalence beyond B.
- Incomplete canonicalization: using a canonical form that is not unique (e.g., forgetting to sort components) causes duplicate exploration or missed states. Always define and implement a consistent canonical representative.
- Off-by-one in coordinate compression: forgetting 1-based indices for Fenwick/segment trees or mishandling duplicate values, which breaks order-sensitive logic.
- Discretization granularity errors: using buckets that merge values across a decision boundary; ensure boundaries align with all critical points, e.g., all interval endpoints and event times (see the sketch after this list).
- Losing Pareto-optimal states: over-aggressive dominance pruning can remove states that are not truly dominated in all dimensions. Keep only states dominated in every relevant metric.
- Memory-time tradeoff surprises: projections like modulo k shrink states but may increase transitions per state; check both time and space after reduction.
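For the discretization-granularity pitfall above, here is a minimal sketch (the intervals are illustrative): taking bucket boundaries from the critical points themselves guarantees that no bucket straddles a point where the answer changes, so one sample per elementary segment is exact.

#include <bits/stdc++.h>
using namespace std;

// Illustrative sketch: bucket boundaries are taken from the critical points
// (all interval endpoints), so no bucket crosses a decision boundary. We
// compute the maximum overlap of half-open intervals [start, end).

int main() {
    vector<pair<double,double>> iv = {{0.5, 2.5}, {1.0, 4.0}, {2.5, 3.0}};

    // 1) Collect every critical point: all endpoints.
    vector<double> pts;
    for (auto& [l, r] : iv) { pts.push_back(l); pts.push_back(r); }
    sort(pts.begin(), pts.end());
    pts.erase(unique(pts.begin(), pts.end()), pts.end());

    // 2) The overlap count is constant on each elementary segment
    //    [pts[i], pts[i+1]), so sampling one interior point is exact.
    int best = 0;
    for (size_t i = 0; i + 1 < pts.size(); ++i) {
        double mid = (pts[i] + pts[i + 1]) / 2; // any interior point works
        int cover = 0;
        for (auto& [l, r] : iv)
            if (l <= mid && mid < r) ++cover;
        best = max(best, cover);
    }
    cout << "max overlap = " << best << "\n"; // prints 2 (on [1.0, 2.5))
    return 0;
}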
Key Formulas
Quotient Space
S' = S / ~ = { [s] : s ∈ S }
Explanation: State reduction can be formalized by partitioning the original state space S with an equivalence relation ~, producing the reduced state space S'. Each class [s] contains states that are indistinguishable for future decisions.
Value Preservation
V(s) = V'(p(s)) for all s ∈ S, i.e., V = V' ∘ p
Explanation: The reduced value function V' composed with the projection p must equal the original value function V. This ensures that projecting states does not change optimal answers.
Capping/Bounding
b(x) = min(x, B)
Explanation: When transitions and costs for all x ≥ B are identical, replace x by b(x). This reduces the number of distinct states without affecting correctness.
Value-based Knapsack Complexity
O(n · V_sum), where V_sum = Σ v_i, versus O(n · W)
Explanation: Switching from weight-indexed DP O(nW) to value-indexed DP yields time proportional to n times the sum of item values. This is advantageous when W is huge but the total value is moderate.
Remainder Transition
r' = (r + x) mod k
Explanation: If only remainders mod k matter, adding item x updates the state from remainder r to r'. This collapses infinitely many sums to k states.
Coordinate Compression Rank
rank(x) = |{ u ∈ U : u < x }| + 1
Explanation: Compression maps each value to its order rank among the unique values U, preserving all <, =, > comparisons. This allows using small-indexed structures like Fenwick trees.
State Count Bounds (Informal)
|S'| ≤ B + 1 (capping), |S'| = k (mod-k), |S'| = m (compression), |S'| = number of symmetry orbits
Explanation: After reduction, the number of states is bounded by parameters like the cap B, the modulo base k, the number of unique coordinates m, or the number of symmetry orbits; the exact bound depends on the specific projection.
Pareto-Optimal Set
P = { s ∈ S : ¬∃ t ∈ S such that t dominates s }
Explanation: Maintain only states not dominated component-wise by another state. This reduces the frontier while preserving potential optima.
Fenwick-based DP Complexity
O(n log m) time and O(m) space, with m ≤ n unique compressed coordinates
Explanation: Using a Fenwick tree over compressed coordinates yields logarithmic updates and queries per element; since m ≤ n, this gives O(n log n) time and linear space.
Triangular Number (for Summations)
T(n) = n(n + 1) / 2
Explanation: Useful for reasoning about pairwise transitions or compressed ranks; it gives the count of pairs or cumulative operations in naive approaches, to contrast with reduced solutions. For example, a naive all-pairs comparison over n elements performs T(n − 1) = n(n − 1)/2 comparisons, versus O(n log n) after compression.
Complexity Analysis
Each reduction’s payoff is the product of the smaller state count and the per-state transition cost. Value-based knapsack replaces O(nW) with O(n · V_sum); tracking remainders replaces a DP over unbounded sums with O(nk); coordinate compression paired with a Fenwick tree turns O(n · MAX) or O(n²) scans into O(n log n) time with O(n) space. After any reduction, re-check both time and space: a projection can shrink memory while increasing the transitions per state, as noted in the pitfalls above.
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Value-based DP: dp[v] = minimum weight needed to achieve total value v
// This is effective when W is huge but the sum of values is moderate.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    // Example: n=4, capacity W=10^7 (huge), values modest
    int n = 4;
    long long W = 10000000; // huge capacity
    vector<long long> w = {6, 3, 4, 2};
    vector<int> val = {6, 4, 5, 3};

    int Vsum = 0;
    for (int v : val) Vsum += v;

    const long long INF = (1LL << 60);
    vector<long long> dp(Vsum + 1, INF);
    dp[0] = 0; // 0 value costs 0 weight

    // For each item, update dp from high to low values (0/1 items)
    for (int i = 0; i < n; ++i) {
        for (int v = Vsum; v >= val[i]; --v) {
            if (dp[v - val[i]] != INF) {
                dp[v] = min(dp[v], dp[v - val[i]] + w[i]);
            }
        }
    }

    int bestValue = 0;
    for (int v = 0; v <= Vsum; ++v) {
        if (dp[v] <= W) bestValue = v; // maximum value with weight <= W
    }

    cout << "Best value achievable = " << bestValue << "\n";

    // Notes on state reduction:
    // - We replaced the weight dimension (0..W) by value (0..Vsum).
    // - If item values are small (e.g., <= 1000) and n is ~2000, Vsum is manageable.
    // - Dominance: dp keeps only the minimum weight per value, discarding worse states.

    return 0;
}
The classic O(nW) DP is infeasible when W is large. We switch axes: for each total value v, store the minimum weight needed. This compresses the state space to Vsum = Σ values. The answer is the largest v with dp[v] ≤ W. This is a direct application of bounding and merging equivalent states: for each value v, only the minimal weight matters.
#include <bits/stdc++.h>
using namespace std;

// Decide if there exists a non-empty subset whose sum is divisible by k.
// State reduction: only the remainder modulo k matters, not the full sum.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n = 6, k = 7;
    vector<int> a = {3, 1, 4, 2, 2, 1};

    vector<char> dp(k, 0), ndp(k, 0);
    // dp[r] = can we achieve remainder r with some non-empty subset so far?
    dp[0] = 0; // empty subset not counted as valid; non-emptiness enforced by construction

    for (int x : a) {
        fill(ndp.begin(), ndp.end(), 0);
        // Option 1: skip x (carry over remainders)
        for (int r = 0; r < k; ++r) if (dp[r]) ndp[r] = 1;
        // Option 2: take x alone
        ndp[x % k] = 1;
        // Option 3: add x to previously achievable remainders
        for (int r = 0; r < k; ++r) if (dp[r]) ndp[(r + x) % k] = 1;
        dp.swap(ndp);
    }

    cout << (dp[0] ? "YES" : "NO") << "\n";

    // State reduction:
    // - We tracked only k remainders instead of potentially huge sums.
    // - Transitions use r' = (r + x) mod k, which is closed on {0..k-1}.

    return 0;
}
The future transitions depend only on the sum modulo k, so the DP state can be reduced to k remainders. We build achievable remainders by either taking or skipping each number, and we also allow taking a number as the start of a subset to enforce non-emptiness. If remainder 0 becomes achievable, the answer is YES.
#include <bits/stdc++.h>
using namespace std;

// Compute the length of the Longest Increasing Subsequence (strict) using
// coordinate compression and a Fenwick tree that stores prefix maximums.

struct FenwickMax {
    int n; vector<int> bit;
    FenwickMax(int n = 0) : n(n), bit(n + 1, 0) {}
    void update(int idx, int val) {
        for (; idx <= n; idx += idx & -idx) bit[idx] = max(bit[idx], val);
    }
    int query(int idx) { // max on [1..idx]
        int res = 0;
        for (; idx > 0; idx -= idx & -idx) res = max(res, bit[idx]);
        return res;
    }
};

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    vector<int> a = {10, 1, 20, 2, 3, 30, 4, 5};
    int n = (int)a.size();

    // Coordinate compression: map values to ranks 1..m
    vector<int> vals = a;
    sort(vals.begin(), vals.end());
    vals.erase(unique(vals.begin(), vals.end()), vals.end());
    auto rank_of = [&](int x){ return (int)(lower_bound(vals.begin(), vals.end(), x) - vals.begin()) + 1; };
    int m = (int)vals.size();

    FenwickMax fw(m);
    int ans = 0;
    for (int x : a) {
        int r = rank_of(x);
        // best LIS ending with value < x is query(r-1)
        int best = fw.query(r - 1);
        fw.update(r, best + 1);
        ans = max(ans, best + 1);
    }

    cout << "LIS length = " << ans << "\n";

    // State reduction:
    // - Large values up to 1e9 are reduced to ranks 1..m via compression.
    // - Fenwick tree stores DP[r] = best LIS length ending at rank r.

    return 0;
}
LIS decisions depend only on the relative order of values, not their magnitudes. We compress values to ranks and use a Fenwick tree to maintain, for each rank, the best LIS length. Each step queries the best value for all smaller ranks and updates the current rank, achieving O(n log n) time on a compressed domain.