Problem Classification Patterns
Key Points
- Many competitive programming problems map to a small set of classic patterns; recognizing keywords and constraints lets you pick the right tool fast.
- Phrases like "minimize the maximum" or "maximize the minimum" usually mean binary search on the answer over a monotonic predicate.
- Counting subarrays or pairs with a property often suggests two pointers, prefix-sum + contribution, or Mo's algorithm if queries are offline.
- Shortest path depends on edge weights: use BFS for unweighted, Dijkstra for non-negative weights, and Bellman-Ford for graphs with negative edges.
- All-pairs shortest path is either Floyd–Warshall for dense/small graphs or n times single-source shortest path for sparse/large graphs.
- Tree path queries typically need LCA for ancestry questions, HLD for path aggregates, or centroid decomposition for divide-and-conquer on trees.
- If you see optimal substructure or overlapping subproblems, design a DP; small n (like n ≤ 20) suggests bitmask DP or brute force.
- Constraints guide complexity: n ≤ 10^5 often requires O(n log n); many queries with moderate n suggest precompute O(n^2), answer O(1).
Prerequisites
- Asymptotic complexity (Big-O) – To match constraints to feasible algorithms and reason about time/space.
- Basic graph representations – To apply BFS, Dijkstra, Bellman–Ford, and Floyd–Warshall properly.
- Binary search fundamentals – Binary search on answer relies on monotone predicates and careful boundaries.
- Prefix sums and sliding window – Counting subarrays/pairs depends on these for O(n) solutions.
- Tree basics – To use LCA, HLD, and other tree decompositions for path queries.
- Dynamic programming basics – To recognize optimal substructure and design states/transitions.
- String preprocessing (KMP/Z/hash) – To choose the right matcher for pattern search tasks.
- Modular arithmetic and hashing – To implement rolling hashes safely and avoid collisions/overflow.
- Bit operations and GF(2) – To understand linear basis and XOR-related problems.
- Problem reading and constraint analysis – To extract signals (keywords, sizes) that drive classification.
Detailed Explanation
01 Overview
Problem classification patterns are a mental toolkit for turning problem statements into algorithm choices. Instead of starting from scratch, you read the keywords and constraints, match them to known families (search, graph, DP, data structures, string algorithms, number theory), and immediately narrow the solution space. For example, "minimize the maximum time" signals a monotonic decision function and thus binary search on the answer; "shortest path with non-negative weights" points directly to Dijkstra; and "many range updates/queries" suggests a segment tree or Fenwick tree. These associations are not random; they arise because many problem types share structural properties: monotonicity, optimal substructure, or local-to-global aggregation. The goal is to use constraints to predict viable time/space bounds and pick algorithms that fit. For instance, n ≤ 10^5 rules out O(n^2) in most cases, while n ≤ 20 invites bitmask DP. By mastering a handful of patterns and their telltale signs, you can prototype a solution plan quickly, write correct code faster, and avoid dead ends. This approach is especially useful in contests like Codeforces (1200–1600 rating) where problems are often crafted to be recognized and solved with a standard technique plus one key twist.
02 Intuition & Analogies
Think of problem solving like choosing the right tool from a toolbox. If you need to turn a screw, you reach for a screwdriver, not a hammer. In the same way, the phrase "maximize the minimum distance" is a special screw head that fits the "binary search on answer" screwdriver, because as the required distance gets larger, the feasibility only decreases; this monotonic behavior is what the screwdriver grips. Counting subarrays with a property is like scanning a shelf with a sliding magnifying glass: if all items are non-negative and the total size under the magnifier gets too big, you slide the left edge to shrink it; if it's small enough, you expand the right edge and count all new valid windows at once. That's the two-pointers technique. For routes on a map, the right pathfinding method depends on the roads. If every road is equally long (unweighted), the shortest path is just the fewest turns: use BFS. If roads have different but non-negative lengths, use Dijkstra to always pick the next closest unexplored spot, like choosing the nearest unvisited store on an errand run. If some roads are "downhill" with negative gain, you need Bellman–Ford's repeated relaxation to correctly account for these special shortcuts. When a task asks questions along a path in a tree, imagine tracing wires from one node to another. LCA tells you where two wires meet; Heavy-Light Decomposition breaks a long cable into a few thick segments so you can measure or update quickly. If a problem's pieces fit together optimally from smaller pieces, that's the LEGO-like optimal substructure of DP. Finally, constraints are the instruction manual: small n lets you try all combinations; large n demands near-linear solutions; lots of queries with modest preprocessing hint at computing a big table once and answering instantly later.
03 Formal Definition
04 When to Use
Use binary search on the answer when the set of feasible answers is monotone with respect to a threshold: "minimize maximum," "maximize minimum," or "can we achieve value X?" Use two pointers when maintaining a sliding invariant over a sequence with non-negative contributions (e.g., sum ≤ S, at most K distinct). For offline counting with complex add/remove effects, use Mo's algorithm. Use BFS for shortest paths in unweighted graphs; use Dijkstra for non-negative weighted graphs; use Bellman–Ford (or SPFA carefully) when negative edges exist or when you need negative cycle detection. For all-pairs shortest paths, use Floyd–Warshall on small/dense graphs (n ≤ ~500) or run n times SSSP on sparse graphs. For trees with path queries or updates, use LCA for ancestry and distances, Heavy-Light Decomposition to reduce path queries to O(log^2 n) segment queries, or centroid decomposition for divide-and-conquer over distances/paths. Apply DP when the problem exhibits optimal substructure and overlapping subproblems. For n ≤ 20–25 with subset states, bitmask DP often fits. If constraints say n ≤ 1000 and queries ≤ 10^5, precompute O(n^2) answers and reply in O(1). For range updates/queries, pick segment tree or Fenwick tree; for strings, pick KMP/Z/rolling hash/suffix array depending on matching type; for XOR/maximization under XOR, use a linear basis over GF(2).
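As a concrete instance of the bitmask-DP pattern, here is a minimal TSP-style sketch: dp[mask][u] is the cheapest way to visit exactly the cities in mask and end at city u. The function name and the 4-city cost matrix are illustrative, not from a specific problem.

#include <bits/stdc++.h>
using namespace std;

// Bitmask DP (TSP-style): minimal cost to start at city 0 and visit all n
// cities. State: (visited subset, last city); n <= 20 keeps 2^n * n states manageable.
long long tspMinCost(const vector<vector<long long>>& cost) {
    int n = (int)cost.size();
    const long long INF = (1LL << 60);
    vector<vector<long long>> dp(1 << n, vector<long long>(n, INF));
    dp[1][0] = 0; // only city 0 visited, standing at city 0
    for (int mask = 1; mask < (1 << n); ++mask) {
        for (int u = 0; u < n; ++u) {
            if (dp[mask][u] == INF || !(mask & (1 << u))) continue;
            for (int v = 0; v < n; ++v) {
                if (mask & (1 << v)) continue; // v already visited
                int nmask = mask | (1 << v);
                dp[nmask][v] = min(dp[nmask][v], dp[mask][u] + cost[u][v]);
            }
        }
    }
    long long ans = INF;
    for (int u = 0; u < n; ++u) ans = min(ans, dp[(1 << n) - 1][u]);
    return ans;
}

int main() {
    // Hypothetical 4-city cost matrix.
    vector<vector<long long>> cost = {
        {0, 1, 10, 10},
        {1, 0, 2, 10},
        {10, 2, 0, 3},
        {10, 10, 3, 0}
    };
    cout << tspMinCost(cost) << "\n"; // 0 -> 1 -> 2 -> 3 costs 6
    return 0;
}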
⚠️ Common Mistakes
- For binary search on the answer, forgetting to prove monotonicity of the feasibility predicate leads to wrong answers; also mixing up mid adjustment (l = mid vs l = mid + 1) causes infinite loops or off-by-one errors.
- Using two pointers when the maintained property is not monotone (e.g., sums with negative numbers) breaks correctness; instead consider prefix sums with a hashmap (see the sketch after this list) or other techniques.
- Running Dijkstra with negative edges yields incorrect results; use Bellman–Ford instead. Conversely, using Bellman–Ford on large graphs without need is too slow.
- Applying Floyd–Warshall on n ≈ 2000 will time out due to O(n^3) time and O(n^2) memory; prefer repeated SSSP on sparse graphs.
- For tree path queries, recomputing paths per query is too slow; missing LCA or HLD yields O(n) per query instead of O(log n).
- Designing DP without defining states and transitions precisely creates double counting or missing cases; also forgetting base cases or iteration order can lead to using uninitialized values.
- For range queries, using a naive array with per-query O(n) work instead of a segment tree/Fenwick tree will not pass; also mixing 0/1-based indexing in BIT is a common bug.
- In string matching, incorrect rolling-hash mod or base handling causes collisions or overflow; not building prefix-function correctly in KMP leads to wrong borders.
- Misreading constraints: n ≤ 10^5 usually implies O(n log n) or O(n); attempting O(n^2) often TLEs. Memory limits also matter: avoid O(n^2) arrays when n is large.
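As promised above, here is a minimal sketch of the prefix-sum + hashmap technique that replaces two pointers when negative numbers break monotonicity; the sample array and target are illustrative.

#include <bits/stdc++.h>
using namespace std;

// Count subarrays with sum == K, correct even with negative numbers.
// sum(L..R) == K  <=>  prefix[R] - prefix[L-1] == K, so for each prefix
// value p we add the number of earlier prefixes equal to p - K.
long long countSubarraysWithSumK(const vector<long long>& a, long long K) {
    unordered_map<long long, long long> seen; // prefix value -> occurrences
    seen[0] = 1; // empty prefix
    long long prefix = 0, ans = 0;
    for (long long x : a) {
        prefix += x;
        auto it = seen.find(prefix - K);
        if (it != seen.end()) ans += it->second;
        seen[prefix]++;
    }
    return ans;
}

int main() {
    vector<long long> a = {1, -1, 2, -2, 3}; // illustrative values
    cout << countSubarraysWithSumK(a, 0) << "\n"; // Expected 3
    return 0;
}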
Key Formulas
Binary search iterations: O(log R)
Explanation: Binary search over an answer range of size R takes O(log R) iterations. Each iteration halves the remaining search interval.
Binary search on answer total time: O(n log R)
Explanation: If each feasibility check runs in O(n), combining with O(log R) iterations yields O(n log R) total time.
Floyd–Warshall recurrence: dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
Explanation: The shortest path between i and j using only intermediate vertices up to k is either the previous best or a path via k. Iterate k from 1 to n.
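A minimal sketch of this recurrence as the standard triple loop; it assumes dist starts as the weight matrix, with INF for missing edges and 0 on the diagonal.

#include <bits/stdc++.h>
using namespace std;

// Floyd-Warshall: after the loops, dist[i][j] is the shortest path length.
// The INF guard avoids overflow when adding two "no edge" values.
void floydWarshall(vector<vector<long long>>& dist) {
    int n = (int)dist.size();
    const long long INF = (1LL << 60);
    for (int k = 0; k < n; ++k)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < n; ++j)
                if (dist[i][k] < INF && dist[k][j] < INF)
                    dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j]);
}

int main() {
    const long long INF = (1LL << 60);
    // Illustrative 3-vertex graph: 0->1 (4), 1->2 (1), 0->2 (10).
    vector<vector<long long>> dist = {
        {0, 4, 10},
        {INF, 0, 1},
        {INF, INF, 0}
    };
    floydWarshall(dist);
    cout << dist[0][2] << "\n"; // 5, going via vertex 1
    return 0;
}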
BFS time complexity: O(n + m)
Explanation: Breadth-First Search visits each vertex and edge at most once, so it runs in time linear in the size of the graph.
Dijkstra with heap: O((n + m) log n)
Explanation: Using a binary-heap priority queue, Dijkstra's algorithm processes each vertex once and relaxes each edge with a logarithmic queue operation.
Bellman–Ford time: O(nm)
Explanation: Bellman–Ford relaxes all m edges over n−1 rounds, which is O(nm), and can detect negative cycles in an extra pass.
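A minimal Bellman–Ford sketch with the extra relaxation pass used for cycle detection; the edge list in main is illustrative.

#include <bits/stdc++.h>
using namespace std;

struct E { int u, v; long long w; };

// Bellman-Ford from source s: up to n-1 rounds of relaxing every edge; if the
// n-th round still relaxes something, a negative cycle is reachable from s.
pair<vector<long long>, bool> bellmanFord(int n, const vector<E>& edges, int s) {
    const long long INF = (1LL << 60);
    vector<long long> dist(n, INF);
    dist[s] = 0;
    bool negCycle = false;
    for (int round = 0; round < n; ++round) {
        bool changed = false;
        for (const auto& e : edges) {
            if (dist[e.u] < INF && dist[e.u] + e.w < dist[e.v]) {
                dist[e.v] = dist[e.u] + e.w;
                changed = true;
            }
        }
        if (!changed) break;                 // converged early
        if (round == n - 1) negCycle = true; // still relaxing on the extra pass
    }
    return {dist, negCycle};
}

int main() {
    // Illustrative graph with a negative edge but no negative cycle.
    vector<E> edges = {{0, 1, 4}, {1, 2, -2}, {0, 2, 5}};
    auto [dist, neg] = bellmanFord(3, edges, 0);
    cout << dist[2] << (neg ? " (negative cycle)" : "") << "\n"; // Expected 2
    return 0;
}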
Floyd–Warshall time: O(n^3)
Explanation: The triple loop over all k, i, j pairs leads to cubic time; memory is O(n^2) for the distance matrix.
Mo's algorithm complexity: O((n + q)√n)
Explanation: Sorting queries into blocks of size √n, the right pointer moves O(n) per block and the left pointer O(√n) per query, for O((n + q)√n) add/remove operations across all queries.
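A minimal Mo's-algorithm sketch for offline "how many distinct values in a[l..r]?" queries; the array and query list are illustrative.

#include <bits/stdc++.h>
using namespace std;

// Mo's algorithm: sort queries by (block of l, then r) and move two pointers,
// maintaining value counts so add/remove are O(1) each.
int main() {
    vector<int> a = {1, 2, 1, 3, 2, 2};                  // illustrative array
    vector<pair<int,int>> qs = {{0, 2}, {1, 5}, {0, 5}}; // inclusive 0-indexed ranges
    int n = (int)a.size(), q = (int)qs.size();
    int block = max(1, (int)sqrt((double)n));

    vector<int> order(q);
    iota(order.begin(), order.end(), 0);
    sort(order.begin(), order.end(), [&](int x, int y) {
        int bx = qs[x].first / block, by = qs[y].first / block;
        if (bx != by) return bx < by;
        return qs[x].second < qs[y].second;
    });

    unordered_map<int,int> cnt;
    int distinct = 0;
    auto add = [&](int i) { if (cnt[a[i]]++ == 0) distinct++; };
    auto rem = [&](int i) { if (--cnt[a[i]] == 0) distinct--; };

    vector<int> ans(q);
    int L = 0, R = -1; // current window [L..R], initially empty
    for (int id : order) {
        auto [ql, qr] = qs[id];
        while (R < qr) add(++R); // expand first so the window stays valid
        while (L > ql) add(--L);
        while (R > qr) rem(R--);
        while (L < ql) rem(L++);
        ans[id] = distinct;
    }
    for (int i = 0; i < q; ++i) cout << ans[i] << "\n"; // Expected 2, 3, 3
    return 0;
}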
KMP prefix function: π[i] = length of the longest proper prefix of s[0..i] that is also a suffix of s[0..i]
Explanation: The prefix function measures the longest border of each prefix and is computed in O(n) by reusing previously matched borders. It enables linear-time string matching.
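A minimal prefix-function sketch, together with the common p + '#' + t concatenation trick for finding all matches; the pattern and text are illustrative.

#include <bits/stdc++.h>
using namespace std;

// pi[i] = length of the longest proper prefix of s[0..i] that is also its
// suffix. Built in O(|s|) by falling back through shorter borders.
vector<int> prefixFunction(const string& s) {
    int n = (int)s.size();
    vector<int> pi(n, 0);
    for (int i = 1; i < n; ++i) {
        int k = pi[i - 1];
        while (k > 0 && s[i] != s[k]) k = pi[k - 1]; // shrink to a shorter border
        if (s[i] == s[k]) k++;
        pi[i] = k;
    }
    return pi;
}

int main() {
    string p = "aba", t = "ababa"; // illustrative pattern and text
    vector<int> pi = prefixFunction(p + "#" + t);
    for (int i = 0; i < (int)pi.size(); ++i)
        if (pi[i] == (int)p.size()) // full pattern matched here
            cout << "match starting at text index " << i - 2 * (int)p.size() << "\n";
    return 0; // Expected: matches at text indices 0 and 2
}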
Sprague–Grundy: g(x) = mex{ g(y) : y is reachable from x in one move }
Explanation: The Grundy number of a game position is the minimal excluded non-negative integer among its successors' values. The XOR of independent games' Grundy numbers decides the winner.
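A minimal Grundy-number sketch for a subtraction game where a move takes 1, 2, or 3 stones from a pile; the pile sizes are illustrative.

#include <bits/stdc++.h>
using namespace std;

// g(x) = mex of the Grundy numbers of positions reachable from x.
// For independent piles, XOR the per-pile values: nonzero => first player wins.
int main() {
    int maxX = 10;
    vector<int> g(maxX + 1, 0);
    for (int x = 1; x <= maxX; ++x) {
        set<int> succ;
        for (int take = 1; take <= 3 && take <= x; ++take)
            succ.insert(g[x - take]);
        int mex = 0;
        while (succ.count(mex)) mex++; // minimal excluded non-negative integer
        g[x] = mex;
    }
    int total = g[3] ^ g[5] ^ g[7]; // illustrative piles of 3, 5, 7
    cout << (total != 0 ? "first player wins" : "second player wins") << "\n";
    return 0;
}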
Linear basis bound: |B| ≤ ⌊log2 M⌋ + 1
Explanation: For integers up to M, the size of a GF(2) basis of their bit-vectors is bounded by the number of bits, which limits basis operations.
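A minimal GF(2) linear-basis sketch for the classic "maximum XOR of a subset" task; the struct name and sample values are illustrative.

#include <bits/stdc++.h>
using namespace std;

// Linear basis over GF(2): b[i] holds the basis vector whose highest set bit
// is i (or 0 if none). insert reduces x against the basis; maxXor greedily
// builds the largest achievable XOR.
struct XorBasis {
    array<unsigned long long, 64> b{};
    void insert(unsigned long long x) {
        for (int i = 63; i >= 0; --i) {
            if (!((x >> i) & 1)) continue;
            if (!b[i]) { b[i] = x; return; }
            x ^= b[i]; // reduce and continue with lower bits
        }
        // x reduced to 0: it was linearly dependent on the basis
    }
    unsigned long long maxXor() const {
        unsigned long long res = 0;
        for (int i = 63; i >= 0; --i) res = max(res, res ^ b[i]);
        return res;
    }
};

int main() {
    XorBasis basis;
    for (unsigned long long v : {3ULL, 10ULL, 5ULL}) basis.insert(v); // illustrative values
    cout << basis.maxXor() << "\n"; // Expected 15 (10 ^ 5)
    return 0;
}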
DP cost rule of thumb: total time ≈ (number of states) × (work per state)
Explanation: Once you count how many states you have and the work per state, you can estimate the DP's overall time complexity.
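For instance, the bitmask TSP sketch earlier has 2^n · n states with O(n) work per state, for O(2^n · n^2) total; at n = 20 that is roughly 10^6 · 20 · 20 ≈ 4 × 10^8 elementary steps, close to the limit of a typical time budget.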
Complexity Analysis
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Problem: Given weights of packages and D days, find the minimal ship capacity
// so that all packages are shipped in order within D days.
// Pattern: "Minimize maximum" -> Binary search on answer over capacity.

bool feasibleCapacity(const vector<int>& w, int D, long long cap) {
    long long curr = 0; // current load for the day
    int days = 1;       // start with day 1
    for (int x : w) {
        if (x > cap) return false; // capacity too small to carry single item
        if (curr + x <= cap) {
            curr += x;
        } else {
            days++;
            curr = x;
            if (days > D) return false; // too many days needed
        }
    }
    return true;
}

long long minCapacityToShip(const vector<int>& w, int D) {
    long long lo = 0, hi = 0;
    for (int x : w) {
        lo = max<long long>(lo, x); // lower bound: at least the max item
        hi += x;                    // upper bound: sum of all items
    }
    // Invariant: answer in [lo, hi]
    while (lo < hi) {
        long long mid = lo + (hi - lo) / 2; // candidate capacity
        if (feasibleCapacity(w, D, mid)) {
            hi = mid;     // try to minimize further
        } else {
            lo = mid + 1; // need more capacity
        }
    }
    return lo;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, D;
    // Example input:
    // 5 3
    // 1 2 3 4 5
    if (!(cin >> n >> D)) {
        // Fallback demo
        vector<int> w = {1, 2, 3, 4, 5};
        cout << minCapacityToShip(w, 3) << "\n"; // Expected 6
        return 0;
    }
    vector<int> w(n);
    for (int i = 0; i < n; ++i) cin >> w[i];
    cout << minCapacityToShip(w, D) << "\n";
    return 0;
}
We binary-search the minimal capacity C such that shipping is feasible in D days. The feasibility check simulates packing in order: add items until exceeding C, then start a new day. The predicate is monotone: if capacity C works, any C' > C also works, enabling binary search.
#include <bits/stdc++.h>
using namespace std;

// Count subarrays with sum <= S when all numbers are non-negative.
// Pattern: "Count subarrays with property" + non-negative -> two pointers.

long long countAtMostSum(const vector<long long>& a, long long S) {
    int n = (int)a.size();
    long long ans = 0, sum = 0;
    int L = 0;
    for (int R = 0; R < n; ++R) {
        sum += a[R];
        // Shrink from left until invariant holds
        while (L <= R && sum > S) {
            sum -= a[L++];
        }
        // All subarrays ending at R and starting in [L..R] are valid
        ans += (R - L + 1);
    }
    return ans;
}

// If you need exactly sum == K on non-negative arrays, you can count
// at_most(K) - at_most(K-1). For general integers, use prefix sums + hashmap.

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n; long long S;
    // Example input:
    // 5 7
    // 2 1 3 2 4
    if (!(cin >> n >> S)) {
        vector<long long> a = {2, 1, 3, 2, 4};
        cout << countAtMostSum(a, 7) << "\n"; // Expected 11
        return 0;
    }
    vector<long long> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i];
    cout << countAtMostSum(a, S) << "\n";
    return 0;
}
Maintain a sliding window [L..R] with sum ≤ S. When the sum exceeds S, move L right until the invariant is restored. For each R, there are (R−L+1) valid subarrays ending at R. This works because non-negative elements ensure monotonicity of the window sum.
#include <bits/stdc++.h>
using namespace std;

struct Edge { int to; long long w; };

vector<long long> bfs_unweighted(const vector<vector<int>>& g, int s) {
    int n = (int)g.size();
    const long long INF = (1LL << 60);
    vector<long long> dist(n, INF);
    queue<int> q;
    dist[s] = 0; q.push(s);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int v : g[u]) {
            if (dist[v] == INF) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
    return dist;
}

vector<long long> dijkstra(const vector<vector<Edge>>& g, int s) {
    int n = (int)g.size();
    const long long INF = (1LL << 60);
    vector<long long> dist(n, INF);
    using P = pair<long long,int>; // (dist, node)
    priority_queue<P, vector<P>, greater<P>> pq;
    dist[s] = 0; pq.push({0, s});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d != dist[u]) continue; // lazy deletion
        for (auto e : g[u]) {
            if (dist[e.to] > d + e.w) {
                dist[e.to] = d + e.w;
                pq.push({dist[e.to], e.to});
            }
        }
    }
    return dist;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, m; int s;
    // Input format:
    // n m s
    // m lines: u v w (0-indexed). If all w == 1, BFS is used; else Dijkstra.
    if (!(cin >> n >> m >> s)) {
        // Fallback demo: unweighted triangle
        vector<vector<int>> gU(3);
        gU[0] = {1, 2}; gU[1] = {2}; gU[2] = {};
        auto d = bfs_unweighted(gU, 0);
        for (auto x : d) cout << (x >= (1LL << 50) ? -1 : x) << ' ';
        cout << '\n';
        return 0;
    }
    bool allOne = true;
    vector<tuple<int,int,long long>> edges;
    edges.reserve(m);
    for (int i = 0; i < m; ++i) {
        int u, v; long long w; cin >> u >> v >> w;
        edges.emplace_back(u, v, w);
        if (w != 1) allOne = false;
    }

    if (allOne) {
        vector<vector<int>> g(n);
        for (auto [u, v, w] : edges) g[u].push_back(v);
        auto dist = bfs_unweighted(g, s);
        for (int i = 0; i < n; ++i) cout << (dist[i] >= (1LL << 50) ? -1 : dist[i]) << (i + 1 == n ? '\n' : ' ');
    } else {
        // Ensure no negative weights for Dijkstra
        for (auto [u, v, w] : edges) if (w < 0) { cerr << "Negative edge detected. Use Bellman-Ford.\n"; return 0; }
        vector<vector<Edge>> g(n);
        for (auto [u, v, w] : edges) g[u].push_back({v, w});
        auto dist = dijkstra(g, s);
        for (int i = 0; i < n; ++i) cout << (dist[i] >= (1LL << 50) ? -1 : dist[i]) << (i + 1 == n ? '\n' : ' ');
    }
    return 0;
}
We pick BFS when all edges have weight 1 and Dijkstra otherwise (with the precondition that all weights are non-negative). BFS computes level distances in O(n+m). Dijkstra uses a priority queue to always expand the closest node, handling arbitrary non-negative weights.