⚙️ Algorithm · Intermediate

Complexity Analysis Quick Reference

Key Points

  • Use an operation budget of about 10^8 simple operations per second on typical online judges; always multiply by the time limit and number of test files if known.
  • For n ≤ 20–25, O(2^n) is usually fine; for n ≈ 500–5000, target O(n^2); for n ≈ 10^5–10^6, aim for O(n \log n) or O(n).
  • Beware of hidden constants: std::map/std::set are log-time but with large constants; prefer sorting plus vectors when appropriate.
  • Bitsets and word-level operations can give a ~64× speedup for dense boolean DP and adjacency intersections.
  • Amortized costs (e.g., vector push_back, DSU) are safe if the input is not adversarial; the worst case of hash tables can still be bad.
  • When multiple test cases exist, the total input size n across all cases is what matters for complexity.
  • Cache locality and memory layout can make an O(n) solution with poor locality slower than a well-written O(n \log n) one.
  • Always check constraints, pick the tightest safe complexity, and code with low-constant-factor C++ patterns and fast I/O.

Prerequisites

  • Asymptotic notation (O, \Theta, \Omega): to reason about growth rates and compare algorithms independent of machine specifics.
  • Logarithms and exponentials: to interpret O(n \log n) and O(2^n), and to apply change of base when analyzing recurrences.
  • STL containers and their complexities: to choose between vector, map, set, and unordered_map based on performance characteristics.
  • Basic dynamic programming: to recognize when DP states are feasible and when bitset packing can accelerate boolean DP.
  • Divide and conquer: to understand O(n \log n) patterns like sorting and binary search and to use the Master Theorem.
  • Amortized analysis: to safely use data structures like vector and DSU within time limits.
  • Fast I/O in C++: to avoid I/O becoming the bottleneck when your algorithmic complexity is tight.
  • Bit operations: to implement and reason about bitset optimizations and 64-bit word packing.

Detailed Explanation


01 Overview

Complexity analysis is your speedometer for algorithms: it estimates how long your code will run as the input size grows. In competitive programming, you must quickly choose an approach that fits strict time limits—often around one second—given the problem’s constraints. A practical quick reference links input sizes (n) to safe complexities (like O(n), O(n \log n), O(n^2)), acknowledging about 10^8 simple operations per second on standard judges. This rule of thumb helps decide whether brute force, divide-and-conquer, dynamic programming, or graph algorithms are feasible. However, Big-O is only half the story. Hidden constants, memory access patterns, and data-structure overheads can make or break a solution. For instance, std::map has O(\log n) operations but higher constants than sorting a vector once; bitset operations can slash boolean DP runtime by 64×; and poorly localized memory can sink performance despite a theoretically fast complexity. Another key nuance is amortized vs worst-case time: push_back on vectors is usually O(1) amortized, but hash tables can degrade if adversarial inputs trigger collisions. This quick reference converts constraints into safe targets, highlights when to use certain STL containers, and flags common pitfalls (multiple test cases, I/O, cache effects). With it, you’ll build an instinct for picking the right tool fast, writing C++ that both compiles and completes in time.

02 Intuition & Analogies

Imagine you are packing for a trip with a hard luggage weight limit. Each algorithm is like a set of clothes: some are light (O(n)), others bulky (O(n^2)), and some are huge winter coats (O(2^n), O(n!)). The problem constraints are the weight limit. If you bring too many heavy items, you exceed the limit and get stopped at the gate (TLE). Complexity analysis is the quick mental calculation that keeps your suitcase under the limit. You also have space-saving tricks: rolling clothes tightly (bitsets) fits more into the same space and reduces weight (constant factors). Choosing between a backpack with many small pockets (std::map) and a simple duffel (vector + sort) matters: the backpack keeps things organized but costs time to access each pocket, while the duffel is fast if you only need to dump, sort, and go. Similarly, a travel checklist (amortized analysis) says, “On average, packing is quick,” even if occasionally you need to rearrange everything (rehash or resize), which takes longer. Multiple flights (test cases) mean the total luggage across all flights matters, not the weight per flight. If the airline allows the sum of luggage to be 20 kg across all flights, you can split it unevenly; your plan must account for the total, not just per flight. Finally, airport layout (cache) influences your speed: walking straight through a corridor (contiguous vector access) is faster than zigzagging across terminals (pointer-heavy structures). Internal logistics (branch prediction, memory locality) can turn a theoretically fine plan into a practical delay or speedup.

03 Formal Definition

Time complexity measures the number of elementary steps an algorithm performs as a function of input size n, typically expressed in asymptotic notation. For an algorithm with operation count f(n), we say it runs in O(g(n)) time if there exist constants c > 0 and n_0 such that f(n) ≤ c·g(n) for all n ≥ n_0. Space complexity measures the additional memory used beyond the input. In competitive programming, we map asymptotic bounds to runtime budgets under a hardware model where roughly 10^8 simple operations per second is a workable estimate. Given a time limit of T seconds, an implementation-specific constant c (instructions per high-level operation), and operation count f(n), a rough feasibility condition is c·f(n) ≤ 10^8·T. The constant c depends on language, optimization level, data-structure overhead, and memory behavior. Commonly used complexity classes: O(1), O(\log n), O(n), O(n \log n), O(n^2), O(n^3), O(2^n), O(n!), and polynomial-time variants like O(n^k). For randomized or amortized structures, expected time E[f(n)] or an amortized cost per operation is used. With multiple test cases of sizes n_1, …, n_T, the total work is typically \sum f(n_i), and constraints often guarantee \sum n_i ≤ N, allowing overall bounds like O(N \log N).
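
A worked instance of the feasibility condition, with an assumed (illustrative) constant of c ≈ 5 machine instructions per high-level operation and T = 1 s, at n = 2·10^5:

f(n) = n \log_2 n ≈ 2·10^5 × 17.6 ≈ 3.5·10^6
c·f(n) ≈ 1.8·10^7 ≤ 10^8·T = 10^8  →  an O(n \log n) approach fits comfortably
n^2 = 4·10^10 ≫ 10^8  →  a quadratic approach does not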

04 When to Use

Use this quick reference the moment you read the constraints. If n ≤ 20–25, consider exhaustive search (bitmask DP, backtracking) in O(2^n) or even O(n!) with pruning. If n ≈ 500–5000, quadratic methods (O(n^2)) or cubic with small constants (matrix multiplication, Floyd–Warshall for n ≤ 500) can pass. If n ≈ 10^5–10^6, target O(n) or O(n \log n) with cache-friendly code and low-overhead containers. For boolean DP or dense adjacency intersections, consider bitsets to achieve an effective O(n·S/64) cost. Prefer std::vector + sort + unique for tasks like counting distinct elements, frequency compression, or offline processing; it is often faster than std::map due to lower constant factors and contiguous memory. Use std::unordered_map for hash-based counting when online insertion/lookup is needed, but mitigate rehashing by reserving capacity and be aware of adversarial worst cases. For union–find operations or dynamic arrays, amortized analyses make them safe in practice. Apply multiple-test-case budgeting by verifying that the sum of n across all cases fits the target complexity. Combine this with fast I/O (sync_with_stdio(false), cin.tie(nullptr)) and careful memory allocation (reserve, shrink_to_fit when needed).
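
Because the paragraph above leans on the amortized guarantees of union–find, here is a minimal, generic DSU sketch (union by size plus path halving, a common variant of path compression), not tied to any particular problem; its amortized cost per operation is O(\alpha(n)), effectively constant.

#include <numeric>
#include <utility>
#include <vector>

// Disjoint Set Union with union by size and path halving.
// Amortized cost per operation is O(alpha(n)), effectively constant.
struct DSU {
    std::vector<int> parent, sz;

    explicit DSU(int n) : parent(n), sz(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }

    int find(int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]]; // path halving keeps trees shallow
            v = parent[v];
        }
        return v;
    }

    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (sz[a] < sz[b]) std::swap(a, b);
        parent[b] = a;
        sz[a] += sz[b];
        return true;
    }
};

int main() {
    DSU dsu(5);
    dsu.unite(0, 1);
    dsu.unite(3, 4);
    // 0 and 1 are now connected; 2 stays in its own set.
    return dsu.find(0) == dsu.find(1) ? 0 : 1;
}

Keeping trees shallow (by size) and shortening paths on every find is what makes the near-constant amortized bound hold in practice.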

⚠️ Common Mistakes

  • Ignoring total input size: optimizing per test case but forgetting that \sum n across T cases may exceed your time budget.
  • Overusing heavy containers: replacing a single sort + linear scan with thousands of std::map operations; the latter's O(\log n) per operation and pointer chasing hurt cache locality, inflating constants.
  • Trusting average-case blindly: using std::unordered_map without reserve, causing frequent rehashes, or being vulnerable to worst-case collisions on adversarial data.
  • Misjudging DP dimensions: designing an O(n·S) DP with S too large (e.g., 10^7), causing TLE or MLE; not using bitset or word packing to cut the work by 64×.
  • Missing the amortized/worst-case nuance: assuming push_back is always O(1) but forgetting occasional resizes, or assuming hashing is always O(1) even under crafted inputs.
  • Poor memory access patterns: jumping through pointers and linked structures leads to cache misses; a theoretically better complexity loses to a simpler, contiguous approach.
  • Underestimating I/O: printing debug output or flushing often; neglecting fast I/O in tight loops.
  • Overflow in operation counts: using 32-bit ints for counts when n^2 can exceed 2·10^9; not bounding loops, causing hidden quadratic behavior (see the snippet after this list).
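
A minimal sketch of the overflow pitfall from the last bullet; the values are illustrative, and the point is simply that operation-count estimates should be computed in 64-bit arithmetic.

#include <iostream>

int main() {
    int n = 200000; // e.g., n = 2e5 elements

    // Wrong: `int ops = n * n;` multiplies in 32-bit and overflows (4e10 > 2^31).
    // Right: promote before multiplying so the count is computed in 64-bit.
    long long ops = 1LL * n * n; // ~4e10 comparisons for a naive O(n^2) approach

    const long long budget = 100000000LL; // ~1e8 simple operations per second
    std::cout << "estimated ops: " << ops << '\n';
    std::cout << (ops <= budget ? "fits a 1 s limit" : "too slow for 1 s") << '\n';
    return 0;
}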

Key Formulas

Big-O Definition

f(n) = O(g(n)) if there exist constants c > 0 and n_0 such that f(n) ≤ c·g(n) for all n ≥ n_0

Explanation: Big-O provides an upper bound on growth rate. It formalizes that beyond some n0, f(n) grows no faster than a constant multiple of g(n).

Operation Budget

c·f(n) ≤ 10^8 · T, where T is the time limit in seconds and c the constant factor per high-level operation

Explanation: Estimate how many primitive operations you can perform under a typical 1e8 ops/sec model. Adjust for the judge and language.

Divide-and-Conquer Recurrence

T(n) = a·T(n/b) + f(n)

Explanation: General form for recursive algorithms splitting into a subproblems, each of size n/b, plus combine cost f(n). Use the Master Theorem.

Merge Sort Recurrence

T(n) = 2·T(n/2) + O(n) = O(n \log n)

Explanation: Sorting by dividing into halves; merging costs linear time per level across \log n levels, giving n \log n total.

Stirling's Approximation

n! ≈ \sqrt{2\pi n} · (n/e)^n

Explanation: Approximates factorial growth to gauge feasibility of O(n!) solutions. It grows faster than exponential and becomes infeasible quickly.

Exponential Growth

2^n subsets of an n-element set (2^20 ≈ 10^6, 2^25 ≈ 3.4·10^7)

Explanation: Shows how doubling per element leads to explosive growth. Feasible for n up to about 20–25 in practice.

Sum Over Test Cases

\sum n_i \log n_i ≤ (\sum n_i) \log(\sum n_i) = N \log N, where N = \sum n_i

Explanation: If the total size across tests is bounded, the total cost of sorting each case is bounded by sorting the concatenation.

Bitset DP Complexity

O(n·S) reduces to O(n·⌈S/64⌉)

Explanation: Packing 64 states per 64-bit word reduces naive O(nS) boolean DP by a factor of 64 due to word-level parallelism.

Hash Table Costs

O(1) expected per insert/lookup; O(n) worst case under heavy collisions

Explanation: Expected-time constant lookups rely on good hashing and load factor; worst case degenerates to linear with collisions.

DSU Amortized Time

O(\alpha(n)) amortized per operation

Explanation: Union–find with path compression and union by rank runs in near-constant time per operation, where \alpha is the inverse Ackermann function.

Root-Decomposition Heuristic

Block size B ≈ \sqrt{n}, giving roughly O((n + q)·\sqrt{n}) total work for n elements and q queries

Explanation: Appears in techniques like Mo’s algorithm, trading off block size to balance updates and queries.

Change of Base

\log_a n = \log_b n / \log_b a

Explanation: Changing the logarithm base only changes constant factors, so O(\log_2 n) and O(\log_{10} n) are the same asymptotically.
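
When applying the budget formulas above by hand or in a quick checker, a floor \log_2 helper is convenient; a minimal portable sketch (the value of n in main is illustrative):

#include <cstdint>
#include <iostream>

// floor(log2(n)) for n >= 1, via a simple shift loop (no C++20 <bit> needed).
int floor_log2(uint64_t n) {
    int k = 0;
    while (n >>= 1) ++k;
    return k;
}

int main() {
    // Rough per-element work for an O(n log n) algorithm at n = 2*10^5.
    uint64_t n = 200000;
    std::cout << "floor(log2(n)) = " << floor_log2(n) << '\n';   // 17
    std::cout << "n * log2(n) ~ " << n * floor_log2(n) << '\n';  // ~3.4e6 ops
    return 0;
}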

Complexity Analysis

Mapping constraints to complexities uses an ops-per-second budget. A conservative rule is about 10^8 simple operations per second in C++ with -O2. Thus, with a 1 s limit: O(n) can handle n ≈ 10^8 (only with tiny constants and linear scans), O(n \log n) comfortably fits n ≈ 10^5–10^6, O(n^2) fits n ≈ 5·10^3, O(n^3) fits n ≈ 500, and exhaustive O(2^n) caps out around n ≈ 20–25. For factorial O(n!), n ≈ 10 is already near the edge. Hidden constants shift these thresholds significantly. Tree-based structures (std::map/std::set) have O(\log n) operations but incur pointer chasing and cache misses. Hash tables (std::unordered_map) are O(1) expected but can trigger rehashing unless you reserve capacity, and adversarial inputs can degrade them to O(n) per operation. Boolean DP benefits from word-level parallelism: bitset or manual 64-bit packing cuts O(n·S) to O(n·S/64). Memory access patterns dominate performance: contiguous arrays outperform linked structures. For multiple test cases, always consider total size: if \sum n_i ≤ N, an O(n \log n) algorithm over each case totals ≲ O(N \log N). Use fast I/O (ios::sync_with_stdio(false), cin.tie(nullptr)), preallocate with reserve, and avoid per-iteration allocations. In practice, prefer simple data flows (sort + scan), bitsets for dense boolean problems, and amortized-friendly structures (vector, DSU).

Code Examples

Budgeting across multiple test cases (sum of n) with sort+unique
#include <bits/stdc++.h>
using namespace std;

// Reads T test cases; in each, reads n and an array, then outputs the number of distinct values.
// Total time is O(\sum n_i log n_i) ≈ O(N log N), safe if \sum n_i ≤ 2e5..5e5.
int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int T;
    if (!(cin >> T)) return 0;
    long long totalN = 0;
    while (T--) {
        int n;
        cin >> n;
        totalN += n; // budget on total, not per test case
        vector<int> a(n);
        for (int i = 0; i < n; ++i) cin >> a[i];

        // O(n log n) with excellent constants and cache locality
        sort(a.begin(), a.end());
        int distinct = int(unique(a.begin(), a.end()) - a.begin());
        cout << distinct << '\n';
    }
    // Note: We don't print totalN; it's here to emphasize budgeting over all tests.
    return 0;
}


This program demonstrates planning for multiple test cases by relying on the sum of n across all cases. Sorting each case individually leads to a total cost of \sum n_i log n_i, which is bounded by N log N when \sum n_i = N. Using vector + sort + unique provides low constants and good cache behavior.

Time: O(\sum n_i \log n_i) (typically summarized as O(N \log N))
Space: O(1) extra beyond input (in-place sort)

Subset sum with manual 64-bit word bitset (≈64× speedup)
#include <bits/stdc++.h>
using namespace std;

// OR-with-left-shift over a manually packed bitset: dp |= (dp << shift).
// The bitset is a vector<uint64_t>; bit s of the packed array means "sum s is achievable".
static void or_shift_left(vector<uint64_t>& dp, int S, int shift) {
    if (shift < 0) return;
    const int W = (S + 64) / 64; // number of 64-bit words covering sums 0..S
    const int word = shift / 64;
    const int bits = shift % 64;
    vector<uint64_t> tmp(W, 0ull);
    for (int i = W - 1; i >= 0; --i) {
        uint64_t acc = 0ull;
        int j = i - word; // source word index for the left shift
        if (j >= 0) {
            acc |= dp[j] << bits;
            if (bits != 0 && j - 1 >= 0) acc |= dp[j - 1] >> (64 - bits);
        }
        tmp[i] = acc;
    }
    for (int i = 0; i < W; ++i) dp[i] |= tmp[i];
    // Mask off bits beyond S in the last word
    int lastBits = (S % 64) + 1; // valid bits in the last word
    if (lastBits != 64) dp[W - 1] &= (1ull << lastBits) - 1ull;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n, S;
    if (!(cin >> n >> S)) return 0;
    vector<int> w(n);
    for (int i = 0; i < n; ++i) cin >> w[i];

    const int W = (S + 64) / 64;
    vector<uint64_t> dp(W, 0ull);
    dp[0] |= 1ull; // sum 0 is achievable

    for (int x : w) {
        if (x > S) continue; // shifting beyond S has no effect
        or_shift_left(dp, S, x);
    }

    bool ok = (dp[S / 64] >> (S % 64)) & 1ull;
    cout << (ok ? "YES" : "NO") << '\n';
    return 0;
}


This is a boolean subset-sum DP using manual 64-bit word packing. Each OR-with-shift processes 64 states per machine word, reducing O(n·S) to O(n·S/64). It illustrates the constant-factor speedup that makes an otherwise borderline DP pass within time limits.

Time: O(n · S / 64)
Space: O(S / 64)
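
As a compact alternative to the manual packing above, std::bitset provides the same word-level parallelism when an upper bound on S is known at compile time; a sketch under the assumed constraint S ≤ 10^5 (MAXS is an illustrative bound, not taken from a specific problem):

#include <bits/stdc++.h>
using namespace std;

// Subset sum with std::bitset: bit s of dp is 1 iff some subset sums to s.
// Each dp |= dp << w processes 64 states per machine word, so the naive
// O(n*S) boolean DP runs in roughly O(n*S/64) word operations.
int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const int MAXS = 100001; // assumed compile-time bound: S <= 1e5
    int n, S;
    if (!(cin >> n >> S)) return 0;

    bitset<MAXS> dp;
    dp[0] = 1; // the empty subset sums to 0
    for (int i = 0; i < n; ++i) {
        int w;
        cin >> w;
        if (w <= S) dp |= dp << w; // add w to every reachable sum
    }
    cout << (dp[S] ? "YES" : "NO") << '\n';
    return 0;
}

The trade-off versus manual packing: std::bitset is shorter and harder to get wrong, but its size must be a compile-time constant, while the vector<uint64_t> version sizes itself to the input.
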
Distinct elements: map vs unordered_map vs sort+unique
#include <bits/stdc++.h>
using namespace std;

int distinct_sort_unique(vector<int> a) {
    sort(a.begin(), a.end());
    return int(unique(a.begin(), a.end()) - a.begin());
}

int distinct_map(const vector<int>& a) {
    map<int, int> mp; // ordered tree
    for (int x : a) mp[x]++;
    return (int)mp.size();
}

int distinct_unordered_map(const vector<int>& a) {
    unordered_map<int, int> mp; // hash table
    mp.reserve(a.size() * 2); // avoid rehashes
    mp.max_load_factor(0.7f);
    for (int x : a) mp[x]++;
    return (int)mp.size();
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    int n;
    if (!(cin >> n)) return 0;
    vector<int> a(n);
    for (int i = 0; i < n; ++i) cin >> a[i];

    int d1 = distinct_sort_unique(a);
    int d2 = distinct_map(a);
    int d3 = distinct_unordered_map(a);

    // All should match; the methods have different constants and guarantees.
    cout << d1 << '\n' << d2 << '\n' << d3 << '\n';
    return 0;
}


Three ways to count distinct elements. sort+unique is O(n log n) with low constants and great cache locality; std::map is O(n log n) but slower due to pointer-heavy trees; std::unordered_map is O(n) expected if you reserve capacity, but worst-case can degrade. Choose based on needs and constraints.

Time: sort+unique: O(n log n); map: O(n log n); unordered_map: O(n) expected, O(n^2) worst case
Space: sort+unique: O(1) extra; map/unordered_map: O(u), where u is the number of distinct elements
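
Since the unordered_map variant above can degrade on adversarial inputs, a common remedy is a randomized custom hash; the splitmix64-style functor below is one widely used pattern, sketched here under the assumption that keys fit in 64-bit integers.

#include <bits/stdc++.h>
using namespace std;

// Randomized hash based on splitmix64; the runtime seed makes it hard to
// pre-compute keys that all collide against a fixed hash function.
struct SplitMix64Hash {
    static uint64_t splitmix64(uint64_t x) {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }
    size_t operator()(uint64_t x) const {
        static const uint64_t seed =
            chrono::steady_clock::now().time_since_epoch().count();
        return splitmix64(x + seed);
    }
};

int main() {
    unordered_map<long long, int, SplitMix64Hash> freq;
    freq.reserve(1 << 16);       // pre-size to avoid rehashes
    freq.max_load_factor(0.7f);

    for (long long x : {3LL, 1LL, 3LL, 7LL}) freq[x]++;
    cout << freq.size() << '\n'; // 3 distinct keys
    return 0;
}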