Amortized Analysis
Key Points
- Amortized analysis measures the average cost per operation over a worst-case sequence, not over random inputs.
- It shows that even if some operations are expensive, the total cost spread across many operations is small.
- There are three standard methods: aggregate, accounting (credits), and potential (energy) methods.
- Dynamic arrays support push_back in O(1) amortized time thanks to occasional resizing.
- A binary counter increments in O(1) amortized time even though some increments flip many bits.
- Union–Find with path compression and union by rank runs in O(m α(n)) total time for m operations, i.e., O(α(n)) amortized per operation.
- Amortized analysis is different from average-case analysis because it guarantees performance for every sequence.
- The key insight is that costly operations pay forward by making future operations cheaper.
Prerequisites
- Asymptotic notation (Big-O, Big-Theta, Big-Omega) — To interpret amortized bounds like O(1) or O(α(n)) and compare growth rates.
- Series and summations — Aggregate proofs sum costs across m operations, requiring bounds on sums.
- Basic data structures (arrays, stacks, trees) — Amortized examples often involve dynamic arrays, counters, and trees.
- Pointer-based memory and dynamic allocation in C++ — Implementing dynamic arrays and understanding resize costs needs memory management knowledge.
- Recursive functions and invariants — Path compression in Union–Find and potential method arguments rely on recursion and maintained invariants.
- Logarithms — Many bounds (e.g., splay trees, binary counters) use logarithmic terms.
- Graph basics — Union–Find is central to Kruskal’s algorithm and connectivity problems.
Detailed Explanation
Overview
Amortized analysis is a technique for understanding the performance of algorithms and data structures over sequences of operations. Instead of asking, “What is the worst time for a single operation?” it asks, “If I perform a long sequence of operations, what is the average time per operation in the worst possible sequence?” This view is powerful when occasional expensive steps are balanced by many cheap ones. For example, when a dynamic array grows, a single push may copy many elements (expensive), but such resizes happen infrequently, so most pushes are cheap. The total work over many pushes is still linear, yielding constant average cost per push.
Amortized analysis comes in three flavors. The aggregate method directly sums the total work over m operations and divides by m. The accounting method assigns an artificial (amortized) cost to each operation, storing surplus as credits to pay for future expensive operations. The potential method formalizes credits via a potential function Φ(state) that measures stored work; the amortized cost equals the actual cost plus the change in potential. Each method proves a bound that holds for every sequence, unlike average-case analysis, which relies on a distribution of inputs. Amortized guarantees are essential for data structures like dynamic arrays, binary counters, splay trees, and the union–find structure used widely in graph algorithms and ML systems.
Intuition & Analogies
Imagine running a cafeteria. Some days you wash a mountain of dishes when a rush hits; other days it’s quiet. If you save a few dollars from each sale during calm times (credits), you can pay for extra dishwashers on rush days. Over a month, your average expense per customer stays steady, even though individual days vary wildly. That’s amortized analysis: you smooth out spikes by budgeting across a sequence.
Another analogy is a rechargeable battery. You do small charges during idle periods (store potential) so that when you need a burst of power (an expensive operation), you spend that stored energy. After the burst, the battery is emptier, so the next few tasks are cheaper until you recharge again. In data structures, the state itself stores this “energy.” For a binary counter, the number of 1-bits is like stored energy: when you increment, you pay 1 to flip the lowest bit to 1, but if it was already 1 you must flip it to 0 and carry—spending stored energy. Since you “discharge” many 1s only occasionally, the total work across many increments remains low.
With dynamic arrays, think of arranging chairs in a hall. You start with 4 chairs. When more guests arrive, you move to a bigger room and rearrange the chairs once. That move is painful, but you don’t do it for every new guest—only after enough arrivals. Guests (pushes) are usually seated immediately; only occasionally do you pay the moving cost. Spreading that moving cost across many guests yields a small average per guest.
Finally, union–find is like forming clubs. Initially, everyone is in their own club (trees). Each union merges two clubs, and find queries ask for the leader. Over time, shortcuts (path compression) make everyone remember their leader, so future queries are cheaper: past effort improves the future.
Formal Definition
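A standard formal statement, consistent with the Key Formulas listed later in this article:

```latex
\textbf{Setup.} A data structure passes through states $D_0 \to D_1 \to \cdots \to D_m$,
where operation $i$ has actual cost $c_i$.

\textbf{Valid amortized costs.} Charges $\hat{c}_i$ are valid if, for every prefix $t$,
\[
  \sum_{i=1}^{t} \hat{c}_i \;\ge\; \sum_{i=1}^{t} c_i .
\]

\textbf{Potential method.} Choose $\Phi$ with $\Phi(D_0) = 0$ and $\Phi(D_i) \ge 0$, and define
\[
  \hat{c}_i \;=\; c_i + \Phi(D_i) - \Phi(D_{i-1}) .
\]
Summing telescopes to $\sum_i \hat{c}_i = \sum_i c_i + \Phi(D_m) \ge \sum_i c_i$,
so any upper bound on the amortized costs bounds the total actual cost of every sequence.
```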
When to Use
Use amortized analysis when operations come in sequences where occasional expensive steps are counterbalanced by many cheap ones, and you need guarantees that hold for every sequence. Typical scenarios include:
- Dynamic arrays (vectors) that occasionally resize; you want push_back to be effectively O(1) over many inserts.
- Binary counters or bitsets with carry chains; you want to show many increments collectively do little work.
- Union–Find (Disjoint Set Union) with path compression and union by rank/size; you need near-constant-time merges and queries across large workloads (e.g., Kruskal’s algorithm, connected-component labeling in images).
- Self-adjusting structures (e.g., splay trees) where restructuring after accesses gradually improves future costs, giving O(log n) amortized time per access without explicit balance metadata.
- Batch workloads in systems and ML pipelines, where repeated updates to shared structures (feature indices, adjacency lists, parameter groups) experience sporadic reorganization that benefits future operations.
Prefer amortized analysis over average-case when input distributions are unknown or adversarial but you still want strong per-sequence guarantees. This is common in systems, databases, and AI/ML code where worst-case single operations can be expensive (e.g., a resize), yet the overall throughput remains high.
⚠️ Common Mistakes
- Confusing amortized with average-case analysis: average-case requires a probability model over inputs; amortized holds for every sequence and spreads cost across that sequence. Avoid this by explicitly stating whether your guarantee is per-sequence (amortized) or per-random-input (average-case).
- Ignoring the nonnegativity requirement in accounting/potential methods: credits or potential must never go negative. Always verify Φ(D_i) ≥ 0 (and usually Φ(D_0)=0) to ensure your telescoping argument is valid.
- Picking an unhelpful potential: a poor Φ can make proofs messy or incorrect. Choose Φ to reflect stored work, such as number of 1-bits (binary counter) or total log subtree sizes (splay trees).
- Overstating single-operation guarantees: O(1) amortized push_back does not mean every push is O(1); occasional pushes are Θ(n). Communicate both worst-case and amortized costs clearly.
- Forgetting implementation constants: resizing factor matters. Doubling capacity gives O(1) amortized; increasing by a constant does not (it yields O(n) per push on average). Ensure growth is geometric (e.g., ×2 or ×1.5).
- Missing hidden costs: copying or destructing complex objects during resize may dominate. In C++, consider move semantics and exception safety when resizing.
- Miscounting operations in aggregate proofs: carefully bound how often each element participates in expensive work (e.g., each element in a dynamic array is copied only O(1) times across doublings).
Key Formulas
Aggregate Amortized Cost
$$\hat{c} \;=\; \frac{1}{m}\sum_{i=1}^{m} c_i$$
Explanation: The amortized cost per operation is the total actual cost over m operations divided by m. This bounds the average cost for any worst-case sequence of m operations.
Accounting Method Invariant
$$\sum_{i=1}^{t} \hat{c}_i \;\ge\; \sum_{i=1}^{t} c_i \quad \text{for every prefix } t$$
Explanation: The cumulative amortized charges must cover the cumulative actual cost for every prefix t, and the stored credit (balance) must never be negative. This ensures validity of the assigned amortized costs.
Potential Method
$$\hat{c}_i \;=\; c_i + \Phi(D_i) - \Phi(D_{i-1}), \qquad \Phi(D_i) \ge 0,\; \Phi(D_0) = 0$$
Explanation: Define a nonnegative potential function on states. The amortized cost adds the change in potential to the actual cost, letting expensive operations be paid for by previously stored potential.
Telescoping with Potential
$$\sum_{i=1}^{m} \hat{c}_i \;=\; \sum_{i=1}^{m} c_i + \Phi(D_m) - \Phi(D_0) \;\ge\; \sum_{i=1}^{m} c_i$$
Explanation: Summing amortized costs telescopes the potential terms, leaving total actual cost plus final potential. Since potential is nonnegative, the total amortized cost upper-bounds total actual cost.
Dynamic Array Copy Bound
$$\sum_{j=0}^{\lfloor \log_2 n \rfloor} 2^j \;<\; 2n$$
Explanation: Across all doublings while inserting n elements, each capacity level contributes copies summing to less than 2n. Thus total copying is linear and the amortized insertion cost is O(1).
Binary Counter Flip Count
$$\sum_{k=0}^{\lfloor \log_2 m \rfloor} \left\lfloor \frac{m}{2^k} \right\rfloor \;<\; 2m$$
Explanation: Bit k flips every 2^k increments. Summing over all bits shows at most 2m total flips across m increments, giving O(1) amortized flips per increment.
Union–Find Complexity
$$O(m\,\alpha(n))$$
Explanation: For m operations (unions and finds) on n elements with path compression and union by rank/size, the total time is O(m α(n)). The inverse Ackermann function α grows so slowly that α(n) ≤ 4 for all practically conceivable n.
Splay Tree Potential
$$\Phi(T) \;=\; \sum_{x \in T} \log_2 s(x), \quad \text{where } s(x) \text{ is the size of the subtree rooted at } x$$
Explanation: A common potential for splay trees sums the logs of subtree sizes. It proves O(log n) amortized time per access via the potential method.
Multipop Stack Bound
$$\text{total pops} \;\le\; \text{total pushes} \;\le\; m \quad\Longrightarrow\quad \text{total cost} = O(m)$$
Explanation: Each item can be popped at most once after being pushed, so over m operations, total pops are bounded by total pushes. Hence the aggregate time is linear in m.
Cost Caveat
$$\hat{c}_i = O(1) \;\not\Longrightarrow\; c_i = O(1)$$
Explanation: Amortized constant time means the average over any sequence is constant, not that every single operation is constant. Some operations may still take linear time.
Complexity Analysis
Code Examples
```cpp
#include <iostream>
#include <stdexcept>
#include <vector>

// A minimal dynamic array for int that doubles capacity.
// Demonstrates why push_back is O(1) amortized even though resize is expensive.
class DynamicArray {
private:
    int* data_;
    size_t size_;
    size_t capacity_;

    void grow() {
        // Geometric growth: double the capacity (or set to 1 if currently 0)
        size_t newCap = (capacity_ == 0) ? 1 : capacity_ * 2;
        int* newData = new int[newCap];
        // Copy existing elements (this is the expensive step during some pushes)
        for (size_t i = 0; i < size_; ++i) newData[i] = data_[i];
        // Delete old buffer and install the new one
        delete[] data_;
        data_ = newData;
        capacity_ = newCap;
    }

public:
    DynamicArray() : data_(nullptr), size_(0), capacity_(0) {}
    ~DynamicArray() { delete[] data_; }

    // Disable copying for brevity (a full implementation would add copy/move semantics)
    DynamicArray(const DynamicArray&) = delete;
    DynamicArray& operator=(const DynamicArray&) = delete;

    void push_back(int x) {
        if (size_ == capacity_) {
            // Occasional expensive resize: O(size_)
            grow();
        }
        // Usual cheap insertion: O(1)
        data_[size_++] = x;
    }

    int& operator[](size_t i) {
        if (i >= size_) throw std::out_of_range("index out of range");
        return data_[i];
    }

    size_t size() const { return size_; }
    size_t capacity() const { return capacity_; }
};

int main() {
    DynamicArray a;
    // Insert many items; observe occasional capacity jumps (expensive events)
    for (int i = 0; i < 20; ++i) {
        size_t beforeCap = a.capacity();
        a.push_back(i);
        size_t afterCap = a.capacity();
        if (afterCap != beforeCap) {
            std::cout << "Resized: capacity from " << beforeCap
                      << " to " << afterCap << "\n";
        }
    }

    // Verify contents
    std::cout << "Values: ";
    for (size_t i = 0; i < a.size(); ++i) std::cout << a[i] << ' ';
    std::cout << "\n";

    // Amortized note: there were only O(log n) resizes; most pushes were O(1).
    return 0;
}
```
The array doubles capacity when full, copying all elements once per resize. Each element is copied only O(1) times across all doublings, so n pushes perform O(n) total work. Most push_back calls are constant-time; only those that trigger growth are linear in the current size.
```cpp
#include <iostream>
#include <vector>
#include <string>

// A simple binary counter using least significant bit at index 0.
// Increment may flip many bits, but the amortized flips per increment are O(1).
class BinaryCounter {
public:
    std::vector<int> bits; // store 0/1

    BinaryCounter() {}

    // Increment the counter and return the number of bit flips performed.
    int increment() {
        int flips = 0;
        size_t i = 0;
        // Propagate carry through trailing 1s: 1 -> 0 (each flip counted)
        while (i < bits.size() && bits[i] == 1) {
            bits[i] = 0;
            ++flips;
            ++i;
        }
        // Set the first 0 to 1 (or append a new 1 if carry extends past MSB)
        if (i == bits.size()) bits.push_back(1);
        else bits[i] = 1;
        ++flips;
        return flips;
    }

    std::string str() const {
        if (bits.empty()) return "0";
        std::string s;
        for (size_t i = bits.size(); i-- > 0; ) s.push_back('0' + bits[i]);
        return s;
    }
};

int main() {
    BinaryCounter c;
    long long totalFlips = 0;
    int m = 16; // perform 16 increments
    for (int i = 0; i < m; ++i) {
        int flips = c.increment();
        totalFlips += flips;
        std::cout << "After increment " << (i + 1) << ": " << c.str()
                  << ", flips this time = " << flips << "\n";
    }
    std::cout << "Total flips over " << m << " increments = " << totalFlips << "\n";
    // Theory says total flips < 2m, so average flips per increment < 2 (O(1) amortized).
    return 0;
}
```
Increment flips trailing 1s to 0 and then sets the first 0 to 1, sometimes appending a new bit. Although a single increment may flip Θ(log M) bits (worst-case), across m increments each bit k flips about m/2^k times. The total number of flips is less than 2m, yielding O(1) amortized time per increment.
```cpp
#include <iostream>
#include <utility>
#include <vector>

// Disjoint Set Union with path compression and union by rank.
// Supports near-constant amortized time per operation: O(alpha(n)).
class DSU {
    std::vector<int> parent;
    std::vector<int> rank; // approximate tree height
public:
    explicit DSU(int n) : parent(n), rank(n, 0) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }

    int find(int x) {
        // Path compression: recursively set parent to root
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }

    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        // Union by rank: attach shallower tree under deeper one
        if (rank[a] < rank[b]) std::swap(a, b);
        parent[b] = a;
        if (rank[a] == rank[b]) ++rank[a];
        return true;
    }
};

int main() {
    DSU dsu(10);
    dsu.unite(1, 2);
    dsu.unite(2, 3);
    dsu.unite(4, 5);
    dsu.unite(6, 7);
    dsu.unite(5, 6);
    // Multiple finds will become faster as paths compress
    for (int i = 1; i <= 7; ++i) {
        std::cout << "find(" << i << ") = " << dsu.find(i) << "\n";
    }
    // Repeated queries after compression are essentially O(1) amortized
    for (int i = 1; i <= 7; ++i) {
        std::cout << "find(" << i << ") again = " << dsu.find(i) << "\n";
    }
    return 0;
}
```
Union by rank keeps trees shallow and path compression shortens paths during finds. Tarjan’s analysis shows any sequence of m operations on n elements runs in O(m α(n)) time, where α is the inverse Ackermann function. In practice this is effectively constant amortized time per operation.