šŸ“š Theory Ā· Intermediate

Approximation Algorithm Theory

Key Points

  • Approximation algorithms deliver provably near-optimal solutions for NP-hard optimization problems within guaranteed factors.
  • The approximation ratio α bounds how far an algorithm’s solution can be from the optimal, e.g., ALG/OPT ≤ α for minimization.
  • A PTAS returns a (1+ε)-approximate solution in time polynomial in the input size for any fixed ε > 0, while an FPTAS also scales polynomially in 1/ε.
  • Classic results include a 2-approximation for Vertex Cover, an O(log n)-approximation for Set Cover, and a 3/2-approximation for Metric TSP (Christofides).
  • Linear programming (LP) relaxation with rounding and primal–dual methods are two powerful design paradigms for approximation.
  • The PCP Theorem underpins many inapproximability results, explaining why some problems cannot be closely approximated unless P=NP.
  • In AI/ML, approximation algorithms are practical for large-scale combinatorial subproblems such as clustering, feature selection, and scheduling.
  • Choosing ε involves a trade-off: smaller ε yields better accuracy but often increases running time and memory usage.

Prerequisites

  • Time complexity and Big-O notation — To understand polynomial-time guarantees and trade-offs like O(log n) or O(n^2/ε).
  • NP-completeness basics — To know why exact solutions are infeasible and why approximations are needed.
  • Graph theory fundamentals — Vertex Cover, Set Cover incidence graphs, matchings, and MSTs rely on graph concepts.
  • Greedy algorithms and analysis — Greedy methods underpin classic approximations like Set Cover and Vertex Cover.
  • Dynamic programming — Needed for FPTAS approaches (e.g., knapsack value-scaling DP).
  • Linear programming duality (intro) — To understand LP relaxation, integrality gaps, and primal–dual approximation methods.
  • Basic probability (optional) — Useful for randomized rounding and probabilistic analyses.
  • Metric spaces and triangle inequality — Essential for TSP approximations like double-tree and Christofides.

Detailed Explanation

01 Overview

Approximation Algorithm Theory studies how to compute near-optimal solutions for optimization problems that are computationally intractable to solve exactly (NP-hard). Instead of insisting on the best possible solution, we accept solutions within a provable factor of optimal and aim for algorithms that run in polynomial time. The key performance guarantee is the approximation ratio, which quantifies the worst-case distance between the algorithm’s output and the optimal solution. Powerful general techniques—such as linear programming relaxation plus rounding, primal–dual schema, local search with exchange arguments, and metric embeddings—let us design guarantees across diverse problems. Foundational results show both possibilities and limits: many classic NP-hard problems admit tight approximation factors (e.g., Vertex Cover at factor 2, Set Cover at O(log n), and Metric TSP at 3/2), while the PCP Theorem and related hardness-of-approximation results explain why better guarantees are unlikely unless P=NP. Two important families of schemes, PTAS and FPTAS, allow users to tune accuracy via ε>0; a PTAS runs in poly(n) time for any fixed ε, and an FPTAS runs in time polynomial in both n and 1/ε. In practice, approximation algorithms are robust, scalable baselines for large optimization subroutines in systems, networks, and machine learning.

02 Intuition & Analogies

Imagine planning a road trip visiting many cities. Finding the absolute shortest tour (Traveling Salesman Problem) could take prohibitively long on a big map. However, if you accept a tour that’s, say, at most 50% longer than the shortest possible, you can use fast, clever shortcuts: build a tree that touches all cities cheaply, duplicate its edges to make everything even-degree, quickly traverse it, and then skip repeats—voilĆ , a tour within a small factor of optimal. This mindset—settling for a near-best solution to finish on time—is the essence of approximation. Consider shopping on a budget with discount bundles (Set Cover). Each bundle covers several items you need. The greedy rule ā€œbuy the bundle that gives you the most new items per dollarā€ doesn’t always pick the perfect combination, but it ensures you won’t overspend by more than a logarithmic factor compared to the best shopper in the world. Or think of security patrols (Vertex Cover): placing guards on street intersections so every road is watched. If you greedily place guards at the ends of uncovered roads, you may place more than necessary, but never more than twice the optimal number. In all these cases, we use simple, fast strategies that come with safety rails: mathematical proofs guarantee we won’t drift too far from the true optimum. When we can’t afford exactness, we dial in acceptable error (ε). Smaller ε means closer to optimal but more computation—like spending more time planning to save a few extra miles on the trip. These trade-offs make approximation algorithms a practical, principled middle ground between speed and solution quality.
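
The tour-building trick above is the classic ā€œdouble-treeā€ 2-approximation for metric TSP. Below is a minimal sketch written for this article (the function name double_tree_tsp and the sample distance matrix are illustrative, not taken from the article’s own code): build an MST with Prim’s algorithm, then shortcut a preorder walk of the tree; under the triangle inequality the result costs at most twice the optimal tour.

#include <bits/stdc++.h>
using namespace std;

// Double-tree 2-approximation for metric TSP (illustrative sketch).
// 1) Build an MST of the complete metric graph (Prim's algorithm, O(n^2)).
// 2) A preorder walk of the MST visits every city; skipping repeats ("shortcutting")
//    cannot increase cost under the triangle inequality.
// Result: tour cost <= 2 * MST cost <= 2 * OPT.
vector<int> double_tree_tsp(const vector<vector<double>>& dist) {
    int n = (int)dist.size();
    vector<double> best(n, numeric_limits<double>::infinity());
    vector<int> parent(n, -1);
    vector<char> in_tree(n, false);
    best[0] = 0.0;
    for (int it = 0; it < n; ++it) {
        int u = -1;
        for (int v = 0; v < n; ++v)
            if (!in_tree[v] && (u == -1 || best[v] < best[u])) u = v;
        in_tree[u] = true;
        for (int v = 0; v < n; ++v)
            if (!in_tree[v] && dist[u][v] < best[v]) { best[v] = dist[u][v]; parent[v] = u; }
    }
    // Preorder (DFS) walk of the MST starting at city 0 = shortcut Euler tour
    vector<vector<int>> children(n);
    for (int v = 1; v < n; ++v) children[parent[v]].push_back(v);
    vector<int> tour, stk = {0};
    while (!stk.empty()) {
        int u = stk.back(); stk.pop_back();
        tour.push_back(u);
        for (int c : children[u]) stk.push_back(c);
    }
    return tour; // visit cities in this order, then return to tour.front()
}

int main() {
    // Small symmetric instance satisfying the triangle inequality
    vector<vector<double>> dist = {
        {0, 2, 7, 5},
        {2, 0, 6, 4},
        {7, 6, 0, 3},
        {5, 4, 3, 0}
    };
    vector<int> tour = double_tree_tsp(dist);
    double cost = 0.0;
    for (size_t i = 0; i < tour.size(); ++i)
        cost += dist[tour[i]][tour[(i + 1) % tour.size()]];
    cout << "Tour:";
    for (int c : tour) cout << ' ' << c;
    cout << "\nCost: " << cost << " (guaranteed <= 2 * OPT on metric inputs)\n";
    return 0;
}

Christofides’ algorithm tightens the factor to 3/2 by replacing the doubled edges with a minimum-weight perfect matching on the MST’s odd-degree vertices.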

03 Formal Definition

For an optimization problem with feasible solutions S and objective cost function c(Ā·), let OPT(I) be the optimal value for instance I. An algorithm A has approximation ratio α β‰„ 1 for a minimization problem if, for all instances I, its output satisfies c(A(I)) ≤ α Ā· OPT(I). For a maximization problem the ratio flips: OPT(I) ≤ α Ā· c(A(I)), or equivalently c(A(I)) β‰„ OPT(I)/α. The worst-case ratio over all instances is the performance guarantee; we often denote the instance-specific ratio by ρ_A(I) and the global bound by sup_I ρ_A(I). A Polynomial-Time Approximation Scheme (PTAS) is a family of algorithms {A_ε | ε > 0} such that for any fixed ε > 0, A_ε runs in time polynomial in the input size n and returns a (1+ε)-approximate solution for minimization (or a (1āˆ’Īµ)-approximate solution for maximization). A Fully Polynomial-Time Approximation Scheme (FPTAS) further requires the runtime to be polynomial in both n and 1/ε. The class APX contains problems admitting some constant-factor approximation; APX-hard problems admit no PTAS unless P=NP. Reductions tailored to approximation (e.g., L-reductions) preserve approximability. In LP-based approaches, we compare the integral optimum to a relaxed fractional optimum; the integrality gap quantifies the worst-case ratio between the integral and the relaxed (fractional) optima. Hardness-of-approximation results, often grounded in the PCP Theorem, set lower bounds on the achievable α unless P=NP.
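
In compact notation, the same definitions read (a restatement of the paragraph above, using the symbols already introduced):

\[
\rho_A(I) = \frac{c(A(I))}{\mathrm{OPT}(I)}, \qquad
\alpha = \sup_{I} \rho_A(I) \quad \text{(minimization; for maximization use } \mathrm{OPT}(I)/c(A(I))\text{)}
\]
\[
\text{PTAS: } \forall \text{ fixed } \varepsilon > 0,\; c(A_\varepsilon(I)) \le (1+\varepsilon)\,\mathrm{OPT}(I) \text{ in time } \mathrm{poly}(n); \qquad
\text{FPTAS: time } \mathrm{poly}(n, 1/\varepsilon)
\]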

04 When to Use

Use approximation algorithms when the exact version is NP-hard and instance sizes are large enough that exact exponential or even pseudo-polynomial algorithms are impractical. They are ideal when you need predictable solution quality and running time (e.g., service-level objectives), or when you must solve many similar instances repeatedly (e.g., daily logistics, real-time bidding). Approximation is also valuable as a baseline or warm-start for stronger but slower methods (e.g., feeding a rounded LP solution to a local search or MIP solver). In AI/ML, many subproblems are NP-hard: clustering (k-median/k-means), feature selection (sparse subset selection), graphical model inference, dictionary learning, facility location, and scheduling for distributed training. Approximation algorithms deliver scalable, theory-backed solutions when data sizes are huge. They also help with interpretability: guarantees like O(log n) or (1+ε) make trade-offs explicit. Choose PTAS/FPTAS when you can tune ε and accept extra computation for slightly better quality; use constant-factor or logarithmic approximations for massive, time-critical contexts. Avoid approximation algorithms when exact optimality is required for correctness, when problem sizes are tiny (exact is cheap), or when structure allows exact polynomial-time algorithms (e.g., metric special cases, matroids).

āš ļøCommon Mistakes

• Mixing up ratio directions: for minimization use ALG/OPT ≤ α; for maximization use OPT/ALG ≤ α. Reporting the inverse leads to incorrect claims.
• Ignoring preconditions: the 3/2 (Christofides) and 2-approximation (double-tree) TSP guarantees require the triangle inequality (a metric); on arbitrary distances the guarantees break (see the check sketched after this list). Similarly, greedy Set Cover’s bound assumes nonnegative costs and a correct count of uncovered elements.
• Confusing heuristics with approximations: a heuristic with no worst-case bound is not an approximation algorithm. Claims like ā€œworks well in practiceā€ do not imply a provable ratio.
• Mismanaging ε in a PTAS/FPTAS: forgetting that runtime may blow up as ε→0, or mis-scaling values in the knapsack FPTAS so that K=0 or the DP state space explodes.
• Overlooking integrality gaps: rounding a strong LP may still be far from the integral optimum; ignoring the gap can produce invalid guarantees.
• Failing to ensure maximality or feasibility: the 2-approximation Vertex Cover argument relies on choosing both endpoints of edges that form a maximal matching; if you allow overlapping edges without care, the argument fails. For Set Cover, not correctly tracking uncovered elements can silently invalidate the bound.
• Misreading hardness: PCP-based lower bounds often rule out any approximation better than a specific threshold unless P=NP; claiming such an algorithm requires either new complexity assumptions or a proof error.
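
As a concrete guard against the ā€œignoring preconditionsā€ mistake, a small check such as the sketch below (the name is_metric is illustrative, not a library function) verifies symmetry, a zero diagonal, and the triangle inequality before a metric-TSP bound is trusted:

#include <bits/stdc++.h>
using namespace std;

// Returns true iff the distance matrix is symmetric, has a zero diagonal,
// and satisfies the triangle inequality d(i,k) <= d(i,j) + d(j,k).
// O(n^3); run it on test instances before relying on metric-TSP guarantees.
bool is_metric(const vector<vector<double>>& d, double tol = 1e-9) {
    int n = (int)d.size();
    for (int i = 0; i < n; ++i) {
        if (fabs(d[i][i]) > tol) return false;
        for (int j = 0; j < n; ++j) {
            if (fabs(d[i][j] - d[j][i]) > tol) return false;
            for (int k = 0; k < n; ++k)
                if (d[i][k] > d[i][j] + d[j][k] + tol) return false;
        }
    }
    return true;
}

int main() {
    vector<vector<double>> good = {{0,1,2},{1,0,1},{2,1,0}};
    vector<vector<double>> bad  = {{0,1,5},{1,0,1},{5,1,0}}; // 5 > 1 + 1 violates the inequality
    cout << boolalpha << is_metric(good) << ' ' << is_metric(bad) << "\n"; // prints: true false
    return 0;
}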

Key Formulas

Approximation Ratio (Minimization)

c(A(I)) ≤ α Ā· OPT(I) for all instances I

Explanation: For any instance I, the algorithm’s cost is at most α times the optimal cost. This bounds worst-case suboptimality from above.

Approximation Ratio (Maximization)

c(A(I)) β‰„ OPT(I) / α for all instances I

Explanation: For maximization, the algorithm’s value is at least 1/α of optimal. It lower-bounds solution quality.

PTAS Definition

For every fixed ε > 0: A_ε returns a (1+ε)-approximation (minimization) or (1āˆ’Īµ)-approximation (maximization) in time polynomial in n, e.g., n^{f(1/ε)}

Explanation: A PTAS provides a tunable accuracy parameter ε. For any fixed ε > 0, the runtime is polynomial in the input size (though the degree may depend on ε).

FPTAS Definition

A_ε returns a (1±ε)-approximate solution in time poly(n, 1/ε)

Explanation: An FPTAS has polynomial dependence on both the input size and the inverse accuracy parameter 1/ε. This makes fine-grained accuracy still feasible.

Harmonic Number Bound

H_n = 1 + 1/2 + ⋯ + 1/n ≤ ln n + 1

Explanation: The harmonic number grows logarithmically. It’s used to show greedy Set Cover achieves an O(log n)-approximation.
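
A quick numerical check of the bound, for illustration only:

#include <bits/stdc++.h>
using namespace std;

// Compare H_n with ln(n) + 1 for a few n to see the logarithmic growth.
int main() {
    for (int n : {10, 100, 1000, 1000000}) {
        double H = 0.0;
        for (int k = 1; k <= n; ++k) H += 1.0 / k;
        cout << "n=" << n << "  H_n=" << H << "  ln(n)+1=" << log((double)n) + 1 << "\n";
    }
    return 0;
}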

Greedy Set Cover Approximation

cost(Greedy) ≤ H_n Ā· OPT ≤ (ln n + 1) Ā· OPT

Explanation: Here n is the universe size; the tighter bound H_d uses the largest set size d. Greedy’s cost is within a logarithmic factor of optimal. (A separate LP-based algorithm achieves factor f, the maximum frequency of any element.)

Vertex Cover 2-Approximation

|C_ALG| ≤ 2 Ā· |C_OPT|

Explanation: Selecting both endpoints of each edge in a maximal matching yields a cover of size at most twice the minimum cover, since any cover must hit all matching edges.

Metric TSP 2-Approximation

cost(tour) ≤ 2 Ā· cost(MST) ≤ 2 Ā· OPT

Explanation: Twice the MST plus shortcutting yields a tour at most twice optimal under the triangle inequality. It’s simpler than Christofides’ algorithm.

Christofides’ Bound

cost(tour) ≤ (3/2) Ā· OPT

Explanation: For metric TSP, MST + minimum perfect matching on odd-degree vertices + Euler tour + shortcutting gives a 1.5-approximation.

Integrality Gap

gap = sup_I OPT_IP(I) / OPT_LP(I) (for minimization)

Explanation: Measures worst-case loss when rounding an LP solution. A large gap limits the best achievable approximation via that relaxation.
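
A standard worked example, stated here in LaTeX and assuming the usual LP relaxation of Vertex Cover: on the complete graph K_n, setting every x_v = 1/2 is feasible with value n/2, while any integral cover needs nāˆ’1 vertices, so the gap of this relaxation tends to 2.

\[
\min \sum_{v \in V} x_v \quad \text{s.t.} \quad x_u + x_v \ge 1 \;\; \forall \{u,v\} \in E, \quad 0 \le x_v \le 1
\]
\[
\text{On } K_n: \quad \frac{\mathrm{OPT}_{IP}}{\mathrm{OPT}_{LP}} = \frac{n-1}{n/2} = 2 - \frac{2}{n} \;\longrightarrow\; 2
\]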

L-Reduction

OPT_2(f(x)) ≤ α Ā· OPT_1(x) and |c_1(g(y)) āˆ’ OPT_1(x)| ≤ β Ā· |c_2(y) āˆ’ OPT_2(f(x))|

Explanation: Relates objective values and errors between problems 1 and 2 via the instance mapping f and the solution mapping g, with constants α and β. Preserves hardness of approximation factors.

Knapsack FPTAS Scaling

K = ε Ā· LB / n, where LB is a lower bound on OPT with OPT/2 ≤ LB ≤ OPT (e.g., the better of the density greedy and the best single item)

Explanation: Scale item values down by K so the DP over total scaled value has O(n/ε) states. Rounding loses at most n·K = ε·LB ≤ ε·OPT, yielding a (1āˆ’Īµ)-approximate solution in polynomial time.

PCP Theorem (Canonical Form)

NP = PCP[O(log n), O(1)]

Explanation: Problems in NP have proofs verifiable with logarithmic randomness and a constant number of queries. This yields strong inapproximability results.

Complexity Analysis

Approximation algorithms typically offer polynomial runtimes with explicit quality–time trade-offs. For Vertex Cover’s 2-approximation via maximal matching, a single scan of the edges suffices: time O(m) and space O(n), where n is the number of vertices and m the number of edges. Greedy Set Cover, implemented naively by recomputing ā€œnewly coveredā€ counts each iteration, costs up to O(mn) per round and O(mn²) overall; with bitsets the per-round counting drops to about O(mn/w) on the word-RAM (w is the word size), and with incremental bookkeeping, as in the code below, the total work is roughly O(R + mn), where R is the total input size. Space is O(mn) in the worst case to store the sets, but practical representations (sparse vectors) use O(total input size). For the knapsack FPTAS via value scaling, we select K = ε·LB/n, where LB is a constant-factor lower bound on OPT (e.g., from the density greedy), so the total scaled value of any feasible solution is O(n/ε). A standard 1D DP over scaled value then runs in O(n Ā· (n/ε)) = O(n^2/ε) time and uses O(n/ε) space. The output value (after rescaling) is within a factor (1āˆ’Īµ) of optimal. Smaller ε improves accuracy but increases both time and memory roughly in proportion to 1/ε. In general, PTAS algorithms may run in time n^{f(1/ε)} for some function f, which is polynomial for any fixed ε but can be very large if ε is small. Some approximation algorithms also depend on structural parameters (e.g., maximum element frequency in Set Cover), so practical performance hinges on both input size and problem structure. Importantly, all guarantees are worst-case; typical performance may be better.

Code Examples

Vertex Cover 2-Approximation via Maximal Matching
#include <bits/stdc++.h>
using namespace std;

// 2-approximation for Vertex Cover on an undirected graph.
// Idea: Scan edges; whenever we find an uncovered edge (u,v),
// add both u and v to the cover. The set of such edges is a matching,
// so |cover| <= 2 * |maximum matching| <= 2 * |optimal cover|.

vector<int> vertex_cover_2approx(int n, const vector<pair<int,int>>& edges) {
    vector<char> in_cover(n, false);
    for (auto [u, v] : edges) {
        if (!in_cover[u] && !in_cover[v]) {
            in_cover[u] = in_cover[v] = true; // pick both endpoints
        }
    }
    vector<int> cover;
    for (int i = 0; i < n; ++i) if (in_cover[i]) cover.push_back(i);
    return cover;
}

int main() {
    // Example graph (0-based): a triangle 0-1-2-0 plus the edge 2-3
    int n = 4;
    vector<pair<int,int>> edges = {{0,1},{1,2},{2,0},{2,3}};

    auto cover = vertex_cover_2approx(n, edges);
    cout << "Approximate vertex cover (size=" << cover.size() << "):";
    for (int v : cover) cout << ' ' << v;
    cout << "\n";

    // Verify every edge is covered (for demonstration)
    bool ok = true;
    vector<char> in_cover(n, false);
    for (int v : cover) in_cover[v] = true;
    for (auto [u,v] : edges) if (!in_cover[u] && !in_cover[v]) ok = false;
    cout << "All edges covered? " << (ok ? "yes" : "no") << "\n";
    return 0;
}

The algorithm greedily builds a maximal matching implicitly by selecting both endpoints of any still-uncovered edge. Any vertex cover must include at least one endpoint from each matching edge, so the cover size is at most twice optimal.

Time: O(m) for m edges
Space: O(n) for n vertices

Greedy Set Cover (O(log n)-approximation)
#include <bits/stdc++.h>
using namespace std;

// Greedy Set Cover: repeatedly pick the set with the best (newly covered elements)/cost ratio.
// For unit costs, this is equivalent to picking the set that covers the largest number of uncovered elements.
// Guarantee: O(log n)-approximation (n = universe size) via the harmonic number bound.

struct SetEntry {
    int id;             // must equal the set's index in the 'sets' vector (used to index remaining_new)
    double cost;        // can be 1.0 for unit-cost
    vector<int> elems;  // elements covered by this set (0..n-1)
};

vector<int> greedy_set_cover(int n, const vector<SetEntry>& sets) {
    vector<char> covered(n, false);
    int covered_cnt = 0;
    vector<int> chosen;

    // To speed up counting newly covered elements, maintain a per-set count updated each round.
    vector<vector<int>> elem_in_sets(n);
    for (const auto& s : sets) for (int e : s.elems) elem_in_sets[e].push_back(s.id);

    vector<int> remaining_new(sets.size(), 0);
    for (size_t j = 0; j < sets.size(); ++j) {
        remaining_new[j] = (int)sets[j].elems.size();
    }

    // Greedy iterations until all covered or no progress
    while (covered_cnt < n) {
        int best = -1; double best_score = -1.0; // score = newly_covered / cost
        for (size_t j = 0; j < sets.size(); ++j) {
            if (remaining_new[j] <= 0) continue; // no benefit
            double score = remaining_new[j] / max(1e-12, sets[j].cost);
            if (score > best_score) { best_score = score; best = (int)j; }
        }
        if (best == -1) break; // cannot cover remaining elements

        // Choose set 'best'
        chosen.push_back(best);
        // Update covered and remaining_new counts
        for (int e : sets[best].elems) {
            if (!covered[e]) {
                covered[e] = true;
                ++covered_cnt;
                // Elements newly covered reduce remaining_new for all sets containing e
                for (int sId : elem_in_sets[e]) {
                    if (remaining_new[sId] > 0) remaining_new[sId]--;
                }
            }
        }
        // Mark this set as spent
        remaining_new[best] = 0;
    }

    // Optional: verify feasibility
    if (covered_cnt < n) {
        cerr << "Warning: Universe not fully covered (uncoverable elements?).\n";
    }
    return chosen;
}

int main() {
    // Universe U = {0,1,2,3,4}
    int n = 5;
    vector<SetEntry> sets;
    // Define sets with unit cost (id equals index)
    sets.push_back({0, 1.0, {0,1}});
    sets.push_back({1, 1.0, {1,2,3}});
    sets.push_back({2, 1.0, {3,4}});
    sets.push_back({3, 1.0, {0,2,4}});

    auto chosen = greedy_set_cover(n, sets);
    double total_cost = 0.0;
    cout << "Chosen sets:";
    for (int id : chosen) { cout << ' ' << id; total_cost += sets[id].cost; }
    cout << "\nTotal cost: " << total_cost << "\n";

    return 0;
}

The greedy algorithm iteratively picks the set with the best cost-effectiveness (new coverage per unit cost). The harmonic-number analysis shows the total cost is within an O(log n) factor of the optimal set cover.

Time: O(R + n + m + IĀ·m), where R is the total input size, m the number of sets, and I ≤ n the number of greedy iterations; a naive version that recounts coverage from scratch each round costs O(R) per round, up to O(nĀ·R) overall.
Space: O(R + n + m), proportional to the input size plus bookkeeping

Knapsack FPTAS via Value Scaling ((1āˆ’Īµ)-approximation)
#include <bits/stdc++.h>
using namespace std;

// FPTAS for 0/1 Knapsack using value scaling.
// Given n items (value v[i], weight w[i]), capacity W, and accuracy epsilon:
//   1) Discard items heavier than W and compute a lower bound LB on OPT with
//      OPT/2 <= LB <= OPT via the classic density greedy (plus the best single item).
//   2) Scale values by K = epsilon * LB / n. Any feasible solution then has total
//      scaled value at most 2*LB/K = 2n/epsilon, so the DP has O(n/epsilon) states.
//   3) DP[v'] = minimum weight needed to achieve total scaled value v'.
// Rounding loses at most K per item, i.e., at most n*K = epsilon*LB <= epsilon*OPT
// in total, so the returned value is at least (1 - epsilon) * OPT.

struct Item { long long value; long long weight; };

// Density greedy plus best single item: returns LB with OPT/2 <= LB <= OPT
// (assumes every item in 'items' fits on its own, i.e., weight <= W).
long long greedy_lower_bound(const vector<Item>& items, long long W) {
    vector<int> idx(items.size());
    iota(idx.begin(), idx.end(), 0);
    sort(idx.begin(), idx.end(), [&](int a, int b) {
        // compare value/weight densities by cross-multiplication
        return (long double)items[a].value * items[b].weight >
               (long double)items[b].value * items[a].weight;
    });
    long long remaining = W, greedy_val = 0, best_single = 0;
    for (int i : idx) {
        best_single = max(best_single, items[i].value);
        if (items[i].weight <= remaining) {
            remaining -= items[i].weight;
            greedy_val += items[i].value;
        }
    }
    return max(greedy_val, best_single);
}

pair<long long,long long> knapsack_fptas(const vector<Item>& all_items, long long W, double epsilon) {
    // Items heavier than the capacity can never be chosen; drop them up front.
    vector<Item> items;
    for (const auto& it : all_items) if (it.weight <= W) items.push_back(it);
    int n = (int)items.size();
    if (n == 0) return {0LL, 0LL};
    if (epsilon <= 0 || epsilon > 1) epsilon = 0.1; // fallback for out-of-range epsilon

    long long LB = greedy_lower_bound(items, W);
    if (LB == 0) return {0LL, 0LL}; // every item has value 0

    double K = (epsilon * (double)LB) / n; // scaling factor
    // Feasible solutions have value <= OPT <= 2*LB, hence scaled value <= 2n/epsilon.
    long long VCAP = (long long)ceil(2.0 * n / epsilon) + 1;

    vector<long long> sv(n); // scaled values
    for (int i = 0; i < n; ++i) sv[i] = (long long)floor(items[i].value / K);

    const long long INF = (1LL << 60);
    vector<long long> dp(VCAP + 1, INF); // dp[v'] = min weight to reach scaled value v'
    dp[0] = 0;

    // 1D DP over scaled value (iterate downward so each item is used at most once)
    for (int i = 0; i < n; ++i) {
        if (sv[i] == 0) continue; // contributes nothing to the scaled objective
        for (long long v = VCAP - sv[i]; v >= 0; --v) {
            if (dp[v] == INF) continue;
            dp[v + sv[i]] = min(dp[v + sv[i]], dp[v] + items[i].weight);
        }
    }

    // Best achievable scaled value within capacity W
    long long bestSV = 0;
    for (long long v = 0; v <= VCAP; ++v) if (dp[v] <= W) bestSV = v;

    long long approxValue = (long long)floor(bestSV * K); // lower bound on the value achieved
    long long minWeightForBest = dp[bestSV];
    return {approxValue, minWeightForBest};
}

int main() {
    // Example: capacity W=10, items with (value, weight)
    vector<Item> items = {{10, 5}, {7, 3}, {12, 7}, {8, 4}};
    long long W = 10;
    double eps = 0.2; // 20% accuracy parameter

    auto [approxV, usedW] = knapsack_fptas(items, W, eps);
    cout << "Approximate value >= (1-" << eps << ")*OPT: " << approxV
         << " (weight used: " << usedW << ")\n";
    return 0;
}

Values are scaled by K = ε·LB/n, where LB is a greedy lower bound satisfying LB ≤ OPT ≤ 2Ā·LB, so the DP state (total scaled value) is O(n/ε). The DP computes the minimum weight needed to realize each scaled value, and the best feasible state yields a (1āˆ’Īµ)-approximate solution when rescaled.

Time: O(n · (n/ε)) = O(n^2/ε) for the DP, plus O(n log n) for the greedy lower bound
Space: O(n/ε)