Approximation Algorithm Theory
Key Points
- Approximation algorithms deliver provably near-optimal solutions for NP-hard optimization problems within guaranteed factors.
- The approximation ratio α bounds how far an algorithm's solution can be from optimal, e.g., ALG/OPT ≤ α for minimization.
- A PTAS returns a (1+ε)-approximation in time polynomial in the input size for any fixed ε > 0, while an FPTAS also scales polynomially in 1/ε.
- Classic results include a 2-approximation for Vertex Cover, an O(log n)-approximation for Set Cover, and a 3/2-approximation for Metric TSP (Christofides).
- Linear programming (LP) relaxation with rounding and primal–dual methods are two powerful design paradigms for approximation.
- The PCP Theorem underpins many inapproximability results, explaining why some problems cannot be closely approximated unless P=NP.
- In AI/ML, approximation algorithms are practical for large-scale combinatorial subproblems such as clustering, feature selection, and scheduling.
- Choosing ε involves a trade-off: smaller ε yields better accuracy but often increases running time and memory usage.
Prerequisites
- Time complexity and Big-O notation – To understand polynomial-time guarantees and trade-offs like O(log n) or O(n^2/ε).
- NP-completeness basics – To know why exact solutions are infeasible and why approximations are needed.
- Graph theory fundamentals – Vertex Cover, Set Cover incidence graphs, matchings, and MSTs rely on graph concepts.
- Greedy algorithms and analysis – Greedy methods underpin classic approximations like Set Cover and Vertex Cover.
- Dynamic programming – Needed for FPTAS approaches (e.g., the knapsack value-scaling DP).
- Linear programming duality (intro) – To understand LP relaxation, integrality gaps, and primal–dual approximation methods.
- Basic probability (optional) – Useful for randomized rounding and probabilistic analyses.
- Metric spaces and the triangle inequality – Essential for TSP approximations like double-tree and Christofides.
Detailed Explanation
01 Overview
Approximation Algorithm Theory studies how to compute near-optimal solutions for optimization problems that are computationally intractable to solve exactly (NP-hard). Instead of insisting on the best possible solution, we accept solutions within a provable factor of optimal and aim for algorithms that run in polynomial time. The key performance guarantee is the approximation ratio, which quantifies the worst-case distance between the algorithm's output and the optimal solution. Powerful general techniques, such as linear programming relaxation plus rounding, the primal–dual schema, local search with exchange arguments, and metric embeddings, let us design guarantees across diverse problems. Foundational results show both possibilities and limits: many classic NP-hard problems admit well-understood approximation factors (e.g., Vertex Cover at factor 2, Set Cover at O(log n), and Metric TSP at 3/2), while the PCP Theorem and related hardness-of-approximation results explain why better guarantees are unlikely unless P=NP. Two important families of schemes, PTAS and FPTAS, allow users to tune accuracy via ε>0; a PTAS runs in poly(n) time for any fixed ε, and an FPTAS runs in time polynomial in both n and 1/ε. In practice, approximation algorithms are robust, scalable baselines for large optimization subroutines in systems, networks, and machine learning.
02 Intuition & Analogies
Imagine planning a road trip visiting many cities. Finding the absolute shortest tour (Traveling Salesman Problem) could take prohibitively long on a big map. However, if you accept a tour that's, say, at most twice as long as the shortest possible, you can use fast, clever shortcuts: build a tree that touches all cities cheaply, duplicate its edges to make everything even-degree, quickly traverse it, and then skip repeats. Voilà: a tour within a small factor of optimal. This mindset, settling for a near-best solution to finish on time, is the essence of approximation. Consider shopping on a budget with discount bundles (Set Cover). Each bundle covers several items you need. The greedy rule "buy the bundle that gives you the most new items per dollar" doesn't always pick the perfect combination, but it ensures you won't overspend by more than a logarithmic factor compared to the best shopper in the world. Or think of security patrols (Vertex Cover): placing guards on street intersections so every road is watched. If you greedily place guards at the ends of uncovered roads, you may place more than necessary, but never more than twice the optimal number. In all these cases, we use simple, fast strategies that come with safety rails: mathematical proofs guarantee we won't drift too far from the true optimum. When we can't afford exactness, we dial in acceptable error (ε). Smaller ε means closer to optimal but more computation, like spending more time planning to save a few extra miles on the trip. These trade-offs make approximation algorithms a practical, principled middle ground between speed and solution quality.
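The road-trip shortcut described above is the double-tree 2-approximation for metric TSP (build an MST, duplicate its edges, traverse, skip repeats). The sketch below is a minimal illustration of that idea, assuming a complete distance matrix that satisfies the triangle inequality; the function names and the tiny example instance are illustrative choices, not taken from this text.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Double-tree 2-approximation for metric TSP:
// build an MST, walk it in preorder, and "shortcut" past repeated vertices.
// The triangle inequality guarantees cost(tour) <= 2 * OPT.

// Prim's algorithm on a complete graph given by a distance matrix.
// Returns parent[v] for each vertex (parent[0] = -1 for the root).
vector<int> mst_parents(const vector<vector<double>>& d) {
    int n = (int)d.size();
    vector<double> best(n, numeric_limits<double>::infinity());
    vector<int> parent(n, -1);
    vector<char> used(n, false);
    best[0] = 0.0;
    for (int iter = 0; iter < n; ++iter) {
        int u = -1;
        for (int v = 0; v < n; ++v)
            if (!used[v] && (u == -1 || best[v] < best[u])) u = v;
        used[u] = true;
        for (int v = 0; v < n; ++v)
            if (!used[v] && d[u][v] < best[v]) { best[v] = d[u][v]; parent[v] = u; }
    }
    return parent;
}

// Preorder walk of the MST rooted at vertex 0; visiting each city only the
// first time it appears is exactly the shortcutting step.
vector<int> tsp_double_tree(const vector<vector<double>>& d) {
    int n = (int)d.size();
    vector<int> parent = mst_parents(d);
    vector<vector<int>> children(n);
    for (int v = 1; v < n; ++v) children[parent[v]].push_back(v);

    vector<int> tour, st = {0};
    while (!st.empty()) {
        int u = st.back(); st.pop_back();
        tour.push_back(u);
        for (auto it = children[u].rbegin(); it != children[u].rend(); ++it)
            st.push_back(*it);
    }
    return tour;  // visiting order; close the cycle by returning to tour[0]
}

int main() {
    // Four cities on a line with distance |i - j| (a metric).
    int n = 4;
    vector<vector<double>> d(n, vector<double>(n));
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) d[i][j] = abs(i - j);

    vector<int> tour = tsp_double_tree(d);
    double cost = 0.0;
    for (int i = 0; i < n; ++i) cost += d[tour[i]][tour[(i + 1) % n]];

    cout << "Tour:";
    for (int v : tour) cout << ' ' << v;
    cout << "\nCost: " << cost << " (at most 2 * OPT under the triangle inequality)\n";
    return 0;
}
```

The preorder walk plays the role of the duplicated-edge traversal: skipping already-visited cities never increases length under the triangle inequality, so the tour costs at most twice the MST and hence at most twice the optimal tour.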
03 Formal Definition
An algorithm A is an α-approximation algorithm for a minimization problem if it runs in polynomial time and, for every instance I, outputs a feasible solution with cost A(I) ≤ α · OPT(I); for maximization the requirement is A(I) ≥ OPT(I)/α. A polynomial-time approximation scheme (PTAS) is a family of algorithms that, for every fixed ε > 0, returns a (1+ε)-approximation (or (1−ε)-approximation for maximization) in time polynomial in the input size n, where the exponent may depend on ε. A fully polynomial-time approximation scheme (FPTAS) achieves the same guarantee in time polynomial in both n and 1/ε.
04 When to Use
Use approximation algorithms when the exact version is NP-hard and instance sizes are large enough that exact exponential or even pseudo-polynomial algorithms are impractical. They are ideal when you need predictable solution quality and running time (e.g., service-level objectives), or when you must solve many similar instances repeatedly (e.g., daily logistics, real-time bidding). Approximation is also valuable as a baseline or warm-start for stronger but slower methods (e.g., feeding a rounded LP solution to a local search or MIP solver). In AI/ML, many subproblems are NP-hard: clustering (k-median/k-means), feature selection (sparse subset selection), graphical model inference, dictionary learning, facility location, and scheduling for distributed training. Approximation algorithms deliver scalable, theory-backed solutions when data sizes are huge. They also help with interpretability: guarantees like O(log n) or (1+ε) make trade-offs explicit. Choose PTAS/FPTAS when you can tune ε and accept extra computation for slightly better quality; use constant-factor or logarithmic approximations for massive, time-critical contexts. Avoid approximation algorithms when exact optimality is required for correctness, when problem sizes are tiny (exact is cheap), or when structure allows exact polynomial-time algorithms (e.g., metric special cases, matroids).
⚠️ Common Mistakes
- Mixing up ratio directions: for minimization use ALG/OPT ≤ α; for maximization use OPT/ALG ≤ α. Reporting the inverse leads to incorrect claims.
- Ignoring preconditions: the 3/2 (Christofides) and 2-approximation (double-tree) TSP guarantees require the triangle inequality (a metric). On arbitrary distances, the guarantees break. Similarly, greedy Set Cover's bound assumes nonnegative costs and correct counting of uncovered elements.
- Confusing heuristics with approximations: a heuristic with no worst-case bound is not an approximation algorithm. Claims like "works well in practice" do not imply a provable ratio.
- Mismanaging ε in a PTAS/FPTAS: forgetting that runtime may blow up as ε → 0, or mis-scaling values in the knapsack FPTAS so that K = 0 or the DP state explodes.
- Overlooking integrality gaps: rounding a strong LP may still be far from the integral optimum; ignoring the gap can produce invalid guarantees.
- Failing to ensure maximality or feasibility: the 2-approximation Vertex Cover argument relies on choosing both endpoints of edges that form a maximal matching; if you accidentally allow overlapping edges without care, the argument fails. For Set Cover, not correctly tracking uncovered elements can silently invalidate the bound.
- Misreading hardness: PCP-based lower bounds often rule out any approximation better than a specific factor unless P=NP; claiming such an algorithm requires either new complexity assumptions or a proof error.
Key Formulas
Approximation Ratio (Minimization)
ALG(I) ≤ α · OPT(I) for every instance I, with α ≥ 1
Explanation: For any instance I, the algorithm's cost is at most α times the optimal cost. This bounds worst-case suboptimality from above.
Approximation Ratio (Maximization)
ALG(I) ≥ OPT(I) / α for every instance I, with α ≥ 1
Explanation: For maximization, the algorithm's value is at least 1/α of optimal. It lower-bounds solution quality.
PTAS Definition
For every fixed ε > 0: ALG(I) ≤ (1 + ε) · OPT(I), computed in time poly(n) (the exponent may depend on ε, e.g., O(n^{1/ε}))
Explanation: A PTAS provides a tunable accuracy parameter ε. For any fixed ε > 0, the runtime is polynomial in the input size.
FPTAS Definition
ALG(I) ≤ (1 + ε) · OPT(I), computed in time poly(n, 1/ε)
Explanation: An FPTAS has polynomial dependence on both the input size and the inverse accuracy parameter 1/ε. This makes fine-grained accuracy still feasible.
Harmonic Number Bound
H_n = 1 + 1/2 + 1/3 + ... + 1/n ≤ ln n + 1
Explanation: The harmonic number grows logarithmically. It's used to show greedy Set Cover achieves an O(log n)-approximation.
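A quick sketch of why this bound holds, comparing the sum with an integral:

```latex
H_n \;=\; 1 + \sum_{k=2}^{n} \frac{1}{k}
\;\le\; 1 + \int_{1}^{n} \frac{dx}{x}
\;=\; 1 + \ln n,
\qquad
H_n \;\ge\; \int_{1}^{n+1} \frac{dx}{x} \;=\; \ln(n+1),
```

so H_n = Θ(log n), which is exactly the factor that appears in the greedy Set Cover guarantee.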
Greedy Set Cover Approximation
cost(greedy) ≤ H_d · OPT ≤ (ln n + 1) · OPT, where d = max_S |S| ≤ n
Explanation: Here d is the size of the largest set and n is the universe size, so greedy's cost is within a logarithmic factor of optimal. (A separate f-approximation, where f is the maximum number of sets any element appears in, comes from LP rounding or primal–dual methods rather than from greedy.)
Vertex Cover 2-Approximation
|C| = 2 |M| ≤ 2 · OPT, where M is the maximal matching formed by the selected edges
Explanation: Selecting both endpoints of each edge in a maximal matching yields a cover of size at most twice the minimum cover, since any cover must hit all matching edges.
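Spelled out, with M the matching formed by the selected edges and C the returned cover:

```latex
|C| \;=\; 2\,|M|,
\qquad
\mathrm{OPT} \;\ge\; |M|
\;\;\text{(any cover needs a distinct vertex for each edge of } M\text{)},
\qquad
\Rightarrow\;\; |C| \;\le\; 2\,\mathrm{OPT}.
```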
Metric TSP 2-Approximation
cost(tour) ≤ 2 · cost(MST) ≤ 2 · OPT
Explanation: Doubling the MST and shortcutting yields a tour at most twice optimal under the triangle inequality. It's simpler than Christofides' algorithm.
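The two inequalities come from shortcutting the walk around the doubled tree (which never increases cost under the triangle inequality) and from the fact that deleting any edge of an optimal tour leaves a spanning tree:

```latex
\mathrm{cost}(\text{tour})
\;\le\; \mathrm{cost}(\text{Euler walk of doubled MST})
\;=\; 2\,\mathrm{cost}(\mathrm{MST})
\;\le\; 2\,\mathrm{OPT},
\qquad
\text{since } \mathrm{cost}(\mathrm{MST}) \;\le\; \mathrm{OPT}.
```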
Christofides' Bound
cost(tour) ≤ cost(MST) + cost(M_odd) ≤ OPT + OPT/2 = (3/2) · OPT
Explanation: For metric TSP, MST + minimum-weight perfect matching on the odd-degree vertices + Euler tour + shortcutting gives a 1.5-approximation.
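The only subtle step is bounding the matching: shortcutting an optimal tour to the set O of odd-degree MST vertices splits it into two perfect matchings on O, the cheaper of which costs at most OPT/2, so the minimum-weight matching M_odd is no more expensive:

```latex
\mathrm{cost}(M_{\mathrm{odd}}) \;\le\; \tfrac{1}{2}\,\mathrm{OPT},
\qquad
\mathrm{cost}(\text{tour})
\;\le\; \mathrm{cost}(\mathrm{MST}) + \mathrm{cost}(M_{\mathrm{odd}})
\;\le\; \mathrm{OPT} + \tfrac{1}{2}\,\mathrm{OPT}
\;=\; \tfrac{3}{2}\,\mathrm{OPT}.
```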
Integrality Gap
IG = sup over instances I of OPT_IP(I) / OPT_LP(I) (for a minimization relaxation)
Explanation: Measures the worst-case loss when rounding an LP solution. A large gap limits the best achievable approximation via that relaxation.
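A standard illustration, assuming the usual LP relaxation of Vertex Cover (minimize the sum of x_v subject to x_u + x_v ≥ 1 for every edge and 0 ≤ x_v ≤ 1): on the complete graph K_n, setting every x_v = 1/2 is feasible, while any integral cover needs n − 1 vertices, so

```latex
\mathrm{OPT}_{\mathrm{LP}}(K_n) \;\le\; \frac{n}{2},
\qquad
\mathrm{OPT}_{\mathrm{IP}}(K_n) \;=\; n-1,
\qquad
\frac{\mathrm{OPT}_{\mathrm{IP}}(K_n)}{\mathrm{OPT}_{\mathrm{LP}}(K_n)}
\;\ge\; \frac{n-1}{n/2} \;=\; 2 - \frac{2}{n} \;\longrightarrow\; 2.
```

No rounding argument based on this relaxation alone can therefore prove a factor better than 2.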
L-Reduction
OPT_2(f(I)) ≤ a · OPT_1(I) and |OPT_1(I) − cost_1(g(S))| ≤ b · |OPT_2(f(I)) − cost_2(S)| for constants a, b > 0
Explanation: Relates objective values and errors between problems 1 and 2 via the instance mapping f and the solution mapping g. It preserves hardness-of-approximation factors up to constants.
Knapsack FPTAS Scaling
K = ε · v_max / n, v′_i = ⌊v_i / K⌋
Explanation: Scale item values by K so the DP over total scaled value has O(n²/ε) states. This yields a (1 − ε)-approximate solution in polynomial time.
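Why the scaling costs at most an ε fraction of the optimum: write S* for an optimal item set and S for the set the DP returns; the DP is exact on scaled values, and v_max ≤ OPT because the most valuable single item fits on its own:

```latex
v(S) \;\ge\; K \sum_{i \in S} v'_{i}
\;\ge\; K \sum_{i \in S^*} v'_{i}
\;\ge\; \sum_{i \in S^*} (v_i - K)
\;\ge\; \mathrm{OPT} - nK
\;=\; \mathrm{OPT} - \varepsilon\, v_{\max}
\;\ge\; (1 - \varepsilon)\,\mathrm{OPT}.
```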
PCP Theorem (Canonical Form)
NP = PCP(O(log n), O(1))
Explanation: Every problem in NP has proofs verifiable with logarithmic randomness and a constant number of queried proof bits. This yields strong inapproximability results.
Complexity Analysis
The edge-scanning Vertex Cover approximation below runs in O(n + m) time for n vertices and m edges. The greedy Set Cover implementation spends O(#sets) per iteration selecting the most cost-effective set, plus O(Σ|S|) total work updating coverage counts, for O(#iterations · #sets + Σ|S|) overall; a heap-based selection improves this further. The knapsack FPTAS dynamic program has O(n²/ε) states (the total scaled value), so its running time is O(n³/ε) and its memory is O(n²/ε). More generally, a PTAS is polynomial in n for each fixed ε but its exponent may grow as ε shrinks, whereas an FPTAS stays polynomial in both n and 1/ε.
Code Examples
```cpp
#include <bits/stdc++.h>
using namespace std;

// 2-approximation for Vertex Cover on an undirected graph.
// Idea: Scan edges; whenever we find an uncovered edge (u,v),
// add both u and v to the cover. The set of such edges is a matching,
// so |cover| <= 2 * |maximum matching| <= 2 * |optimal cover|.

vector<int> vertex_cover_2approx(int n, const vector<pair<int,int>>& edges) {
    vector<char> in_cover(n, false);
    for (auto [u, v] : edges) {
        if (!in_cover[u] && !in_cover[v]) {
            in_cover[u] = in_cover[v] = true;  // pick both endpoints
        }
    }
    vector<int> cover;
    for (int i = 0; i < n; ++i) if (in_cover[i]) cover.push_back(i);
    return cover;
}

int main() {
    // Example graph (0-based): a triangle 0-1-2-0 plus the edge 2-3
    int n = 4;
    vector<pair<int,int>> edges = {{0,1},{1,2},{2,0},{2,3}};

    auto cover = vertex_cover_2approx(n, edges);
    cout << "Approximate vertex cover (size=" << cover.size() << "):";
    for (int v : cover) cout << ' ' << v;
    cout << "\n";

    // Verify every edge is covered (for demonstration)
    bool ok = true;
    vector<char> in_cover(n, false);
    for (int v : cover) in_cover[v] = true;
    for (auto [u, v] : edges) if (!in_cover[u] && !in_cover[v]) ok = false;
    cout << "All edges covered? " << (ok ? "yes" : "no") << "\n";
    return 0;
}
```
The algorithm greedily builds a maximal matching implicitly by selecting both endpoints of any still-uncovered edge. Any vertex cover must include at least one endpoint from each matching edge, so the cover size is at most twice optimal.
```cpp
#include <bits/stdc++.h>
using namespace std;

// Greedy Set Cover: repeatedly pick the set with the best (newly covered elements)/cost ratio.
// For unit costs, this is equivalent to picking the set that covers the largest
// number of uncovered elements.
// Guarantee: O(log n)-approximation (n = universe size) via the harmonic number bound.

struct SetEntry {
    int id;             // must equal this set's index in the 'sets' vector
    double cost;        // can be 1.0 for unit cost
    vector<int> elems;  // elements covered by this set (0..n-1)
};

vector<int> greedy_set_cover(int n, const vector<SetEntry>& sets) {
    vector<char> covered(n, false);
    int covered_cnt = 0;
    vector<int> chosen;

    // To speed up counting newly covered elements, maintain a per-set count updated each round.
    vector<vector<int>> elem_in_sets(n);
    for (const auto& s : sets) for (int e : s.elems) elem_in_sets[e].push_back(s.id);

    vector<int> remaining_new(sets.size(), 0);
    for (size_t j = 0; j < sets.size(); ++j) {
        remaining_new[j] = (int)sets[j].elems.size();
    }

    // Greedy iterations until all covered or no progress
    while (covered_cnt < n) {
        int best = -1; double best_score = -1.0;  // score = newly_covered / cost
        for (size_t j = 0; j < sets.size(); ++j) {
            if (remaining_new[j] <= 0) continue;  // no benefit
            double score = remaining_new[j] / max(1e-12, sets[j].cost);
            if (score > best_score) { best_score = score; best = (int)j; }
        }
        if (best == -1) break;  // cannot cover the remaining elements

        // Choose set 'best'
        chosen.push_back(best);
        // Update covered elements and the remaining_new counts
        for (int e : sets[best].elems) {
            if (!covered[e]) {
                covered[e] = true;
                ++covered_cnt;
                // A newly covered element reduces remaining_new for all sets containing it
                for (int sId : elem_in_sets[e]) {
                    if (remaining_new[sId] > 0) remaining_new[sId]--;
                }
            }
        }
        // Mark this set as spent
        remaining_new[best] = 0;
    }

    // Optional: verify feasibility
    if (covered_cnt < n) {
        cerr << "Warning: Universe not fully covered (uncoverable elements?).\n";
    }
    return chosen;
}

int main() {
    // Universe U = {0,1,2,3,4}
    int n = 5;
    vector<SetEntry> sets;
    // Define sets with unit cost
    sets.push_back({0, 1.0, {0,1}});
    sets.push_back({1, 1.0, {1,2,3}});
    sets.push_back({2, 1.0, {3,4}});
    sets.push_back({3, 1.0, {0,2,4}});

    auto chosen = greedy_set_cover(n, sets);
    double total_cost = 0.0;
    cout << "Chosen sets:";
    for (int id : chosen) { cout << ' ' << id; total_cost += sets[id].cost; }
    cout << "\nTotal cost: " << total_cost << "\n";

    return 0;
}
```
The greedy algorithm iteratively picks the set with the best cost-effectiveness (new coverage per cost). The harmonic-number analysis shows the total cost is within O(log n) of the optimal set cover.
```cpp
#include <bits/stdc++.h>
using namespace std;

// FPTAS for 0/1 Knapsack using value scaling.
// Given n items (value v[i], weight w[i]), capacity W, and accuracy epsilon,
// scale values by K = (epsilon * v_max) / n, where v_max is the largest value
// of any item that fits by itself. The DP state is the total scaled value,
// which is O(n^2 / epsilon); DP[v'] = minimum weight needed to reach scaled
// value v'. Rounding loses at most K per item, so at most n*K = epsilon*v_max
// <= epsilon*OPT in total, and the result is at least (1 - epsilon) * OPT.

struct Item { long long value; long long weight; };

pair<long long,long long> knapsack_fptas(const vector<Item>& items, long long W, double epsilon) {
    if (epsilon <= 0) epsilon = 0.1;  // fallback if a non-positive epsilon is passed

    // Items heavier than W can never be packed; drop them up front.
    vector<Item> feas;
    for (const auto& it : items) if (it.weight <= W && it.value > 0) feas.push_back(it);
    if (feas.empty()) return {0LL, 0LL};

    int n = (int)feas.size();
    long long vmax = 0;
    for (const auto& it : feas) vmax = max(vmax, it.value);

    double K = (epsilon * (double)vmax) / n;
    if (K < 1e-9) K = 1e-9;  // guard against division by (near) zero

    vector<long long> sv(n);  // scaled values, each at most about n/epsilon
    long long sumSV = 0;
    for (int i = 0; i < n; ++i) {
        sv[i] = (long long)floor(feas[i].value / K);
        sumSV += sv[i];
    }

    const long long INF = (1LL << 60);
    vector<long long> dp(sumSV + 1, INF);  // dp[v'] = min weight achieving scaled value v'
    dp[0] = 0;

    // 1D DP over scaled value (iterate downwards so each item is used at most once)
    for (int i = 0; i < n; ++i) {
        for (long long v = sumSV - sv[i]; v >= 0; --v) {
            if (dp[v] == INF) continue;
            dp[v + sv[i]] = min(dp[v + sv[i]], dp[v] + feas[i].weight);
        }
    }

    // Find the best achievable scaled value within capacity W
    long long bestSV = 0;
    for (long long v = 0; v <= sumSV; ++v) if (dp[v] <= W) bestSV = v;

    long long approxValue = (long long)floor(bestSV * K);  // lower bound on the true packed value
    long long minWeightForBest = dp[bestSV];
    return {approxValue, minWeightForBest};
}

int main() {
    // Example: capacity W=10, items with (value, weight)
    vector<Item> items = {{10, 5}, {7, 3}, {12, 7}, {8, 4}};
    long long W = 10;
    double eps = 0.2;  // 20% accuracy parameter

    auto [approxV, usedW] = knapsack_fptas(items, W, eps);
    cout << fixed << setprecision(2);
    cout << "Approximate value >= (1-" << eps << ")*OPT: " << approxV
         << " (weight used: " << usedW << ")\n";
    return 0;
}
```
Values are scaled by K = ε·v_max/n, so the DP state is the total scaled value, which is O(n²/ε). The DP computes the minimum weight needed to realize each scaled value, and the best feasible state yields a (1−ε)-approximate solution when rescaled.