Numerical Integration & Monte Carlo
Key Points
- Numerical integration approximates the area under a curve when an exact antiderivative is unknown, using deterministic quadrature rules or random sampling (Monte Carlo).
- Composite trapezoid and Simpson’s rules sample a function at evenly spaced points and combine them with weights to approximate the integral with errors O(h²) and O(h⁴), respectively.
- Adaptive quadrature refines the step size where the function changes quickly, balancing accuracy and efficiency automatically.
- Monte Carlo integration estimates integrals by averaging random samples; its error shrinks like O(1/√N) and does not explode with dimension.
- Variance reduction techniques such as importance sampling and stratified sampling can dramatically improve Monte Carlo accuracy for the same number of samples.
- Deterministic quadrature excels for smooth low-dimensional integrals; Monte Carlo is often superior in high dimensions or with irregular domains.
- Always monitor error: use error bounds for quadrature and confidence intervals (via sample variance) for Monte Carlo.
- Function evaluation cost often dominates; choose methods that minimize evaluations when the integrand is expensive.
Prerequisites
- Definite integrals and the Fundamental Theorem of Calculus — Understanding what an integral represents and why antiderivatives are not always available motivates numerical methods.
- Derivatives and smoothness — Error bounds for quadrature rules depend on second and fourth derivatives.
- Big-O notation — To interpret convergence rates and computational complexity.
- Basic probability (expectation, variance) — Monte Carlo views integrals as expectations and relies on variance for error estimation.
- Random number generation — Implementing Monte Carlo requires i.i.d. pseudo-random samples and proper seeding.
- Floating-point arithmetic — Numerical stability, precision, and accumulation errors affect accuracy.
- Recursion and divide-and-conquer — Adaptive quadrature uses recursion to refine intervals.
- Change of variables (Jacobian) — Essential for mapping complex or infinite domains to simpler ones in both quadrature and Monte Carlo.
- Basic C++ standard library — Using std::function, <random>, and numeric utilities for implementations.
Detailed Explanation
01 Overview
Numerical integration is about estimating definite integrals when finding an exact antiderivative is impractical or impossible. Two major families are deterministic quadrature rules and stochastic (Monte Carlo) methods. Quadrature rules, like the trapezoidal and Simpson’s rules, approximate the integrand with simple polynomials over subintervals and sum the resulting areas. They are very effective in one or a few dimensions, especially for smooth functions, and offer predictable error behavior as the mesh is refined.

Monte Carlo integration approaches the same goal differently: it treats the integral as an expectation and approximates it by averaging random samples. Its hallmark is a dimension-agnostic convergence rate: the standard error decreases as 1/√N, regardless of how many dimensions you integrate over. This makes Monte Carlo particularly attractive in high-dimensional problems where grid-based quadrature suffers from the curse of dimensionality.

Adaptive methods further refine deterministic rules by allocating more samples where the function changes rapidly and fewer where it is smooth. On the stochastic side, variance reduction techniques like importance sampling and stratified sampling can dramatically reduce the number of samples needed for a desired accuracy. Choosing between these methods depends on the integrand’s smoothness, dimensionality, domain shape, and whether reliable error bars are needed.
02 Intuition & Analogies
Imagine you want to measure the area of a curved, uneven garden bed. One approach is to lay a row of planks across it, measuring the width at many spots and averaging—the trapezoidal rule. If you fit small parabolic arches between neighboring measurements, you can follow the curve more closely—this is Simpson’s rule. If you notice that one patch is very wiggly while others are flat, you’d take more measurements in the wiggly area and fewer elsewhere—adaptive quadrature. Monte Carlo is like estimating the garden’s area by tossing random pebbles over a bounding rectangle and counting how many land inside the bed. The fraction that land inside, multiplied by the rectangle’s area, estimates the garden’s area. The more pebbles you toss, the better your estimate, and the error shrinks with the square root of the number of pebbles. If the garden is oddly shaped or occupies only a small fraction of the rectangle, you can aim your tosses more cleverly—importance sampling—so more pebbles land where they matter. In one or two dimensions with smooth edges, careful tape measurements (deterministic quadrature) are fast and precise. But if your garden is actually a maze with many twists in ten dimensions (think probabilities over many variables), grid-based measurements explode in cost, while pebble tossing (Monte Carlo) keeps the same reliable convergence rhythm. The art is choosing the right tool and, when using random pebbles, throwing them smartly to reduce wasted effort.
03 Formal Definition
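A sketch of the standard formal definitions, stated in the notation used elsewhere in this section:

```latex
% Target quantity: a definite integral over an interval [a,b]
I \;=\; \int_a^b f(x)\,dx .

% A quadrature rule is a weighted sum of integrand values at chosen nodes x_i:
Q_n[f] \;=\; \sum_{i=0}^{n} w_i\, f(x_i) \;\approx\; I .

% Monte Carlo: with X_1,\dots,X_N \overset{\text{iid}}{\sim} \mathrm{Uniform}(D),
\hat{I}_N \;=\; \frac{|D|}{N} \sum_{k=1}^{N} f(X_k),
\qquad
\mathbb{E}\bigl[\hat{I}_N\bigr] \;=\; \int_D f(x)\,dx,
\qquad
\operatorname{sd}\bigl(\hat{I}_N\bigr) \;=\; \frac{|D|\,\sigma_f}{\sqrt{N}} ,
% where \sigma_f^2 is the variance of f(X) under the uniform distribution on D.
```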
04 When to Use
- Smooth, low-dimensional (1–3D) integrals over simple intervals: choose composite Simpson’s rule or Gaussian quadrature for high accuracy with few evaluations.
- Functions with localized sharp features or boundary layers: adaptive quadrature allocates effort where needed, often outperforming fixed-step methods.
- High-dimensional integrals (d ≥ 5), integrals over complex domains, or integrals defined as expectations in probability/statistics: Monte Carlo or quasi–Monte Carlo typically scale better than grid-based quadrature.
- When reliable error bars are required (e.g., in uncertainty quantification): Monte Carlo provides natural standard error estimates; deterministic rules need problem-specific error bounds or a refinement study.
- Expensive integrand evaluations (e.g., simulations): prefer methods requiring fewer evaluations for a given accuracy—often adaptive quadrature in 1D or variance-reduced Monte Carlo in higher dimensions.
- Infinite or semi-infinite domains: use change-of-variables, specialized quadrature rules, or importance sampling (e.g., exponential proposals for decaying tails).
⚠️ Common Mistakes
- Using Simpson’s rule with an odd number of subintervals (it requires an even count). Always check that n is even for composite Simpson.
- Ignoring smoothness assumptions: applying high-order rules to non-smooth or discontinuous functions can degrade accuracy; prefer adaptive or Monte Carlo in such cases.
- Forgetting Jacobians in change-of-variables: when mapping domains (e.g., infinite to finite intervals), always multiply by the absolute derivative of the transformation.
- Mismanaging random numbers: not seeding the RNG properly, using low-quality RNGs, or unintentionally correlating samples (e.g., reusing RNG state across threads without care) biases or inflates variance.
- Reporting Monte Carlo means without uncertainty: always compute sample variance and provide confidence intervals; otherwise, results can be misleading.
- Overlooking the curse of dimensionality in deterministic grids: cost explodes as m^d; switch to Monte Carlo or sparse grids for higher d.
- Numerical issues: using float instead of double for delicate integrals, summing in naive order (avoid catastrophic cancellation), or failing to guard adaptive recursion against infinite loops or stack overflow.
- Poor importance sampling choices: choosing g too small where f is large causes huge weights and variance; design g to mimic the shape of |f|.
Key Formulas
Definite integral
I = ∫_a^b f(x) dx
Explanation: The area under the curve f between a and b. This is what numerical integration seeks to approximate.
General quadrature
Q[f] = Σ_{i=0}^{n} w_i f(x_i)
Explanation: Any quadrature rule computes a weighted sum of function values at specific nodes. Choosing nodes and weights defines the method.
Composite trapezoidal rule
T_n = h [ f(x_0)/2 + f(x_1) + ⋯ + f(x_{n−1}) + f(x_n)/2 ],  h = (b−a)/n
Explanation: Approximates the integral by trapezoids over n subintervals of width h=(b−a)/n. Error decreases quadratically with h for smooth functions.
Composite Simpson’s rule
S_n = (h/3) [ f(x_0) + 4f(x_1) + 2f(x_2) + ⋯ + 4f(x_{n−1}) + f(x_n) ]
Explanation: Uses parabolic fits over pairs of subintervals (n even). Error decreases as O(h⁴) for sufficiently smooth f.
Trapezoid error (global)
E_T = −((b−a) h² / 12) f″(ξ)
Explanation: For some ξ in [a,b], the trapezoid error behaves like a constant times h² times the second derivative. Smaller h or smaller curvature reduces error.
Simpson error (global)
E_S = −((b−a) h⁴ / 180) f⁗(ξ)
Explanation: For some ξ in [a,b], Simpson’s error is proportional to h⁴ and the fourth derivative. It drops much faster than trapezoid for smooth f.
Integral as expectation
∫_D f(x) dx = |D| · E[f(X)],  X ~ Uniform(D)
Explanation: Rewrites the integral over D as the volume times the average of f over uniformly drawn points. This enables Monte Carlo estimation.
Monte Carlo estimator
Î_N = (|D|/N) Σ_{k=1}^{N} f(X_k)
Explanation: A simple unbiased estimator of the integral using N i.i.d. uniform samples on D. Its variance shrinks as 1/N.
Monte Carlo variance
Var(Î_N) = |D|² σ² / N,  so SE = |D| σ / √N
Explanation: Shows that the standard error scales as 1/√N. Doubling accuracy typically needs quadruple the samples.
CLT for Monte Carlo
(Î_N − I) / SE → N(0, 1) as N → ∞
Explanation: By the Central Limit Theorem, the estimator is approximately normal for large N, supporting confidence intervals.
Importance sampling identity
I = ∫ f(x) dx = E_g[ f(X)/g(X) ],  X ~ g
Explanation: Sampling from a density g that resembles f and reweighting by f/g can reduce variance dramatically if g is well chosen.
Change of variables
∫_D f(x) dx = ∫_{D′} f(T(u)) |det J_T(u)| du
Explanation: Transforming the domain requires multiplying by the absolute Jacobian determinant. This is essential for mapping infinite to finite intervals.
Grid growth (curse of dimensionality)
cost = m^d
Explanation: Using m points per dimension in d dimensions requires m^d function evaluations, which becomes impractical as d grows.
Gaussian quadrature exactness
∫_{−1}^{1} p(x) dx = Σ_{i=1}^{n} w_i p(x_i) for every polynomial p with deg p ≤ 2n−1
Explanation: An n-point Gauss–Legendre rule integrates any polynomial up to degree 2n−1 exactly, explaining its high accuracy on smooth functions.
Complexity Analysis
- Composite trapezoid and Simpson: n+1 function evaluations for n subintervals, with errors O(h²) and O(h⁴) for h = (b−a)/n; cost grows polynomially as the target accuracy tightens.
- Adaptive Simpson: two extra evaluations per subdivision; total cost depends on how much refinement the tolerance forces, concentrated where the integrand varies.
- Monte Carlo: N evaluations for a standard error of O(1/√N), independent of dimension; halving the error quadruples the cost.
- Tensor-product grids: m^d evaluations with m points per dimension — the curse of dimensionality.
Code Examples
```cpp
#include <bits/stdc++.h>
using namespace std;

// Composite trapezoidal rule on [a,b] with n subintervals (n >= 1)
double composite_trapezoid(const function<double(double)>& f, double a, double b, int n) {
    if (n <= 0) throw invalid_argument("n must be positive");
    double h = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; ++i) sum += f(a + i * h);
    return h * sum;
}

// Composite Simpson's rule on [a,b] with n subintervals (n must be even)
double composite_simpson(const function<double(double)>& f, double a, double b, int n) {
    if (n <= 0 || n % 2 != 0) throw invalid_argument("n must be positive and even for Simpson's rule");
    double h = (b - a) / n;
    double sum = f(a) + f(b);
    // Odd indices get weight 4; even interior indices get weight 2
    for (int i = 1; i < n; ++i) sum += ((i % 2 == 1) ? 4.0 : 2.0) * f(a + i * h);
    return (h / 3.0) * sum;
}

int main() {
    // Example: integrate sin(x) on [0, pi]; exact value is 2
    auto f = [](double x) { return sin(x); };
    double a = 0.0, b = acos(-1.0); // portable pi (M_PI is a common but nonstandard extension)

    cout.setf(ios::fixed); cout << setprecision(10);
    for (int n : {10, 20, 40, 80}) {
        double T = composite_trapezoid(f, a, b, n);
        double S = composite_simpson(f, a, b, n); // every n in the list is even
        cout << "n=" << n << ": Trapezoid=" << T << ", Simpson=" << S
             << ", Error_T=" << fabs(T - 2.0) << ", Error_S=" << fabs(S - 2.0) << "\n";
    }
    return 0;
}
```
This program implements composite trapezoidal and Simpson’s rules for a given function on an interval. It integrates sin(x) over [0, π], whose exact integral is 2, and prints the approximations and absolute errors for increasing n. Simpson’s rule converges much faster than trapezoid for this smooth function, as predicted by their error orders.
```cpp
#include <bits/stdc++.h>
using namespace std;

// Recursive adaptive Simpson with cached endpoint/midpoint values fa, fm, fb
static double adaptive_simpson_rec(const function<double(double)>& f, double a, double b,
                                   double eps, double S, double fa, double fm, double fb,
                                   int depth, int maxDepth, int& evals) {
    double c = 0.5 * (a + b);
    double left_m = 0.5 * (a + c);
    double right_m = 0.5 * (c + b);

    double f_left_m = f(left_m); ++evals;
    double f_right_m = f(right_m); ++evals;

    double Sleft = (c - a) * (fa + 4.0 * f_left_m + fm) / 6.0;
    double Sright = (b - c) * (fm + 4.0 * f_right_m + fb) / 6.0;
    double S2 = Sleft + Sright;

    // Standard Simpson error estimate: |S2 - S| / 15
    if (depth >= maxDepth || fabs(S2 - S) <= 15.0 * eps) {
        // Richardson extrapolation for an improved estimate
        return S2 + (S2 - S) / 15.0;
    }
    // Recurse on left and right halves with halved tolerance (balanced split)
    double left = adaptive_simpson_rec(f, a, c, eps * 0.5, Sleft, fa, f_left_m, fm, depth + 1, maxDepth, evals);
    double right = adaptive_simpson_rec(f, c, b, eps * 0.5, Sright, fm, f_right_m, fb, depth + 1, maxDepth, evals);
    return left + right;
}

// Public driver: returns the estimate and the number of function evaluations
pair<double,int> adaptive_simpson(const function<double(double)>& f, double a, double b, double eps, int maxDepth = 20) {
    int evals = 0;
    double c = 0.5 * (a + b);
    double fa = f(a); ++evals;
    double fb = f(b); ++evals;
    double fm = f(c); ++evals;
    double S = (b - a) * (fa + 4.0 * fm + fb) / 6.0;
    double res = adaptive_simpson_rec(f, a, b, eps, S, fa, fm, fb, 0, maxDepth, evals);
    return {res, evals};
}

int main() {
    // Integrate exp(-x^2) from 0 to 1; reference: 0.5*sqrt(pi)*erf(1)
    auto f = [](double x) { return exp(-x * x); };
    double a = 0.0, b = 1.0;
    double eps = 1e-10; // desired absolute error

    auto [I, evals] = adaptive_simpson(f, a, b, eps, 30);

    // Reference using the error function
    double ref = 0.5 * sqrt(acos(-1.0)) * erf(1.0);

    cout.setf(ios::fixed); cout << setprecision(12);
    cout << "Adaptive Simpson result = " << I << "\n";
    cout << "Reference               = " << ref << "\n";
    cout << "Abs error               = " << fabs(I - ref) << "\n";
    cout << "Function evaluations    = " << evals << "\n";
    return 0;
}
```
This implementation of adaptive Simpson’s rule refines intervals until the estimated local error is below a given tolerance or a maximum depth is reached. It caches function values to avoid recomputation and returns both the integral estimate and the number of function evaluations. The example integrates e^{−x^2} on [0,1] and compares with a reference using the error function.
```cpp
#include <bits/stdc++.h>
using namespace std;

// Note: the field is named `se`, not `stderr` -- `stderr` is a macro from
// <cstdio> and cannot be used as an identifier.
struct MCResult { double estimate; double se; double ci_low; double ci_high; };

// Simple Monte Carlo for I = int_0^1 f(x) dx
MCResult mc_unit_interval(const function<double(double)>& f, uint64_t N, mt19937_64& gen) {
    uniform_real_distribution<double> U(0.0, 1.0);
    long double sum = 0.0L, sum2 = 0.0L;
    for (uint64_t i = 0; i < N; ++i) {
        double y = f(U(gen));
        sum += y;
        sum2 += (long double)y * y;
    }
    long double mean = sum / (long double)N;
    long double var = (sum2 / (long double)N) - mean * mean; // population variance estimate
    if (var < 0) var = 0;                                    // guard against numerical noise
    double se = sqrt((double)var / (double)N);               // standard error of the mean
    double est = (double)mean;                               // volume([0,1]) = 1
    double z = 1.96;                                         // ~95% CI
    return {est, se, est - z * se, est + z * se};
}

// Importance sampling on [0, +inf) for I = int_0^inf f(x) dx with proposal g = Exp(1).
// The caller supplies the ratio f_over_g(x) = f(x)/g(x), where g(x) = e^{-x} for x >= 0.
MCResult mc_importance_exp(const function<double(double)>& f_over_g, uint64_t N, mt19937_64& gen) {
    uniform_real_distribution<double> U(0.0, 1.0);
    long double sum = 0.0L, sum2 = 0.0L;
    for (uint64_t i = 0; i < N; ++i) {
        double u = U(gen);
        double x = -log(1.0 - u); // inverse-CDF sampling for Exp(1); 1-u > 0 avoids log(0)
        double w = f_over_g(x);   // importance weight f(x)/g(x)
        sum += w;
        sum2 += (long double)w * w;
    }
    long double mean = sum / (long double)N;
    long double var = (sum2 / (long double)N) - mean * mean;
    if (var < 0) var = 0;
    double se = sqrt((double)var / (double)N);
    double est = (double)mean; // the estimator is an expectation under g, so no volume factor
    double z = 1.96;
    return {est, se, est - z * se, est + z * se};
}

int main() {
    random_device rd; mt19937_64 gen(rd());

    // Example 1: I = int_0^1 sin(pi x) dx = 2/pi
    auto f1 = [](double x) { return sin(acos(-1.0) * x); };
    auto R1 = mc_unit_interval(f1, 1'000'000, gen);
    cout.setf(ios::fixed); cout << setprecision(8);
    cout << "Integral of sin(pi x) on [0,1]:\n";
    cout << "MC estimate = " << R1.estimate << ", SE = " << R1.se
         << ", 95% CI = [" << R1.ci_low << ", " << R1.ci_high << "]\n";
    cout << "Exact = " << (2.0 / acos(-1.0)) << "\n\n";

    // Example 2: importance sampling for I = int_0^inf x^2 e^{-x} dx = 2
    // With g(x) = Exp(1), f(x)/g(x) = x^2 e^{-x} / e^{-x} = x^2 (low variance vs naive)
    auto f_over_g = [](double x) { return x * x; };
    auto R2 = mc_importance_exp(f_over_g, 500'000, gen);
    cout << "Integral of x^2 e^{-x} on [0,inf):\n";
    cout << "IS estimate = " << R2.estimate << ", SE = " << R2.se
         << ", 95% CI = [" << R2.ci_low << ", " << R2.ci_high << "]\n";
    cout << "Exact = 2.0\n";
    return 0;
}
```
The first routine estimates ∫_0^1 f(x)dx by averaging f(U) for U∼Uniform(0,1) and reports a 95% confidence interval using the sample variance. The example integrates sin(πx). The second routine performs importance sampling for I=∫_0^{∞} x^2 e^{−x} dx using g(x)=Exp(1). Because f(x)/g(x)=x^2, the estimator has much smaller variance than naive sampling over a truncated domain. Both use C++’s Mersenne Twister RNG and produce standard errors via the CLT.