Stratified & Latin Hypercube Sampling
Key Points
- Stratified sampling reduces Monte Carlo variance by dividing the domain into non-overlapping regions (strata) and sampling within each region.
- Latin Hypercube Sampling (LHS) ensures uniform coverage in every dimension by taking exactly one sample from each 1/n interval per dimension and permuting independently across dimensions.
- Both methods keep marginal distributions uniform while improving coverage, leading to more stable estimates for the same number of samples.
- In practice, stratified sampling needs correct weighting when strata have different sizes; otherwise estimates become biased.
- LHS is especially effective in moderate dimensions when the function varies smoothly or monotonically along each coordinate.
- Time complexity for generating n samples in d dimensions is O(n d) and space complexity is typically O(n d).
- Random jitter inside each stratum prevents grid artifacts and preserves unbiasedness.
- Use independent shuffles per dimension for LHS; reusing the same permutation can reintroduce unwanted correlations.
Prerequisites
- Uniform random number generation — Sampling within strata and jittering rely on drawing uniform random variables.
- Basic probability and expectation — Understanding unbiased estimators and variance reduction requires the concepts of expectation and variance.
- Monte Carlo integration — Stratification and LHS are alternative designs for Monte Carlo estimators.
- Permutations and shuffling (Fisher–Yates) — LHS construction needs independent random permutations per dimension.
- Asymptotic notation — To analyze time and space complexity and compare methods using O(·) notation.
- Numerical integration over intervals and boxes — Provides context for what is being estimated and why coverage matters.
Detailed Explanation
01Overview
Hook: Imagine surveying a city: if you only ask people downtown at noon, you miss commuters and night-shift workers. A better plan is to split the city into neighborhoods and sample each one. That way, your average reflects the whole city.

Concept: Stratified sampling formalizes this idea for Monte Carlo—split the domain into strata, then sample within each stratum and combine results with appropriate weights. Latin Hypercube Sampling (LHS) goes a step further for multi-dimensional problems by enforcing even coverage along every coordinate, so you don’t accidentally cluster samples in the same x- or y-interval.

Example: To estimate an integral over [0,1], we can divide it into n equal subintervals and pick one random point per subinterval. In 2D, stratified sampling might use an s×s grid; in d dimensions, LHS chooses exactly one point from each of the n subintervals along every axis, pairing them via independent permutations to keep marginals uniform but reduce clumping. The result is lower estimator variance for many functions compared to naive Monte Carlo with the same sample budget.
02Intuition & Analogies
Hook: Filling a chocolate box evenly is easier if you place one piece in each slot rather than tossing them randomly and hoping every slot gets filled. Stratified and Latin hypercube sampling do exactly that for the space of possibilities.

Concept: Random sampling can bunch points in some regions and leave gaps in others purely by chance. Stratification is like drawing a grid over the space and promising yourself you’ll visit every cell. That promise reduces randomness where it hurts (coverage) while keeping enough randomness (jitter) to avoid bias. LHS adds a clever twist for high dimensions: instead of a full grid (which explodes combinatorially), it ensures one sample per 1/n slice of each axis and then shuffles those slices independently per dimension. That way, each coordinate is well spread, but you still get n total samples, not n^d.

Example: Suppose you’re testing a video game across screen sizes and frame rates. Pure random tests might accidentally cluster around mid-range screens and 60 FPS. Stratified sampling would split screen sizes and frame rates into bins and test each bin. LHS would ensure that across n tests, every screen-size bin and every FPS bin appears exactly once, paired randomly, giving broad coverage without an exponential blowup of combinations.
03Formal Definition
Let D \subset \mathbb{R}^{d} be the domain (typically [0,1]^{d}) and f: D \to \mathbb{R} the integrand. Partition D into H disjoint strata D_{1}, \dots, D_{H} with weights W_{h} = \lambda(D_{h})/\lambda(D). Stratified sampling draws n_{h} points uniformly within each D_{h} and forms the weighted estimator \hat{\mu}_{\mathrm{strat}} = \sum_{h} W_{h} \bar{f}_{h}, where \bar{f}_{h} is the sample mean of f over stratum h. For LHS with n points in d dimensions, choose an independent uniform random permutation \pi_{j} of \{1, \dots, n\} for each dimension j and set X_{i}^{(j)} = (\pi_{j}(i) - 1 + U_{i}^{(j)})/n with U_{i}^{(j)} \sim \mathrm{Uniform}(0,1). Each coordinate then contains exactly one sample in each interval [(k-1)/n, k/n), while the marginal distribution of every coordinate remains uniform on [0,1].
04When to Use
Hook: When randomness alone leaves holes, force coverage.

Concept: Use stratified sampling whenever your function’s behavior differs across regions and you can define meaningful strata (e.g., time-of-day segments, spatial tiles, parameter bins). It excels in Monte Carlo integration and simulation where variance matters: rendering (anti-aliasing, soft shadows), rare-event estimation with region-aware allocation, and survey sampling with known subpopulations. LHS is ideal for moderate-dimensional (say d up to a few dozen) uncertainty quantification and design of experiments—calibration, sensitivity analysis, and surrogate modeling—because it spreads samples evenly in each coordinate without an exponential grid.

Example:
• Estimating \int_{[0,1]^{d}} f(x) dx when f changes faster near boundaries—use more/unequal strata there.
• Running n expensive simulations that depend on d uncertain inputs—use LHS to cover each input’s range well with just n runs.
• Generating blue-noise-like sample sets for 2D images—use a jittered grid (stratified 2D) for aesthetically uniform points.
⚠️Common Mistakes
Hook: Good coverage can be undone by small implementation slips.

Concept: Common pitfalls include:
(1) forgetting weights when strata have different sizes, which biases the estimator;
(2) using too few samples per stratum, making within-stratum variance estimates unstable;
(3) reusing the same permutation across dimensions in LHS, which creates artificial correlations;
(4) skipping random jitter and placing points on cell centers, which can bias estimates if f is aligned with the grid;
(5) not randomizing across repeated trials—LHS without re-randomization can look deceptively precise;
(6) mismatched boundaries where samples fall outside strata because of floating-point rounding;
(7) assuming LHS samples are independent—they are not, and this affects variance formulas.

Example fixes:
• Always compute W_{h} from geometric/measure proportions and multiply stratum means accordingly.
• For unequal strata or heterogeneous variability, use Neyman allocation (n_{h} \propto W_{h} \sigma_{h}) rather than equal n_{h}.
• In LHS, generate an independent shuffle for each dimension and a fresh set of jitters per run.
• Use half-open intervals to avoid overlap, and clamp numerically to [0,1].
Key Formulas
Population mean over a domain
\mu = \mathbb{E}[f(X)] = \int_{D} f(x)\, d\lambda(x)
Explanation: This defines the true mean value of f over domain D with respect to the uniform probability measure \lambda. Monte Carlo methods aim to estimate this quantity.
Stratified estimator
\hat{\mu}_{\mathrm{strat}} = \sum_{h=1}^{H} W_{h}\, \frac{1}{n_{h}} \sum_{i=1}^{n_{h}} f(X_{h,i})
Explanation: Average within each stratum and combine using stratum weights W_{h}. This estimator is unbiased if the X_{h,i} are sampled uniformly within each stratum D_{h}.
Variance of stratified estimator
\operatorname{Var}(\hat{\mu}_{\mathrm{strat}}) = \sum_{h=1}^{H} \frac{W_{h}^{2} \sigma_{h}^{2}}{n_{h}}
Explanation: The total variance is the weighted sum of within-stratum variances \sigma_{h}^{2} divided by their sample counts. It is typically smaller than naive Monte Carlo when strata isolate variability.
Neyman allocation
n_{h} = n\, \frac{W_{h} \sigma_{h}}{\sum_{k=1}^{H} W_{k} \sigma_{k}}
Explanation: Given total budget n, allocating samples proportional to W_{h} times the stratum standard deviation \sigma_{h} minimizes the estimator variance.
Baseline Monte Carlo mean and variance
\hat{\mu}_{\mathrm{MC}} = \frac{1}{n} \sum_{i=1}^{n} f(X_{i}), \qquad \operatorname{Var}(\hat{\mu}_{\mathrm{MC}}) = \frac{\sigma^{2}}{n}
Explanation: With i.i.d. uniform samples X_{i}, the sample mean is unbiased and its variance decreases as 1/n. Stratification aims to reduce the constant factor compared to this baseline.
LHS construction
X_{i}^{(j)} = \frac{\pi_{j}(i) - 1 + U_{i}^{(j)}}{n}, \qquad U_{i}^{(j)} \sim \mathrm{Uniform}(0,1)
Explanation: For each dimension j, pick an independent permutation \pi_{j} of \{1, \dots, n\} and jitter U_{i}^{(j)}. This guarantees exactly one sample in each 1/n interval of each coordinate.
Variance comparison (monotone case)
\operatorname{Var}(\hat{\mu}_{\mathrm{LHS}}) \le \operatorname{Var}(\hat{\mu}_{\mathrm{MC}}) \quad \text{when } f \text{ is monotone in each coordinate}
Explanation: For many monotone functions, LHS does not increase variance and often reduces it compared to i.i.d. Monte Carlo. The inequality formalizes this for a common class.
Complexity of LHS/stratified generation
T(n, d) = O(n d), \qquad S(n, d) = O(n d)
Explanation: Generating n points in d dimensions with stratification or LHS takes time proportional to the number of coordinates produced and linear space to store them.
Approximate confidence interval
\hat{\mu}_{\mathrm{strat}} \pm z_{1-\alpha/2} \sqrt{\widehat{\operatorname{Var}}(\hat{\mu}_{\mathrm{strat}})}
Explanation: Using an estimate of the stratified variance and the normal quantile z_{1-\alpha/2}, you can form an approximate (1-\alpha) confidence interval for the mean.
Weights for unequal strata
W_{h} = \frac{\lambda(D_{h})}{\lambda(D)}
Explanation: When strata have different sizes, compute weights from their measures to keep the estimator unbiased.
Complexity Analysis
Generating n samples in d dimensions costs O(n d) time for both methods: stratified (jittered-grid) sampling draws one uniform variate per coordinate, and LHS adds d independent Fisher–Yates shuffles of length n, which is also O(n d). Storing the full sample matrix takes O(n d) space. Stratified points can be streamed with O(1) extra space if they are consumed immediately, whereas a straightforward LHS implementation keeps the d permutations, i.e., O(n d) working space.
Code Examples
#include <bits/stdc++.h>
using namespace std;

// Function to integrate: f(x) = exp(-x^2)
static inline double f(double x) { return exp(-x * x); }

// Plain Monte Carlo estimator with n samples on [0,1]
double mc_integral_1d(size_t n, mt19937_64 &rng) {
    uniform_real_distribution<double> U(0.0, 1.0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double x = U(rng);
        sum += f(x);
    }
    return sum / static_cast<double>(n);
}

// Stratified estimator with n strata (one sample per stratum) on [0,1]
double stratified_integral_1d(size_t n, mt19937_64 &rng) {
    uniform_real_distribution<double> U(0.0, 1.0);
    double sum = 0.0;
    for (size_t k = 0; k < n; ++k) {
        double u = U(rng);                            // jitter in [0,1)
        double x = (k + u) / static_cast<double>(n);  // uniform in stratum [k/n, (k+1)/n)
        sum += f(x);
    }
    // All strata have equal length => each weight is 1/n
    return sum / static_cast<double>(n);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    const size_t n = 10000;         // number of samples/strata
    mt19937_64 rng(123456789ULL);   // fixed seed for reproducibility

    double mc = mc_integral_1d(n, rng);
    // Use a separate RNG stream for an independent run
    mt19937_64 rng2(987654321ULL);
    double strat = stratified_integral_1d(n, rng2);

    // The true integral is not elementary; compare approximations only
    cout.setf(std::ios::fixed); cout << setprecision(6);
    cout << "Plain MC estimate:   " << mc << "\n";
    cout << "Stratified estimate: " << strat << "\n";
    cout << "(Both estimate ∫_0^1 exp(-x^2) dx)\n";
    return 0;
}
We approximate ∫_0^1 e^{-x^2} dx. The plain Monte Carlo uses i.i.d. uniform x. The stratified version divides [0,1] into n equal subintervals and draws one jittered point from each, averaging f(x). Because coverage is enforced, stratified sampling typically shows lower variance than plain Monte Carlo for the same n.
#include <bits/stdc++.h>
using namespace std;

struct Pt { double x, y; };

// Generate an s x s jittered grid of points over [0,1]^2
vector<Pt> jittered_grid_2d(size_t s, mt19937_64 &rng) {
    uniform_real_distribution<double> U(0.0, 1.0);
    vector<Pt> pts; pts.reserve(s * s);
    for (size_t i = 0; i < s; ++i) {
        for (size_t j = 0; j < s; ++j) {
            // Cell [i/s, (i+1)/s) x [j/s, (j+1)/s)
            double ux = U(rng), uy = U(rng);
            double x = (i + ux) / static_cast<double>(s);
            double y = (j + uy) / static_cast<double>(s);
            pts.push_back({x, y});
        }
    }
    return pts;
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    size_t s = 16;  // 16x16 = 256 points
    mt19937_64 rng(42);
    auto pts = jittered_grid_2d(s, rng);

    cout.setf(std::ios::fixed); cout << setprecision(5);
    for (size_t k = 0; k < min<size_t>(pts.size(), 20); ++k) {
        cout << pts[k].x << ", " << pts[k].y << "\n";
    }
    cerr << "Generated " << pts.size() << " stratified points over [0,1]^2\n";
    return 0;
}
This generates an s×s stratified (jittered) grid in 2D. Each cell contributes exactly one random point, providing even spatial coverage suitable for rendering, sampling textures, or initializing k-means centers. You can stream the points without storing them if memory is tight.
#include <bits/stdc++.h>
using namespace std;

// Generate n d-dimensional LHS samples in [0,1]^d
vector<vector<double>> lhs(size_t n, size_t d, mt19937_64 &rng) {
    uniform_real_distribution<double> U(0.0, 1.0);
    vector<vector<double>> X(n, vector<double>(d, 0.0));

    // For each dimension, make a permutation and jitter within bins
    for (size_t j = 0; j < d; ++j) {
        vector<size_t> pi(n);
        iota(pi.begin(), pi.end(), 0);       // 0..n-1
        shuffle(pi.begin(), pi.end(), rng);  // independent permutation per dimension
        for (size_t i = 0; i < n; ++i) {
            double u = U(rng);               // jitter in [0,1)
            // Map to bin [k/n, (k+1)/n) with k = pi[i]
            X[i][j] = (static_cast<double>(pi[i]) + u) / static_cast<double>(n);
            // Clamp (defensive): keep the value inside [0,1)
            if (X[i][j] >= 1.0) X[i][j] = nextafter(1.0, 0.0);
        }
    }
    return X;
}

// Test function: f(x) = exp(-sum_j x_j). True mean over [0,1]^d is (1 - e^{-1})^d
static inline double f(const vector<double> &x) {
    double s = 0.0;
    for (double v : x) s += v;
    return exp(-s);
}

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    size_t n = 5000;  // number of samples
    size_t d = 5;     // dimensionality
    mt19937_64 rng(2024);

    auto X = lhs(n, d, rng);

    double sum = 0.0;
    for (const auto &x : X) sum += f(x);
    double estimate = sum / static_cast<double>(n);

    // True value for comparison: (1 - e^{-1})^d
    double truth = pow(1.0 - exp(-1.0), static_cast<int>(d));

    cout.setf(std::ios::fixed); cout << setprecision(8);
    cout << "LHS estimate: " << estimate << "\n";
    cout << "True value:   " << truth << "\n";
    cout << "Abs error:    " << fabs(estimate - truth) << "\n";
    return 0;
}
This implements Latin Hypercube Sampling by using an independent random permutation per dimension and jittering within each 1/n bin. We estimate E[f(X)] for f(x) = exp(-∑ x_j) over [0,1]^d, whose true value is (1 - e^{-1})^d. LHS provides uniform marginals and improved coverage versus plain i.i.d. sampling for many functions.