MathIntermediate

Linear Sieve

Key Points

  • The linear sieve builds all primes up to n in O(n) time by ensuring each composite is marked exactly once by its smallest prime factor (SPF).
  • It stores SPF for every number, enabling O(log n) or better integer factorization queries afterward.
  • By exploiting multiplicativity and prime-power rules, the same pass can compute arithmetic functions like φ (Euler's totient), μ (Möbius), d (divisor count), and σ (sum of divisors).
  • The core update rule splits into two cases: when the current prime divides i and when it does not, which preserves the linear-time guarantee.
  • Compared to the classical sieve of Eratosthenes, the linear sieve typically uses similar memory but is faster when many multiplicative functions are needed.
  • For competitive programming, it is ideal when you need many factorization or function queries after a single O(n) precomputation.
  • Care must be taken to avoid overflow in p * i, to break the inner loop correctly when p divides i, and to use 64-bit integers for
  • With SPF, you can factor any quickly and compute function values from its prime factorization efficiently.

Prerequisites

  • Prime numbers and divisibilityUnderstanding primes, composites, and divisibility is essential for why sieves work and how SPF is defined.
  • Fundamental Theorem of ArithmeticUnique prime factorization underlies multiplicative function formulas and SPF-based factorization.
  • Arithmetic functions (φ, μ, d, σ)Knowing definitions and prime-power formulas lets you apply the sieve’s update rules correctly.
  • Big-O notationTo analyze and compare O(n) linear sieve versus O(n log log n) classical sieve.
  • Basic C++ arrays/vectorsEfficient and correct implementation relies on contiguous arrays and safe indexing.

Detailed Explanation

Tap terms for definitions

01Overview

The linear sieve is an algorithm to generate all primes up to a given integer n in linear time O(n). Unlike the classical sieve of Eratosthenes, which may mark each composite multiple times, the linear sieve guarantees that every composite number is processed exactly once—specifically by its smallest prime factor (SPF). This key invariant makes the total work proportional to n. Beyond just listing primes, the linear sieve naturally yields additional structure: the smallest prime factor for each number. With the SPF array, you can factor any number ≤ n in near O(log n) time by repeatedly dividing by its SPF. Even more powerfully, because many number-theoretic functions are multiplicative and have simple formulas on prime powers, you can compute them alongside the sieve with no extra asymptotic cost. Common examples include Euler’s totient φ(n), the Möbius function μ(n), the divisor-count function d(n), and the sum-of-divisors function σ(n). Implementation-wise, the algorithm maintains a dynamic list of primes. For each integer i from 2 to n, it may be recognized as prime (if not marked yet), appended to the prime list, and then combined with primes p in increasing order to produce i·p. The combination rules split into two cases: p divides i (so p is the SPF of i·p), and p does not divide i (so SPF(i·p) = p). Properly handling these two cases ensures linear complexity and makes it straightforward to propagate multiplicative function values.

02Intuition & Analogies

Imagine a factory where every product (a positive integer) must receive a single official stamp indicating its smallest defect type (its smallest prime factor). There are many defect types (primes), and each inspector (prime) starts visiting products in increasing order. The rule is: inspector p only stamps product i·p the first time the product appears, and only when p is the smallest defect present. If a product i already has a smaller defect than p, then p is not allowed to stamp i·p, because that would be redundant. Also, if inspector p notices that p already divides i, then the smallest defect of i·p is still p, and no other larger inspector is allowed to stamp that same product later. This one-stamp-per-product policy keeps the total number of stamps linear in the number of products. Now layer on top another idea: Some quality scores of a product depend in simple ways on its defects. For example, the totient φ(n) counts how many settings are coprime to n; it behaves predictably when you multiply by a new defect p that either is new to the product or increases an existing p-power. The Möbius function μ(n) flips signs when you add a new square-free prime and becomes zero if a square appears. The number of divisors d(n) and the sum of divisors σ(n) both follow neat formulas from the exponents of each prime defect. Because each product’s next state (i·p) depends only on whether p is new or repeats, we can update all these scores on the fly using local rules—no need to recompute from scratch. This mirrors real workflows where one extra feature or defect changes a product’s rating in a predictable way. With these two intuitions—one-stamp-per-product and local update rules—you can see why the linear sieve is both fast and multifunctional. The algorithm touches each composite exactly once, and propagates function values with constant work per touch.

03Formal Definition

Let n be a positive integer. The linear sieve computes: 1) The list of primes . 2) The smallest prime factor array spf[2..n], where spf[x] is the least prime dividing x. It maintains a dynamic list of primes and iterates i from 2 to n. If spf[i] = 0, then i is prime; set spf[i] = i and append i to P. Then for each prime p ∈ P in increasing order, stop if or i· otherwise set spf[i·p] = p. This ensures that each composite m is generated exactly once as with . Simultaneously, for multiplicative functions f, we exploit two update cases: - If p ∤ i (coprime case), then f(i·p) = f(i)·f(p), where f(p) is the prime contribution (e.g., = p − 1, = −1, d(p) = 2, = 1 + p). - If p | i (prime-power extension), write with gcd(m, p) = 1. Then i·, so we apply the exact prime-power rule for f at exponent a + 1 (e.g., φ(m·) = φ(m·)·p; μ(m·) = 0; d multiplies the exponent factor from (a + 1) to (a + 2); σ extends the geometric sum by one more term). By restricting the inner loop to primes , we guarantee that p is the smallest prime dividing i·p and prevent later primes from generating the same composite. This yields an overall O(n) time and O(n) space algorithm.

04When to Use

Use the linear sieve when you need a fast precomputation of primes and number-theoretic function values for all integers up to a limit n, followed by many queries. Common competitive programming scenarios include:

  • Bulk factorization: You must factor many numbers ≤ n repeatedly. With spf, each factorization is quick.
  • Function tables: You need φ(n), μ(n), d(n), or σ(n) for all n ≤ N to answer queries or to build prefix sums or convolutions.
  • Multiplicative DP: Problems that sum multiplicative functions over ranges or use them in convolutions (e.g., Dirichlet convolutions) benefit from precomputing values with a linear sieve.
  • Constraints like n up to 10^7: A single O(n) pass with careful memory usage fits within typical time limits and memory budgets of Codeforces 1400–1800 difficulty problems. Avoid it when:
  • You only need a primality test for one large number (use Miller–Rabin instead).
  • n is so large that O(n) memory/time is infeasible (consider segmented sieves or on-demand factorization for smaller ranges).
  • You need primes in a huge upper range without storing all spf values (use segmented sieve without the spf table).

⚠️Common Mistakes

  1. Forgetting the break when p divides i: The linear sieve’s inner loop must break once p | i to keep each composite generated once. If you continue, you lose the linear-time property and may assign wrong spf values.
  2. Overflow in i·p: Always check i ≤ n / p or cast to 64-bit before multiplying to avoid overflow when computing i·p.
  3. Wrong spf initialization: Initialize spf with zeros and set spf[i] = i only when you discover a new prime i. Do not prefill spf[i] = i for all i.
  4. Incorrect multiplicative updates: For d and σ under the p | i case, you must update the exponent-based factor. For example, if v_p(i) = a, then d(i·p) replaces (a + 1) with (a + 2), not multiply blindly by 2.
  5. Using 32-bit for σ: σ(n) can exceed 32-bit range quickly. Use 64-bit (long long) to store σ and intermediate p-powers.
  6. Starting from 1 or including 0: The sieve starts from i = 2. Values for 0 and 1 are special; define them explicitly if needed (e.g., φ(1) = 1, μ(1) = 1, d(1) = 1, σ(1) = 1).
  7. Memory spikes: For very large n, arrays for spf, φ, μ, d, σ, cnt, p_pow may exceed memory limits. Only compute the arrays you need.
  8. Confusing Eratosthenes and linear sieve loops: The classical sieve marks multiples p·p, 2p, 3p, ... per prime. The linear sieve loops over i and combines with primes until p > spf[i], a different pattern that preserves linear complexity.

Key Formulas

Prime Factorization

Explanation: Every integer can be expressed uniquely as a product of prime powers. The exponents and primes drive all multiplicative function formulas.

Totient on Prime Powers

Explanation: For a prime power, φ removes multiples of p from the count. This is the building block for φ on general n via multiplicativity.

Totient Multiplicativity

Explanation: When a and b are coprime, φ distributes over the product. In the sieve, this applies when adding a new prime factor p to i with p ∤ i.

Mf6bius Function

Explanation: is zero if n has any squared prime factor; otherwise it alternates sign based on the number of distinct prime factors. The sieve updates μ by checking if p divides i.

Divisor Count

Explanation: Each exponent in the factorization of n contributes ( + 1) choices for the power in a divisor. Multiply these choices across primes.

Sum of Divisors

Explanation: For each prime power , the sum of its divisors is the geometric series 1 + p + ... + . Multiply across distinct primes.

Prime Number Theorem (rough form)

Explanation: The number of primes up to n is about n / log n. This helps anticipate the number of stored primes and iteration counts.

Linear Sieve Complexity

Explanation: Each composite is generated exactly once, so total time is linear in n, and arrays like spf and function tables take linear space.

Coprime Update Rules

Explanation: When adding a new prime p to i, multiplicativity applies directly. Each function gets multiplied by its prime contribution.

Prime-Power Extension Rules

Explanation: When increasing the exponent of an existing prime factor p from a to a+1, φ multiplies by p, μ becomes 0, d adjusts its exponent factor, and σ extends its geometric sum by .

Complexity Analysis

The linear sieve achieves O(n) time by enforcing that each composite m is generated exactly once as where p is the smallest prime factor of m. In the algorithm, for each i from 2 to n, we iterate over primes p in increasing order while and i·. This restriction ensures no composite is produced by two different pairs (i, p). Consequently, the total number of inner-loop iterations equals the count of composites up to n, plus the count of primes discovered, which is Θ(n). Space usage is O(n) because we store arrays of size n for spf and, if needed, function tables like d, and any auxiliary arrays such as the exponent counter cnt and prime-power . The number of primes is about n / log n, but they are kept in a dynamic list whose total size is also O(n). In practice, the constant factors are low: each composite is assigned once, and each prime is appended once. Factorization queries using spf take O(number of prime factors) steps, which is O(log n) in the worst case since each step divides by at least 2. When computing additional multiplicative functions in the same pass, the time remains O(n) because each update is O(1). Care must be taken to avoid overflow in i·p by checking and to use 64-bit integers for σ and intermediate products.

Code Examples

Core linear sieve: primes + SPF + fast factorization
1#include <bits/stdc++.h>
2using namespace std;
3
4struct LinearSieve {
5 int n;
6 vector<int> spf; // smallest prime factor
7 vector<int> primes; // list of primes
8
9 LinearSieve(int N) : n(N), spf(N + 1, 0) {
10 primes.reserve(N / 10); // heuristic to reduce reallocations
11 build();
12 }
13
14 void build() {
15 for (int i = 2; i <= n; ++i) {
16 if (spf[i] == 0) { // i is prime
17 spf[i] = i;
18 primes.push_back(i);
19 }
20 for (int p : primes) {
21 long long ip = 1LL * i * p;
22 if (ip > n) break;
23 spf[(int)ip] = p; // set SPF for composite i*p
24 if (p == spf[i]) { // p divides i -> must break to keep linearity
25 break;
26 }
27 }
28 }
29 }
30
31 // Factorize x (2 <= x <= n) using spf in O(number of prime factors)
32 vector<pair<int,int>> factorize(int x) const {
33 vector<pair<int,int>> fac;
34 while (x > 1) {
35 int p = spf[x];
36 int cnt = 0;
37 while (x % p == 0) {
38 x /= p;
39 ++cnt;
40 }
41 fac.push_back({p, cnt});
42 }
43 return fac;
44 }
45};
46
47int main() {
48 ios::sync_with_stdio(false);
49 cin.tie(nullptr);
50
51 int N = 100; // demo bound
52 LinearSieve sieve(N);
53
54 cout << "Primes up to " << N << ":\n";
55 for (size_t i = 0; i < sieve.primes.size(); ++i) {
56 cout << sieve.primes[i] << (i + 1 == sieve.primes.size() ? '\n' : ' ');
57 }
58
59 // Demo factorization of a few numbers
60 vector<int> nums = {60, 84, 97, 100};
61 for (int x : nums) {
62 auto fac = sieve.factorize(x);
63 cout << x << " = ";
64 for (size_t i = 0; i < fac.size(); ++i) {
65 auto [p, e] = fac[i];
66 cout << p << "^" << e << (i + 1 == fac.size() ? '\n' : ' ');
67 }
68 }
69 return 0;
70}
71

This program builds the linear sieve up to N, recording both the prime list and the smallest prime factor for each integer. The inner loop stops when the current prime equals spf[i], ensuring each composite is generated once. The factorize method repeatedly divides by spf[x] to extract prime powers efficiently.

Time: O(n) to build; O(log n) per factorization query on averageSpace: O(n) for spf and O(n / log n) for primes
Linear sieve computing φ (totient) and μ (Mf6bius) simultaneously
1#include <bits/stdc++.h>
2using namespace std;
3
4struct PhiMuSieve {
5 int n;
6 vector<int> spf, primes;
7 vector<int> phi; // Euler's totient
8 vector<int> mu; // Mobius function
9
10 PhiMuSieve(int N) : n(N), spf(N + 1, 0), phi(N + 1, 0), mu(N + 1, 0) {
11 build();
12 }
13
14 void build() {
15 phi[1] = 1; mu[1] = 1; // conventional base values
16 for (int i = 2; i <= n; ++i) {
17 if (spf[i] == 0) {
18 spf[i] = i;
19 primes.push_back(i);
20 phi[i] = i - 1; // phi(p) = p-1
21 mu[i] = -1; // mu(p) = -1
22 }
23 for (int p : primes) {
24 long long ip = 1LL * i * p;
25 if (ip > n) break;
26 spf[(int)ip] = p;
27 if (p == spf[i]) {
28 // p | i : extend prime power
29 phi[(int)ip] = phi[i] * p; // phi(i*p) = p * phi(i)
30 mu[(int)ip] = 0; // a squared factor appears
31 break; // must break for linearity
32 } else {
33 // p does not divide i : multiplicative combination
34 phi[(int)ip] = phi[i] * (p - 1);
35 mu[(int)ip] = -mu[i];
36 }
37 }
38 }
39 }
40};
41
42int main() {
43 ios::sync_with_stdio(false);
44 cin.tie(nullptr);
45
46 int N = 50;
47 PhiMuSieve sieve(N);
48
49 // Print some values
50 cout << "n phi mu\n";
51 for (int i = 1; i <= N; ++i) {
52 cout << i << " " << sieve.phi[i] << " " << sieve.mu[i] << '\n';
53 }
54 return 0;
55}
56

This builds φ and μ in the same O(n) loop as primes. Two cases are handled: when p divides i (prime-power extension) and when it does not (coprime multiplicativity). The break when p | i prevents duplicates and preserves linear complexity.

Time: O(n)Space: O(n) for spf, phi, and mu
Full package: φ, μ, d, σ with exponent tracking (cnt, p_pow)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct FullArithmeticSieve {
5 int n;
6 vector<int> primes, spf;
7 vector<int> phi, mu, d, cnt; // cnt[x] = exponent of spf[x] in x
8 vector<long long> sigma, p_pow; // p_pow[x] = (spf[x])^{cnt[x]}
9
10 FullArithmeticSieve(int N) : n(N), spf(N + 1, 0), phi(N + 1, 0), mu(N + 1, 0), d(N + 1, 0), cnt(N + 1, 0), sigma(N + 1, 0), p_pow(N + 1, 0) {
11 build();
12 }
13
14 void build() {
15 // Base case for 1 (optional depending on use)
16 phi[1] = 1; mu[1] = 1; d[1] = 1; sigma[1] = 1;
17 for (int i = 2; i <= n; ++i) {
18 if (spf[i] == 0) {
19 // i is prime: initialize all functions at a prime
20 spf[i] = i; primes.push_back(i);
21 phi[i] = i - 1; // phi(p)
22 mu[i] = -1; // mu(p)
23 d[i] = 2; // divisors: 1 and p
24 sigma[i] = 1 + (long long)i; // 1 + p
25 cnt[i] = 1; // exponent of spf(i) in i
26 p_pow[i] = i; // p^1
27 }
28 for (int p : primes) {
29 long long ip = 1LL * i * p;
30 if (ip > n) break;
31 int m = (int)ip;
32 spf[m] = p;
33 if (p == spf[i]) {
34 // Case: p divides i (extend prime power)
35 // Let i = base * p^a
36 int a = cnt[i];
37 int base = i / (int)p_pow[i];
38
39 // Update cnt and p_pow
40 cnt[m] = a + 1;
41 p_pow[m] = p_pow[i] * p; // p^{a+1}
42
43 // phi: multiply by p
44 phi[m] = phi[i] * p;
45
46 // mu: a square now divides, becomes 0
47 mu[m] = 0;
48
49 // d: replace factor (a+1) with (a+2)
50 d[m] = d[i] / (a + 1) * (a + 2);
51
52 // sigma: extend geometric sum by one more term
53 // sigma(i) = sigma(base) * (1 + p + ... + p^a)
54 // sigma(m) = sigma(base) * (1 + p + ... + p^{a+1})
55 // => sigma(m) = sigma(i) + p^{a+1} * sigma(base)
56 sigma[m] = sigma[i] + p_pow[m] * sigma[base];
57
58 break; // maintain linearity
59 } else {
60 // Case: p does not divide i (coprime product)
61 cnt[m] = 1; // new prime factor p^1
62 p_pow[m] = p; // p^1
63
64 phi[m] = phi[i] * (p - 1);
65 mu[m] = -mu[i];
66 d[m] = d[i] * 2; // exponent for new p goes from 0 to 1
67 sigma[m] = sigma[i] * (1 + (long long)p);
68 }
69 }
70 }
71 }
72};
73
74int main() {
75 ios::sync_with_stdio(false);
76 cin.tie(nullptr);
77
78 int N = 200;
79 FullArithmeticSieve sieve(N);
80
81 cout << "n phi mu d sigma\n";
82 for (int i = 1; i <= N; ++i) {
83 cout << i << ' ' << sieve.phi[i] << ' ' << sieve.mu[i] << ' ' << sieve.d[i] << ' ' << sieve.sigma[i] << '\n';
84 }
85 return 0;
86}
87

This sieve computes φ, μ, d, and σ in one pass. It tracks cnt[i] = v_{spf[i]}(i) and p_pow[i] = (spf[i])^{cnt[i]} to update the prime-power case efficiently. For σ, it uses sigma(i·p) = sigma(i) + p^{a+1} · sigma(base) when p | i, where base = i / p^a.

Time: O(n)Space: O(n) for all arrays (spf, phi, mu, d, sigma, cnt, p_pow)