MathAdvanced

Pollard's Rho Factorization

Key Points

  • Pollard's Rho is a randomized algorithm that finds a non-trivial factor of a composite integer by walking a pseudorandom sequence modulo n and extracting a factor with a gcd.
  • It uses the function f(x) = + c mod n and detects a cycle collision modulo a hidden prime factor to reveal that factor.
  • The algorithm’s expected running time is O() for semiprimes q with p ≈ q, or more precisely O() where p is the smallest prime factor.
  • Cycle detection is done efficiently with Floyd’s tortoise–hare or Brent’s method, and the gcd(, n) uncovers a divisor when a collision occurs modulo a factor.
  • In practice, combine Pollard’s Rho with Miller–Rabin primality testing to completely factor 64-bit integers quickly.
  • Correct modular arithmetic is crucial; use 128-bit intermediates for safe multiplication modulo n on 64-bit inputs.
  • Random restarts with different constants c and seeds avoid pathological cycles and improve reliability.
  • Pollard’s Rho is excellent for competitive programming tasks up to 10^18 but not suitable for cryptographic-sized integers.

Prerequisites

  • Modular arithmeticPollard’s Rho operates entirely modulo n; understanding residues, modular addition, and multiplication is essential.
  • Greatest common divisor (Euclid’s algorithm)The factor is extracted as gcd(|x − y|, n); fast gcd is a core subroutine.
  • Fast exponentiationRequired to implement Miller–Rabin primality testing efficiently.
  • Probabilistic primality testing (Miller–Rabin)Used to stop factoring when a prime is reached and to avoid running Rho on primes.
  • Randomization and pseudorandom generatorsRho relies on randomized constants and seeds to avoid bad cycles and ensure expected performance.
  • Cycle detection (Floyd/Brent)The heart of the algorithm detects collisions efficiently to trigger the gcd factor reveal.
  • Bit complexity and overflow handlingSafe modular multiplication for 64-bit inputs requires wider intermediates or special techniques.
  • Prime factorization basicsUnderstanding factor trees and recursion is necessary to build a complete factorizer.
  • Chinese Remainder TheoremExplains why collisions modulo a prime factor reveal that factor when taking gcd with n.

Detailed Explanation

Tap terms for definitions

01Overview

Hook: Imagine stirring marbles in two sealed boxes at different speeds. Even though you cannot see inside, when the same marble positions reappear in one box, something predictable has happened that you can detect from the outside. Concept: Pollard’s Rho factorization turns factoring into a hidden-cycle detection problem. It constructs a pseudorandom sequence x_{i+1} = f(x_i) mod n with a simple polynomial like f(x) = x^2 + c, then uses a cycle-finding trick to make two positions land at the same value modulo an unknown prime factor p of n. When that happens, the difference of the two sequence elements is divisible by p, and computing gcd(|x_i − x_j|, n) reveals a non-trivial factor. Example: Suppose n = 91 = 7 × 13 and f(x) = x^2 + 1. Generate x: 2, 5, 26, 39, 73, ... Now track two runners: x (one step) and y (two steps). Eventually, x ≡ y (mod 7) even if not equal mod 13. Then |x − y| is a multiple of 7, so gcd(|x − y|, 91) = 7 exposes a factor. Pollard’s Rho is fast in practice for 64-bit integers: expected time grows like the square root of the smallest prime factor, which is typically manageable. Paired with Miller–Rabin to test primality and recursive splitting, you get a complete factorization routine suitable for programming contests.

02Intuition & Analogies

Hook: Think of two people jogging around two tracks painted over each other but with different lap lengths you can’t see. If you photograph them occasionally, the moments when they appear at the same painted mark on one of the tracks hint at that track’s lap length. Concept: Working modulo n blends together the cycles modulo each prime factor (by the Chinese Remainder Theorem). The map f(x) = x^2 + c acts like stirring: it spreads values around in a way that looks random. If you look modulo a particular prime p | n, the sequence is just a random walk over finitely many residues, so by the birthday paradox it will collide quickly—after about sqrt(p) steps. Example: Let n = p q with unknown p and q. Run two pointers through the generated sequence: the tortoise x moves once each turn; the hare y moves twice. When viewed modulo p, both move through a small ring of size about p. Because the hare laps the tortoise, they will eventually meet modulo p (x ≡ y mod p), even if not equal modulo q. The difference d = |x − y| is then divisible by p but probably not by q, so gcd(d, n) = p. Why it works: Collisions are easier to get modulo smaller sets. A meeting modulo the smallest prime factor p is most likely to show up first. Taking gcd with n filters out the shared divisor. Random restarts: Sometimes the walk wanders into an unhelpful cycle that doesn’t expose a factor quickly. Changing the constant c or the starting seed shakes the system, like restarting the joggers at new positions, and usually yields a quick success.

03Formal Definition

Let be composite with an unknown non-trivial factor p, 1 < . Choose a polynomial f: , typically f(x) = + c n with c 0, -2. Define the sequence and ). Consider the projections modulo each prime factor: p and q. Since is finite, the sequence (x_) eventually becomes periodic and, under heuristic pseudorandomness, exhibits a collision after O() terms (birthday paradox). Using Floyd’s method, maintain x and y with updates x f(x), y f(f(y)). At each step, compute d = (, n). If 1 < , d is a non-trivial factor. If , restart with different parameters (e.g., new c or seed). If , continue. The expected running time is O() arithmetic operations, where p is the smallest prime factor of n. For full factorization, recursively apply this process to d and n/d, using a primality test such as Miller–Rabin to terminate on primes. Correctness relies on the fact that if x y \pmod p then p (x - y), hence p (x - y, n), while with high probability q (x - y), so the gcd recovers p rather than n.

04When to Use

Use Pollard’s Rho when factoring integers up to about 10^18 (64-bit) in competitive programming, cryptographic toy problems, or number theory utilities. It excels when the smallest prime factor p is not too large, since the expected time is O(p^{1/2}). Typical scenarios: factoring semiprimes near 10^{12}–10^{18}, removing small or moderate prime factors from large composites, or producing a complete prime factorization quickly. Always combine with a fast probabilistic primality test (Miller–Rabin) to avoid wasting time trying to split primes. It is also effective as a subroutine to strip factors before using heavier methods. Prefer Brent’s cycle detection variant if you want fewer gcd computations and often faster practical performance. Avoid Pollard’s Rho for cryptographic-size numbers (hundreds to thousands of bits): even O(p^{1/2}) becomes infeasible. In such cases, algorithms like Pollard’s p−1, ECM (Elliptic Curve Method), or the Quadratic Sieve/Number Field Sieve are more appropriate. In contest settings, implement a robust 64-bit version with deterministic Miller–Rabin bases, modular multiplication via 128-bit intermediates, and randomized restarts; this provides near-instant results for most inputs you will encounter.

⚠️Common Mistakes

• Overflow in modular multiplication: Computing (a · b) mod n with 64-bit a, b can overflow 64-bit before reduction. Fix by using unsigned __int128 or Montgomery multiplication. • Not handling trivial cases: Always check n < 2, even n (return 2), and primality before invoking Rho to avoid infinite loops on primes. • Poor choice of c or seed: Using c = 0 or c = −2 often leads to short, degenerate cycles; randomize c in [1, n−1] excluding those values and restart when needed. • Missing restart policy: If d = n (gcd equals n), the walk hit a trivial collision mod n; you must restart with new parameters. • Forgetting to divide out found factors: After finding a factor d, recursively factor both d and n/d, and accumulate prime powers correctly (including repeats). • Using signed modulo incorrectly: In C++, negative values with % can be surprising; always normalize additions to stay in [0, n−1]. • Too few Miller–Rabin bases: For 64-bit safety, use a known deterministic base set (e.g., {2,3,5,7,11,13,17}) to avoid false primes. • Not randomizing or reseeding the RNG: Deterministic seeds can repeat bad behavior; use a good source such as std::random_device combined with std::mt19937_64. • Unbounded recursion depth or stack issues: For adversarial inputs, tail recursion might be deep; consider iterative collections or ensure reasonable depth given 64-bit limits. • Ignoring time when smallest factor is large: If p is large (close to n), Rho slows; in practice, multiple restarts and switching to Brent’s method help, but cryptographic-size inputs remain out of scope.

Key Formulas

Pollard's Rho iteration

Explanation: Defines the pseudorandom walk modulo n. The constant c is chosen at random (excluding degenerate values) to avoid short trivial cycles.

Factor extraction

Explanation: If \pmod p for some prime divisor p of n, then p divides , so the gcd uncovers a non-trivial factor with high probability.

Collision time modulo p

Explanation: By the birthday paradox, a collision in a set of size p occurs after about the square root of p elements. This governs the expected time to reveal factor p.

Expected runtime for semiprimes

Explanation: When n is the product of two similarly sized primes, the smallest factor p is about , giving O() = O() expected time.

Miller–Rabin condition

Explanation: The strong probable-prime test for base a. If neither condition holds, n is definitely composite; otherwise n is a probable prime for base a.

Modular multiplication

Explanation: Used in every step of the iteration and in modular exponentiation. Care must be taken to avoid overflow in intermediate products.

CRT decomposition

Explanation: Arithmetic modulo n splits into independent components modulo p and modulo q. A collision in one component induces a gcd revealing that factor.

Complexity Analysis

Let n be composite and let p be its smallest prime factor. Heuristically, the sequence ) modulo p behaves like a random mapping on a set of size p. By the birthday paradox, the expected number of steps to encounter a collision modulo p is on the order of sqrt(p). Each step performs O(1) modular arithmetic operations: one or two modular multiplications and additions for f(x), and periodic gcd computations. With Floyd’s method, a gcd is done on every step; Brent’s variant amortizes gcds across blocks and reduces the constant factor. Consequently, the expected running time is O() modular operations. For semiprimes q with p ≈ q ≈ sqrt(n), this becomes O(). Space usage is O(1): we store only a handful of 64-bit integers (current x, y, constants, and temporary products). For full factorization, we combine Pollard’s Rho with Miller–Rabin. The Miller–Rabin test for 64-bit numbers with a fixed small base set runs in O(log n) modular multiplications per test, negligible compared to factor search. The recursive factorization performs multiple calls to Rho—one per distinct prime factor (counting multiplicities), so the total expected time sums across the smallest factors encountered. In adversarial cases where the smallest factor is large (e.g., close to n), the algorithm slows dramatically; still, for 64-bit inputs with randomized restarts and Brent’s method, practice shows sub-second performance on a wide range of composites. Overall complexities: time ≈ O( lo n) counting bit complexity of arithmetic, and space O(1) beyond recursion stack and factor storage.

Code Examples

Complete 64-bit factorization with Miller–Rabin + Pollard’s Rho (Floyd)
1#include <bits/stdc++.h>
2using namespace std;
3
4using u128 = unsigned __int128;
5using u64 = unsigned long long;
6
7// Multiply (a * b) % mod with 128-bit to avoid overflow for 64-bit mod
8static inline u64 mul_mod(u64 a, u64 b, u64 mod) {
9 return (u128)a * b % mod;
10}
11
12// Fast modular exponentiation: (a^e) % mod
13static inline u64 pow_mod(u64 a, u64 e, u64 mod) {
14 u64 res = 1 % mod;
15 u64 base = a % mod;
16 while (e) {
17 if (e & 1) res = mul_mod(res, base, mod);
18 base = mul_mod(base, base, mod);
19 e >>= 1;
20 }
21 return res;
22}
23
24// Deterministic Miller–Rabin for 64-bit integers
25static inline bool isPrime(u64 n) {
26 if (n < 2) return false;
27 static u64 smallPrimes[] = {2ULL,3ULL,5ULL,7ULL,11ULL,13ULL,17ULL,19ULL,23ULL,29ULL,31ULL,37ULL};
28 for (u64 p : smallPrimes) {
29 if (n % p == 0) return n == p;
30 }
31 u64 d = n - 1, s = 0;
32 while ((d & 1) == 0) { d >>= 1; ++s; }
33 auto check = [&](u64 a) {
34 if (a % n == 0) return true;
35 u64 x = pow_mod(a, d, n);
36 if (x == 1 || x == n - 1) return true;
37 for (u64 r = 1; r < s; ++r) {
38 x = mul_mod(x, x, n);
39 if (x == n - 1) return true;
40 }
41 return false;
42 };
43 // These bases are sufficient for 64-bit determinism
44 for (u64 a : smallPrimes) if (!check(a)) return false;
45 return true;
46}
47
48// One iteration function f(x) = x^2 + c mod n
49static inline u64 f_rho(u64 x, u64 c, u64 n) {
50 // (x*x + c) % n computed safely
51 return (mul_mod(x, x, n) + c) % n;
52}
53
54// Random engine for choosing c and seeds
55static mt19937_64 rng((u64)chrono::high_resolution_clock::now().time_since_epoch().count() ^ (u64)(uintptr_t)new int);
56
57// Pollard's Rho using Floyd's cycle detection; returns a non-trivial factor or n on failure
58u64 pollard_rho(u64 n) {
59 if ((n & 1ULL) == 0ULL) return 2ULL;
60 if (n % 3ULL == 0ULL) return 3ULL;
61 uniform_int_distribution<u64> dist_c(1ULL, n - 1);
62 uniform_int_distribution<u64> dist_x(0ULL, n - 1);
63 while (true) {
64 u64 c = dist_c(rng);
65 if (c == n - 2) continue; // avoid c == -2 mod n
66 u64 x = dist_x(rng);
67 u64 y = x;
68 u64 d = 1;
69 // Floyd: one step for x, two for y
70 while (d == 1) {
71 x = f_rho(x, c, n);
72 y = f_rho(f_rho(y, c, n), c, n);
73 u64 diff = x > y ? x - y : y - x;
74 d = std::gcd(diff, n);
75 }
76 if (d == n) {
77 // Failure; retry with new parameters
78 continue;
79 }
80 return d; // non-trivial factor found
81 }
82}
83
84void factor(u64 n, vector<u64>& fac) {
85 if (n == 1) return;
86 if (isPrime(n)) { fac.push_back(n); return; }
87 u64 d = pollard_rho(n);
88 factor(d, fac);
89 factor(n / d, fac);
90}
91
92int main() {
93 ios::sync_with_stdio(false);
94 cin.tie(nullptr);
95
96 // Demo: factor several numbers, including a 64-bit semiprime-like value
97 vector<unsigned long long> nums = {
98 91ULL, // 7 * 13
99 600851475143ULL, // classic example (71 * 839 * 1471 * 6857)
100 1000000000039ULL * 1000003ULL, // product of two primes ~1e12 and 1e6+3
101 18446744073709551557ULL // a 64-bit prime near 2^64
102 };
103
104 for (auto n : nums) {
105 cout << "n = " << n << "\n";
106 vector<u64> fac;
107 factor(n, fac);
108 sort(fac.begin(), fac.end());
109 cout << "prime factors:";
110 for (auto p : fac) cout << ' ' << p;
111 // Verify product
112 __int128 prod = 1;
113 for (auto p : fac) prod *= p;
114 cout << "\ncheck product == n: " << (prod == ( __int128) n ? "OK" : "FAIL") << "\n\n";
115 }
116 return 0;
117}
118

This program implements safe 64-bit modular arithmetic, a deterministic Miller–Rabin primality test for 64-bit integers, and Pollard’s Rho with Floyd’s cycle detection. The factor() function recursively splits n until all prime factors are found. Random restarts in pollard_rho choose different constants c and seeds to avoid degeneracy. Example inputs include small, medium, and prime cases; the product check confirms correctness.

Time: Expected O(p^{1/2}) per successful factor discovery, which is O(n^{1/4}) for semiprimes n = p q with p ≈ q; Miller–Rabin adds O(log n) per primality check.Space: O(1) auxiliary space (besides recursion and storage of output factors).
Pollard’s Rho using Brent’s cycle detection (batched gcd) with full factorization
1#include <bits/stdc++.h>
2using namespace std;
3using u64 = unsigned long long;
4using u128 = unsigned __int128;
5
6static inline u64 mul_mod(u64 a, u64 b, u64 m) { return (u128)a * b % m; }
7static inline u64 pow_mod(u64 a, u64 e, u64 m) {
8 u64 r = 1 % m, x = a % m;
9 while (e) { if (e & 1) r = mul_mod(r, x, m); x = mul_mod(x, x, m); e >>= 1; }
10 return r;
11}
12
13static inline bool isPrime(u64 n) {
14 if (n < 2) return false;
15 static u64 testPrimes[] = {2ULL,3ULL,5ULL,7ULL,11ULL,13ULL,17ULL,19ULL,23ULL,29ULL,31ULL,37ULL};
16 for (u64 p : testPrimes) { if (n % p == 0) return n == p; }
17 u64 d = n - 1, s = 0; while ((d & 1) == 0) { d >>= 1; ++s; }
18 auto check = [&](u64 a){ if (a % n == 0) return true; u64 x = pow_mod(a,d,n); if (x==1||x==n-1) return true; for (u64 r=1;r<s;++r){ x = mul_mod(x,x,n); if (x==n-1) return true; } return false; };
19 for (u64 a : testPrimes) if (!check(a)) return false;
20 return true;
21}
22
23static inline u64 f_rho(u64 x, u64 c, u64 n){ return (mul_mod(x,x,n) + c) % n; }
24
25static mt19937_64 rng((u64)chrono::steady_clock::now().time_since_epoch().count() ^ (u64)(uintptr_t)new int);
26
27// Brent's cycle detection variant
28u64 pollard_rho_brent(u64 n){
29 if ((n & 1ULL) == 0ULL) return 2ULL;
30 if (n % 3ULL == 0ULL) return 3ULL;
31 uniform_int_distribution<u64> dist(1ULL, n - 1);
32 while (true) {
33 u64 y = dist(rng);
34 u64 c = dist(rng);
35 if (c == n - 2) continue; // avoid c == -2
36 u64 m = 128; // batch size for gcd; tweak for speed
37 u64 g = 1, r = 1, q = 1;
38 u64 x, ys;
39 while (g == 1) {
40 x = y;
41 for (u64 i = 0; i < r; ++i) y = f_rho(y, c, n);
42 u64 k = 0;
43 while (k < r && g == 1) {
44 ys = y;
45 u64 lim = min(m, r - k);
46 for (u64 i = 0; i < lim; ++i) {
47 y = f_rho(y, c, n);
48 u64 diff = x > y ? x - y : y - x;
49 q = (u128)q * diff % n;
50 }
51 g = std::gcd(q, n);
52 k += lim;
53 }
54 r <<= 1;
55 }
56 if (g == n) {
57 // fallback: find single step gcds
58 do {
59 ys = f_rho(ys, c, n);
60 u64 diff = x > ys ? x - ys : ys - x;
61 g = std::gcd(diff, n);
62 } while (g == 1);
63 }
64 if (g != n) return g; // found factor
65 }
66}
67
68void factor(u64 n, vector<u64>& out){
69 if (n == 1) return;
70 if (isPrime(n)) { out.push_back(n); return; }
71 u64 d = pollard_rho_brent(n);
72 factor(d, out);
73 factor(n / d, out);
74}
75
76int main(){
77 ios::sync_with_stdio(false);
78 cin.tie(nullptr);
79
80 // Factor a random semiprime and verify
81 u64 a = 1000003ULL; // prime
82 u64 b = 1000000007ULL; // prime
83 u64 n = a * (u128)b; // 128-bit to avoid overflow during product
84
85 vector<u64> f; factor(n, f); sort(f.begin(), f.end());
86 cout << "n = " << n << "\n";
87 cout << "prime factors:"; for (auto p : f) cout << ' ' << p; cout << "\n";
88 __int128 prod = 1; for (auto p : f) prod *= p;
89 cout << "check product == n: " << (prod == (__int128)n ? "OK" : "FAIL") << "\n";
90 return 0;
91}
92

This version uses Brent’s cycle detection, which reduces the number of gcd computations by batching |x − y| products into q and taking gcd(q, n) periodically. It typically runs faster in practice. The program factors a known semiprime and verifies the result by multiplying the found primes.

Time: Expected O(p^{1/2}) modular operations per discovered factor (O(n^{1/4}) for balanced semiprimes), with smaller practical constants than Floyd’s method.Space: O(1) extra space besides recursion and output storage.
Single non-trivial factor finder with robust restarts and timeout guard
1#include <bits/stdc++.h>
2using namespace std;
3using u64 = unsigned long long; using u128 = unsigned __int128;
4
5static inline u64 mul_mod(u64 a, u64 b, u64 m){ return (u128)a*b % m; }
6static inline u64 f_rho(u64 x, u64 c, u64 n){ return (mul_mod(x,x,n) + c) % n; }
7
8u64 find_factor(u64 n, uint64_t max_iters = 1'000'000) {
9 if (n % 2ULL == 0ULL) return 2ULL;
10 if (n % 3ULL == 0ULL) return 3ULL;
11 mt19937_64 rng((u64)chrono::high_resolution_clock::now().time_since_epoch().count() ^ (u64)(uintptr_t)new int);
12 uniform_int_distribution<u64> dist(1ULL, n - 1);
13 for (int restart = 0; restart < 64; ++restart) { // multiple restarts
14 u64 c = dist(rng); if (c == n - 2) { --restart; continue; }
15 u64 x = dist(rng), y = x; u64 d = 1; uint64_t steps = 0;
16 while (d == 1 && steps < max_iters) {
17 x = f_rho(x, c, n);
18 y = f_rho(f_rho(y, c, n), c, n);
19 u64 diff = x > y ? x - y : y - x;
20 d = std::gcd(diff, n);
21 ++steps;
22 }
23 if (1 < d && d < n) return d; // success
24 // else retry with new random (c, seed)
25 }
26 return n; // signal failure
27}
28
29int main(){
30 ios::sync_with_stdio(false);
31 cin.tie(nullptr);
32
33 unsigned long long n; cin >> n; // input a composite number
34 u64 d = find_factor(n);
35 if (d == n) cout << "Failed to find a factor (try again).\n";
36 else cout << "Found factor: " << d << " and cofactor: " << n / d << "\n";
37 return 0;
38}
39

This standalone tool finds a single non-trivial factor of an input n using Pollard’s Rho with Floyd’s method, multiple randomized restarts, and a simple iteration cap to prevent hanging. It’s useful as a building block or when only one factor is needed quickly.

Time: Expected O(p^{1/2}) steps per successful run; multiple restarts add a small constant factor.Space: O(1) additional space.