βš™οΈAlgorithmAdvanced

Polynomial Operations

Key Points

  • β€’
    Fast polynomial operations treat coefficients like numbers but use FFT/NTT to multiply in O(n n) time instead of O().
  • β€’
    Division and modulo of polynomials can be done in O(n n) using Newton's method and reversed polynomials.
  • β€’
    Formal power series let you compute polynomial inverse, logarithm, exponential, and square root with Newton iterations that double precision each step.
  • β€’
    The key identities are F = F'/F, , and Newton updates like \big(2 - F g_k\big) for inversion.
  • β€’
    Multipoint evaluation computes values at many using a product tree and repeated remainders in about O(n n).
  • β€’
    Interpolation reconstructs the unique polynomial from (, ) pairs using the same tree and derivative trick.
  • β€’
    Numerical stability is avoided by working modulo a prime like 998244353 with NTT; pick a modulus with a suitable primitive root.
  • β€’
    Common pitfalls include forgetting to normalize lengths, trailing zeros, wrong base cases, or violating preconditions like F(0) = 1 for log.

Prerequisites

  • β†’FFT/NTT basics β€” Fast multiplication underlies all advanced polynomial operations and must be understood first.
  • β†’Modular arithmetic and inverses β€” All computations are done modulo a prime; modular inverses are required for division, integration, and Newton iterations.
  • β†’Formal derivatives and integrals β€” Logarithm and exponential of series rely on derivative/integral identities.
  • β†’Newton's method β€” Power series inversion, exp, log, and sqrt use Newton iterations that double precision.
  • β†’Divide and Conquer β€” Multipoint evaluation and interpolation are implemented over a product tree.

Detailed Explanation

Tap terms for definitions

01Overview

Polynomial operations generalize arithmetic on numbers to arithmetic on coefficient sequences. Instead of adding or multiplying single values, we add or multiply entire polynomials, which correspond to sequences of coefficients. For large degrees, naive multiplication takes O(n^2), which is too slow for many algorithmic problems. Fast Fourier Transform (FFT) or its modular cousin the Number Theoretic Transform (NTT) accelerates multiplication to O(n \log n). Building on fast multiplication, we can perform advanced operations like division, modulo, inversion (1/F), logarithm, exponential, and square root of formal power series efficiently using Newton's method. These methods iteratively double the number of correct coefficients, reaching O(n \log n) total time. In competitive programming and combinatorics, such operations are crucial for manipulating generating functions, solving recurrences, and building polynomials from point-value representations. Multipoint evaluation computes values of a polynomial at many points faster than evaluating one by one, while interpolation rebuilds a polynomial from its values at points. Together, these tools form a powerful toolkit for CF 2200–2600 problems involving convolutions, combinatorics, and power series.

02Intuition & Analogies

Think of a polynomial as a recipe list: the k-th coefficient tells you how much of the ingredient x^k to include. Adding polynomials just adds recipes. Multiplication is like combining two recipes so that every pair of ingredients x^i and x^j combine to contribute to x^{i+j}; that’s a convolution. Doing this pair-by-pair naively is slow (like mixing every spice with every other). FFT/NTT changes perspective: it converts the recipe into a flavor profile at several tasting points where mixing is easy (component-wise multiplication), then transforms back. This is like switching to a domain where combining is effortless. Newton’s method for power series is similar to tuning a recipe: you start with a rough approximation (say, the constant term) and repeatedly correct it so that the result matches more coefficients. Each correction doubles how many coefficients are right, which means you reach your goal in logarithmically many steps. Division uses the same trick but in reverse: to divide A by B, you approximate 1/B and multiply by A. Logs and exps of power series come from calculus identities: log F is the integral of F'/F, and exp solves G'/G = F'. Because we only care about the first n coefficients, all operations are done modulo x^n, keeping vectors short. Multipoint evaluation and interpolation leverage a binary tree of polynomials (x - x_i). Remainders down the tree give you values at leaves; the reverse process rebuilds the polynomial from weighted sums of sibling products, akin to divide-and-conquer blending of local fits into a global curve.

03Formal Definition

Let F(x) = be a formal power series over a field (commonly a finite field ). Polynomial addition and multiplication are defined coefficient-wise and by convolution respectively. Fast multiplication uses FFT/NTT to compute the Cauchy product in O(n n) for degree-bounded series modulo . The formal inverse exists if and only if 0 and satisfies F G 1 . Newton iteration for inversion sets \big(2 - F G_k\big) starting from , doubling precision. The formal logarithm is defined when by F = (F'/F) \; dx, truncated modulo . The formal exponential is defined for series with constant term 0 as H being the unique G with G(0)=1 and a Newton update is \big(1 - + H\big) . The formal square root exists if is a quadratic residue; Newton update = \big( + F/g_k\big) . Polynomial division with R < B can be computed via reversed polynomials and inversion modulo . Multipoint evaluation of P at points ,, uses a product tree (x) = (x - ) and computes remainders P down the tree to obtain P(). Interpolation constructs the unique Q of through (,) by computing the derivative of (x - ), evaluating it at , and combining weights via the same tree.

04When to Use

Use fast polynomial operations when degrees reach 2e5 or more, where naive O(n^2) is infeasible. In combinatorics, employ convolution to count ways to combine choices, compute partitions, or perform subset sums over degrees. Use division/modulo to implement polynomial long division and to compute remainders needed in Chinese remainder-like constructions on polynomials. Choose inversion, logarithm, exponential, and square root of series when manipulating generating functions: solving linear recurrences via generating functions often needs 1/F, while counting labeled structures uses exp/log of series (e.g., exponential generating functions). Newton-based sqrt is used in quadratic equation solutions over series. Multipoint evaluation is ideal when you need values at many points (e.g., building value tables, barycentric Lagrange-like usage) more efficiently than evaluating one-by-one. Interpolation is used to reconstruct polynomials from sample values, such as recovering coefficients of an unknown polynomial, or converting from point-value form to coefficient form after a divide-and-conquer algorithm. In CF-style problems, these techniques appear in tasks involving convolution chains, divide-and-conquer DP optimization with convolutions, solving functional equations in power series, and offline evaluation/interpolation routines.

⚠️Common Mistakes

  • Ignoring preconditions: inversion requires f_0 \neq 0; log requires f_0 = 1; exp requires constant term 0; sqrt requires the constant term to be a quadratic residue. Violating these leads to division by zero or undefined operations.
  • Forgetting to trim trailing zeros: many algorithms assume correct degree; stale high-degree zeros can bloat NTT sizes and break degree bounds in division/modulo.
  • Wrong truncation: Newton iterations must always reduce modulo x^m (or x^{2m} when doubling) to keep vectors bounded; forgetting truncation doubles lengths and time.
  • NTT length mis-sizing: convolution requires a power-of-two length at least deg(A)+deg(B)+1; too short leads to circular convolution errors.
  • Mixing moduli: NTT code is modulus-specific; using a different prime without updating primitive roots, powers, or inverses causes wrong answers.
  • Precision mismatch in multipoint evaluation/interpolation: product-tree levels must use exact polynomials; using floating FFT with rounding can mis-evaluate; prefer NTT or CRT if you need integers.
  • Off-by-one in derivative/integral: derivative reduces length by 1; integral increases by 1 with inv(i) factors; forgetting inverse factorial precomputation for many integrals is common.
  • Performance pitfalls: naive polynomial modulo (long division) is O(n^2); always switch to reversal + inverse for O(n \log n). Likewise, avoid rebuilding NTT twiddle factors inside inner loops.

Key Formulas

Cauchy Product

Explanation: The k-th coefficient of the product A(x)B(x) equals the sum over all pairs of indices that add to k. This is the definition of polynomial convolution.

FFT/NTT Multiplication Time

Explanation: Fast multiplication reduces from quadratic to near-linear time in the number of coefficients by transforming, multiplying point-wise, and inverse transforming.

Newton Inversion Update

Explanation: Starting from the inverse at low precision, this iteration doubles the number of correct coefficients each time. It solves F G = 1 modulo higher powers of x.

Formal Logarithm

Explanation: The logarithm of a series with constant term 1 is obtained by dividing its derivative by itself and integrating. All operations are truncated to the needed degree.

Formal Exponential via Newton

Explanation: To compute G = (H), update G so that G matches H. Each step doubles precision and relies on fast multiply and log.

Series Square Root Newton Step

Explanation: Analogous to scalar Newton iteration for square roots, this update converges quadratically to a series g with , assuming the constant term has a square root.

Quotient via Reversal

Explanation: For A of degree n-1 and B of degree m-1, compute the high-to-low quotient by reversing, inverting, multiplying, then reversing back. The remainder is A - BQ.

Interpolation Denominator

Explanation: If P(x)= (x - ), then evaluating its derivative at gives the denominator used to weight in fast interpolation.

Multipoint Evaluation Complexity

Explanation: Using a product tree of polynomials and remainders, evaluation at m points costs quasi-linear time with an extra log factor for the tree depth.

Complexity Analysis

Fast polynomial multiplication with NTT runs in O(n n) time and O(n) extra space once twiddle tables are prepared. Newton-based operations inherit this cost: inversion, logarithm, exponential, and square root all perform O( n) iterations where the i-th iteration multiplies/cons at size roughly 2^i, summing to O(n n). Division and modulo reduce to one inversion at size about n-m+1 plus a handful of multiplications, hence O(n n) for polynomials with comparable size. The constants differ: exp and sqrt typically involve multiple multiplies per iteration and calls to log or inverse, often 3–6 convolutions per doubling step. Multipoint evaluation and interpolation use a product tree with O( m) levels. Building the tree costs O(m m) via repeated multiplications. Evaluation proceeds by computing remainders modulo each node’s polynomial, which at each level touches all coefficients and costs O(n n) per level, yielding O((n + m) m) in typical NTT-based implementations. Interpolation adds one derivative, one multipoint evaluation of that derivative, and a reconstruction pass up the tree combining child results with sibling polynomials, again O(m m). Space usage is O(n) to O(n n) depending on whether you store the entire tree and intermediate remainders; iterative implementations can reduce peak memory by discarding levels when not needed. In practice, care with trimming and reuse of buffers prevents superlinear memory blow-ups.

Code Examples

NTT-based convolution modulo 998244353
1#include <bits/stdc++.h>
2using namespace std;
3
4// NTT for modulus 998244353 (primitive root 3)
5struct NTT {
6 static const uint32_t MOD = 998244353;
7 static const uint32_t G = 3;
8 uint32_t addmod(uint32_t a, uint32_t b){ uint32_t s=a+b; return s>=MOD? s-MOD:s; }
9 uint32_t submod(uint32_t a, uint32_t b){ return a>=b? a-b: a+MOD-b; }
10 uint32_t mulmod(uint64_t a, uint64_t b){ return uint32_t((a*b)%MOD); }
11 uint32_t modpow(uint32_t a, uint64_t e){ uint64_t r=1, x=a; while(e){ if(e&1) r=(r*x)%MOD; x=(x*x)%MOD; e>>=1;} return (uint32_t)r; }
12
13 void ntt(vector<uint32_t>& a, bool invert){
14 int n = (int)a.size();
15 static vector<int> rev; static vector<uint32_t> roots{0,1};
16 if((int)rev.size()!=n){
17 int k = __builtin_ctz(n);
18 rev.assign(n,0);
19 for(int i=0;i<n;i++) rev[i]=(rev[i>>1]>>1)|((i&1)<<(k-1));
20 }
21 if((int)roots.size()<n){
22 int k = __builtin_ctz(roots.size());
23 roots.resize(n);
24 while((1<<k)<n){
25 uint32_t z = modpow(G, (MOD-1)>>(k+1));
26 for(int i=1<<(k-1); i<(1<<k); ++i){
27 roots[2*i]=roots[i];
28 roots[2*i+1]=mulmod(roots[i], z);
29 }
30 ++k;
31 }
32 }
33 for(int i=0;i<n;i++) if(i<rev[i]) swap(a[i],a[rev[i]]);
34 for(int len=1; len<n; len<<=1){
35 for(int i=0;i<n;i+=2*len){
36 for(int j=0;j<len;j++){
37 uint32_t u=a[i+j];
38 uint32_t v=mulmod(a[i+j+len], roots[len+j]);
39 a[i+j]=addmod(u,v);
40 a[i+j+len]=submod(u,v);
41 }
42 }
43 }
44 if(invert){
45 reverse(a.begin()+1,a.end());
46 uint32_t inv_n = modpow(n, MOD-2);
47 for(auto &x:a) x = mulmod(x, inv_n);
48 }
49 }
50
51 vector<uint32_t> convolution(const vector<uint32_t>& a, const vector<uint32_t>& b){
52 if(a.empty()||b.empty()) return {};
53 size_t need = a.size()+b.size()-1; int n=1; while(n<(int)need) n<<=1;
54 vector<uint32_t> fa(a.begin(), a.end()), fb(b.begin(), b.end());
55 fa.resize(n); fb.resize(n);
56 ntt(fa,false); ntt(fb,false);
57 for(int i=0;i<n;i++) fa[i]=mulmod(fa[i],fb[i]);
58 ntt(fa,true); fa.resize(need);
59 return fa;
60 }
61} ntt;
62
63int main(){
64 ios::sync_with_stdio(false); cin.tie(nullptr);
65 // Example: (1 + 2x + 3x^2) * (3 + 4x)
66 vector<uint32_t> A={1,2,3};
67 vector<uint32_t> B={3,4};
68 auto C = ntt.convolution(A,B);
69 // Print coefficients
70 for(size_t i=0;i<C.size();++i){
71 if(i) cout << ' ';
72 cout << C[i];
73 }
74 cout << "\n"; // Expected: [3, 10, 17, 12] modulo 998244353 same as integers
75 return 0;
76}
77

This code implements the Number Theoretic Transform for modulus 998244353 and uses it to multiply two polynomials. It precomputes bit-reversal and roots of unity tables, performs forward transforms, multiplies point-wise, and applies the inverse transform. The output coefficients represent the product A(x)B(x) modulo the prime.

Time: O(n log n)Space: O(n)
Polynomial division and modulo via Newton inversion
1#include <bits/stdc++.h>
2using namespace std;
3
4struct NTT { // same as in Example 1 (shortened here by declaration)
5 static const uint32_t MOD = 998244353; static const uint32_t G = 3;
6 uint32_t addmod(uint32_t a,uint32_t b){ uint32_t s=a+b; return s>=MOD? s-MOD:s; }
7 uint32_t submod(uint32_t a,uint32_t b){ return a>=b? a-b: a+MOD-b; }
8 uint32_t mulmod(uint64_t a,uint64_t b){ return uint32_t((a*b)%MOD); }
9 uint32_t modpow(uint32_t a,uint64_t e){ uint64_t r=1,x=a; while(e){ if(e&1) r=(r*x)%MOD; x=(x*x)%MOD; e>>=1;} return (uint32_t)r; }
10 void ntt(vector<uint32_t>& a,bool inv){ int n=a.size(); static vector<int> rev; static vector<uint32_t> roots{0,1};
11 if((int)rev.size()!=n){ int k=__builtin_ctz(n); rev.assign(n,0); for(int i=0;i<n;i++) rev[i]=(rev[i>>1]>>1)|((i&1)<<(k-1)); }
12 if((int)roots.size()<n){ int k=__builtin_ctz(roots.size()); roots.resize(n); while((1<<k)<n){ uint32_t z=modpow(G,(MOD-1)>>(k+1)); for(int i=1<<(k-1); i<(1<<k); ++i){ roots[2*i]=roots[i]; roots[2*i+1]=mulmod(roots[i],z);} ++k; } }
13 for(int i=0;i<n;i++) if(i<rev[i]) swap(a[i],a[rev[i]]);
14 for(int len=1; len<n; len<<=1){ for(int i=0;i<n;i+=2*len){ for(int j=0;j<len;j++){ uint32_t u=a[i+j]; uint32_t v=mulmod(a[i+j+len], roots[len+j]); a[i+j]=addmod(u,v); a[i+j+len]=submod(u,v); } } }
15 if(inv){ reverse(a.begin()+1,a.end()); uint32_t inv_n=modpow(n,MOD-2); for(auto &x:a) x=mulmod(x,inv_n); }
16 }
17 vector<uint32_t> convolution(const vector<uint32_t>& a,const vector<uint32_t>& b){ if(a.empty()||b.empty()) return {}; size_t need=a.size()+b.size()-1; int n=1; while(n<(int)need) n<<=1; vector<uint32_t> fa(a.begin(),a.end()), fb(b.begin(),b.end()); fa.resize(n); fb.resize(n); ntt(fa,false); ntt(fb,false); for(int i=0;i<n;i++) fa[i]=mulmod(fa[i],fb[i]); ntt(fa,true); fa.resize(need); return fa; }
18} ntt;
19
20using Poly = vector<uint32_t>;
21
22void trim(Poly &a){ while(!a.empty() && a.back()==0) a.pop_back(); }
23
24Poly add(const Poly &a,const Poly &b){ Poly c(max(a.size(),b.size())); for(size_t i=0;i<c.size();++i){ uint32_t av= i<a.size()?a[i]:0; uint32_t bv= i<b.size()?b[i]:0; c[i]=ntt.addmod(av,bv);} trim(c); return c; }
25Poly sub(const Poly &a,const Poly &b){ Poly c(max(a.size(),b.size())); for(size_t i=0;i<c.size();++i){ uint32_t av= i<a.size()?a[i]:0; uint32_t bv= i<b.size()?b[i]:0; c[i]=ntt.submod(av,bv);} trim(c); return c; }
26Poly mul(const Poly &a,const Poly &b){ return ntt.convolution(a,b); }
27
28Poly rev(Poly a, size_t n){ a.resize(n); reverse(a.begin(), a.end()); return a; }
29
30Poly inv(const Poly &f, size_t n){ // Newton inversion modulo x^n, require f[0] != 0
31 Poly g(1); g[0] = ntt.modpow(f[0], NTT::MOD-2);
32 size_t m=1;
33 while(m<n){ m<<=1; Poly f_cut(min(f.size(), m)); for(size_t i=0;i<f_cut.size();++i) f_cut[i]=f[i];
34 Poly t = mul(mul(g,g), f_cut); t.resize(m);
35 // g <- g * (2 - f*g) mod x^m
36 Poly two_g = g; for(auto &x:two_g) x = ntt.addmod(x, x);
37 g.resize(m);
38 for(size_t i=0;i<m;i++){
39 uint32_t tg = i<t.size()? t[i] : 0;
40 g[i] = ntt.submod(two_g[i], tg);
41 }
42 g = mul(g, Poly(g.begin(), g.begin()+m)); // Mistake if used; correct below
43 // Correct update: g = g_old * (2 - f_cut * g_old)
44 // To avoid extra allocation, recompute properly
45 Poly fg = mul(f_cut, Poly(g.begin(), g.begin()+m)); fg.resize(m);
46 Poly one_minus_fg(m);
47 for(size_t i=0;i<m;i++) one_minus_fg[i] = ntt.submod(ntt.addmod(0, (i==0?2:0)), fg[i]);
48 g = mul(Poly(g.begin(), g.begin()+(m>>1)), one_minus_fg); // old g length m/2
49 g.resize(m);
50 }
51 g.resize(n); trim(g); return g;
52}
53
54// A safer, clearer inversion (replacing the above block if desired)
55Poly inv_clear(const Poly &f, size_t n){
56 Poly g(1); g[0] = ntt.modpow(f[0], NTT::MOD-2);
57 size_t m=1;
58 while(m<n){
59 size_t m2 = min(2*m, n);
60 Poly f_cut(min(f.size(), m2)); for(size_t i=0;i<f_cut.size();++i) f_cut[i]=f[i];
61 Poly fg = mul(f_cut, g); fg.resize(m2);
62 // t = 2 - f*g
63 Poly t(m2);
64 t[0] = 2; for(size_t i=0;i<m2;i++) t[i] = ntt.submod(t[i], (i<fg.size()?fg[i]:0));
65 g = mul(g, t); g.resize(m2);
66 m = m2;
67 }
68 g.resize(n); trim(g); return g;
69}
70
71pair<Poly,Poly> divmod(const Poly &A, const Poly &B){ // return {Q,R}
72 Poly a=A, b=B; trim(a); trim(b);
73 if(b.empty()) throw runtime_error("Division by zero polynomial");
74 int n = (int)a.size()-1, m=(int)b.size()-1;
75 if(n<m){ return {Poly{0}, a}; }
76 int k = n-m+1;
77 Poly ar = rev(a, n+1), br = rev(b, m+1);
78 Poly inv_br = inv_clear(br, k);
79 Poly qr = mul(ar, inv_br); qr.resize(k);
80 Poly q = rev(qr, k);
81 Poly r = sub(a, mul(b, q)); r.resize(m); trim(r);
82 trim(q);
83 return {q, r};
84}
85
86int main(){
87 ios::sync_with_stdio(false); cin.tie(nullptr);
88 // Divide A by B: A = x^5 + 2x^3 + 3x + 4, B = x^2 + 1
89 Poly A={4,3,0,2,0,1};
90 Poly B={1,0,1};
91 auto [Q,R] = divmod(A,B);
92 auto print=[&](const Poly &P){ if(P.empty()){ cout<<0; } else { for(size_t i=0;i<P.size();++i){ if(i) cout<<' '; cout<<P[i]; } } cout<<"\n"; };
93 print(Q); // expected quotient roughly degree 3
94 print(R); // remainder degree < 2
95 return 0;
96}
97

This code shows how to divide polynomials in near-linear time using reversal and inversion. The key step is computing the inverse of the reversed divisor modulo x^{n-m+1}, then reversing back to obtain the quotient; the remainder is A - BQ. The inv_clear function implements Newton inversion with doubling and correct truncation.

Time: O(n log n)Space: O(n)
Formal power series: log, exp, sqrt (Newton's method)
1#include <bits/stdc++.h>
2using namespace std;
3
4struct NTT {
5 static const uint32_t MOD = 998244353; static const uint32_t G = 3;
6 uint32_t addmod(uint32_t a,uint32_t b){ uint32_t s=a+b; return s>=MOD? s-MOD:s; }
7 uint32_t submod(uint32_t a,uint32_t b){ return a>=b? a-b: a+MOD-b; }
8 uint32_t mulmod(uint64_t a,uint64_t b){ return uint32_t((a*b)%MOD); }
9 uint32_t modpow(uint32_t a,uint64_t e){ uint64_t r=1,x=a; while(e){ if(e&1) r=(r*x)%MOD; x=(x*x)%MOD; e>>=1;} return (uint32_t)r; }
10 void ntt(vector<uint32_t>& a,bool inv){ int n=a.size(); static vector<int> rev; static vector<uint32_t> roots{0,1};
11 if((int)rev.size()!=n){ int k=__builtin_ctz(n); rev.assign(n,0); for(int i=0;i<n;i++) rev[i]=(rev[i>>1]>>1)|((i&1)<<(k-1)); }
12 if((int)roots.size()<n){ int k=__builtin_ctz(roots.size()); roots.resize(n); while((1<<k)<n){ uint32_t z=modpow(G,(MOD-1)>>(k+1)); for(int i=1<<(k-1); i<(1<<k); ++i){ roots[2*i]=roots[i]; roots[2*i+1]=mulmod(roots[i],z);} ++k; } }
13 for(int i=0;i<n;i++) if(i<rev[i]) swap(a[i],a[rev[i]]);
14 for(int len=1; len<n; len<<=1){ for(int i=0;i<n;i+=2*len){ for(int j=0;j<len;j++){ uint32_t u=a[i+j]; uint32_t v=mulmod(a[i+j+len], roots[len+j]); a[i+j]=addmod(u,v); a[i+j+len]=submod(u,v); } } }
15 if(inv){ reverse(a.begin()+1,a.end()); uint32_t inv_n=modpow(n,MOD-2); for(auto &x:a) x=mulmod(x,inv_n); }
16 }
17 vector<uint32_t> convolution(const vector<uint32_t>& a,const vector<uint32_t>& b){ if(a.empty()||b.empty()) return {}; size_t need=a.size()+b.size()-1; int n=1; while(n<(int)need) n<<=1; vector<uint32_t> fa(a.begin(),a.end()), fb(b.begin(),b.end()); fa.resize(n); fb.resize(n); ntt(fa,false); ntt(fb,false); for(int i=0;i<n;i++) fa[i]=mulmod(fa[i],fb[i]); ntt(fa,true); fa.resize(need); return fa; }
18} ntt;
19using Poly = vector<uint32_t>;
20void trim(Poly &a){ while(!a.empty() && a.back()==0) a.pop_back(); }
21Poly mul(const Poly &a,const Poly &b){ return ntt.convolution(a,b); }
22
23Poly deriv(const Poly &a){ if(a.empty()) return {}; Poly d(max<int>(0,(int)a.size()-1)); for(size_t i=1;i<a.size();++i) d[i-1]= (uint64_t)a[i]*i % NTT::MOD; trim(d); return d; }
24Poly integr(const Poly &a){ Poly I(a.size()+1); static vector<uint32_t> invs(1,0); // invs[k] = 1/k mod MOD
25 while(invs.size()<=a.size()){
26 size_t k=invs.size();
27 invs.push_back(NTT().modpow(k, NTT::MOD-2));
28 }
29 I[0]=0; for(size_t i=0;i<a.size();++i) I[i+1]= (uint64_t)a[i]*invs[i+1] % NTT::MOD; trim(I); return I; }
30
31Poly add(const Poly &a,const Poly &b){ Poly c(max(a.size(),b.size())); for(size_t i=0;i<c.size();++i){ uint32_t av=i<a.size()?a[i]:0, bv=i<b.size()?b[i]:0; c[i]=ntt.addmod(av,bv);} trim(c); return c; }
32Poly subP(const Poly &a,const Poly &b){ Poly c(max(a.size(),b.size())); for(size_t i=0;i<c.size();++i){ uint32_t av=i<a.size()?a[i]:0, bv=i<b.size()?b[i]:0; c[i]=ntt.submod(av,bv);} trim(c); return c; }
33
34Poly inv_clear(const Poly &f, size_t n){ Poly g(1); g[0]=ntt.modpow(f[0], NTT::MOD-2); size_t m=1; while(m<n){ size_t m2=min(2*m,n); Poly f_cut(min(f.size(),m2)); for(size_t i=0;i<f_cut.size();++i) f_cut[i]=f[i]; Poly fg=mul(f_cut,g); fg.resize(m2); Poly t(m2); t[0]=2; for(size_t i=0;i<m2;i++) t[i]=ntt.submod(t[i], (i<fg.size()?fg[i]:0)); g=mul(g,t); g.resize(m2); m=m2; } g.resize(n); trim(g); return g; }
35
36Poly log_series(const Poly &f, size_t n){ // require f[0]=1
37 Poly df = deriv(f);
38 Poly invf = inv_clear(f, n);
39 Poly q = mul(df, invf); q.resize(n-1);
40 Poly res = integr(q); res.resize(n); trim(res); return res;
41}
42
43Poly exp_series(const Poly &h, size_t n){ // require h[0]=0
44 Poly g(1,1); size_t m=1;
45 while(m<n){ size_t m2=min(2*m,n);
46 Poly lg = log_series(g, m2);
47 Poly diff = subP(h, lg); diff[0] = ntt.addmod(diff[0], 1); // 1 - log(g) + h
48 g = mul(g, diff); g.resize(m2);
49 m = m2;
50 }
51 g.resize(n); trim(g); return g;
52}
53
54Poly sqrt_series(const Poly &f, size_t n){ // require f[0] is a quadratic residue; take c0=sqrt(f0)
55 // For contest use, ensure f0=1 to avoid Tonelli-Shanks; then c0=1
56 uint32_t c0 = 1; // assume f[0]==1
57 Poly g(1, c0); uint32_t inv2 = ntt.modpow(2, NTT::MOD-2);
58 size_t m=1;
59 while(m<n){ size_t m2=min(2*m,n);
60 Poly invg = inv_clear(g, m2);
61 Poly t = mul(f, invg); t.resize(m2);
62 // g <- (g + t) / 2
63 if(g.size()<m2) g.resize(m2,0);
64 for(size_t i=0;i<m2;i++) g[i] = (uint64_t)( (i<g.size()?g[i]:0) + (i<t.size()?t[i]:0) ) * inv2 % NTT::MOD;
65 m = m2;
66 }
67 g.resize(n); trim(g); return g;
68}
69
70int main(){
71 ios::sync_with_stdio(false); cin.tie(nullptr);
72 // f = 1 + 2x + 3x^2 + ... up to n
73 size_t n=8;
74 Poly f(n,0); f[0]=1; f[1]=2; f[2]=3; f[3]=4; // sample
75 Poly lg = log_series(f,n);
76 Poly ex = exp_series(Poly{0,1}, n); // exp(x)
77 Poly sq = sqrt_series(f,n); // assuming f[0]=1
78 auto print=[&](const Poly &P){ for(size_t i=0;i<P.size();++i){ if(i) cout<<' '; cout<<P[i]; } cout<<"\n"; };
79 print(lg);
80 print(ex);
81 print(sq);
82 return 0;
83}
84

This example implements derivative, integral, and Newton-driven formal log, exp, and sqrt. It assumes modulus 998244353 and uses NTT-based multiplication. Preconditions: log requires f[0]=1; exp requires input h with h[0]=0; sqrt assumes f[0]=1 (so the modular square root is 1).

Time: O(n log n)Space: O(n)
Multipoint evaluation and interpolation using a product tree
1#include <bits/stdc++.h>
2using namespace std;
3
4struct NTT {
5 static const uint32_t MOD = 998244353; static const uint32_t G = 3;
6 uint32_t addmod(uint32_t a,uint32_t b){ uint32_t s=a+b; return s>=MOD? s-MOD:s; }
7 uint32_t submod(uint32_t a,uint32_t b){ return a>=b? a-b: a+MOD-b; }
8 uint32_t mulmod(uint64_t a,uint64_t b){ return uint32_t((a*b)%MOD); }
9 uint32_t modpow(uint32_t a,uint64_t e){ uint64_t r=1,x=a; while(e){ if(e&1) r=(r*x)%MOD; x=(x*x)%MOD; e>>=1;} return (uint32_t)r; }
10 void ntt(vector<uint32_t>& a,bool inv){ int n=a.size(); static vector<int> rev; static vector<uint32_t> roots{0,1};
11 if((int)rev.size()!=n){ int k=__builtin_ctz(n); rev.assign(n,0); for(int i=0;i<n;i++) rev[i]=(rev[i>>1]>>1)|((i&1)<<(k-1)); }
12 if((int)roots.size()<n){ int k=__builtin_ctz(roots.size()); roots.resize(n); while((1<<k)<n){ uint32_t z=modpow(G,(MOD-1)>>(k+1)); for(int i=1<<(k-1); i<(1<<k); ++i){ roots[2*i]=roots[i]; roots[2*i+1]=mulmod(roots[i],z);} ++k; } }
13 for(int i=0;i<n;i++) if(i<rev[i]) swap(a[i],a[rev[i]]);
14 for(int len=1; len<n; len<<=1){ for(int i=0;i<n;i+=2*len){ for(int j=0;j<len;j++){ uint32_t u=a[i+j]; uint32_t v=mulmod(a[i+j+len], roots[len+j]); a[i+j]=addmod(u,v); a[i+j+len]=submod(u,v); } } }
15 if(inv){ reverse(a.begin()+1,a.end()); uint32_t inv_n=modpow(n,MOD-2); for(auto &x:a) x=mulmod(x,inv_n); }
16 }
17 vector<uint32_t> convolution(const vector<uint32_t>& a,const vector<uint32_t>& b){ if(a.empty()||b.empty()) return {}; size_t need=a.size()+b.size()-1; int n=1; while(n<(int)need) n<<=1; vector<uint32_t> fa(a.begin(),a.end()), fb(b.begin(),b.end()); fa.resize(n); fb.resize(n); ntt(fa,false); ntt(fb,false); for(int i=0;i<n;i++) fa[i]=mulmod(fa[i],fb[i]); ntt(fa,true); fa.resize(need); return fa; }
18} ntt;
19using Poly = vector<uint32_t>;
20void trim(Poly &a){ while(!a.empty() && a.back()==0) a.pop_back(); }
21Poly add(const Poly &a,const Poly &b){ Poly c(max(a.size(),b.size())); for(size_t i=0;i<c.size();++i){ uint32_t av=i<a.size()?a[i]:0, bv=i<b.size()?b[i]:0; c[i]=ntt.addmod(av,bv);} trim(c); return c; }
22Poly subP(const Poly &a,const Poly &b){ Poly c(max(a.size(),b.size())); for(size_t i=0;i<c.size();++i){ uint32_t av=i<a.size()?a[i]:0, bv=i<b.size()?b[i]:0; c[i]=ntt.submod(av,bv);} trim(c); return c; }
23Poly mul(const Poly &a,const Poly &b){ return ntt.convolution(a,b); }
24
25pair<Poly,Poly> divmod(const Poly &A, const Poly &B){ // O(n log n) division
26 auto rev=[&](Poly a,size_t n){ a.resize(n); reverse(a.begin(),a.end()); return a; };
27 auto inv=[&](const Poly &f, size_t n){ Poly g(1); g[0]=ntt.modpow(f[0], NTT::MOD-2); size_t m=1; while(m<n){ size_t m2=min(2*m,n); Poly f_cut(min(f.size(),m2)); for(size_t i=0;i<f_cut.size();++i) f_cut[i]=f[i]; Poly fg=mul(f_cut,g); fg.resize(m2); Poly t(m2); t[0]=2; for(size_t i=0;i<m2;i++) t[i]=ntt.submod(t[i], (i<fg.size()?fg[i]:0)); g=mul(g,t); g.resize(m2); m=m2; } g.resize(n); trim(g); return g; };
28 Poly a=A,b=B; trim(a); trim(b); if(b.empty()) throw runtime_error("div by zero"); int n=(int)a.size()-1,m=(int)b.size()-1; if(n<m) return {Poly{0},a}; int k=n-m+1; Poly ar=rev(a,n+1), br=rev(b,m+1); Poly invbr=inv(br,k); Poly qr=mul(ar,invbr); qr.resize(k); Poly q=rev(qr,k); Poly r=subP(a,mul(b,q)); r.resize(m); trim(q); trim(r); return {q,r}; }
29
30// Build product tree of (x - x_i)
31struct ProductTree {
32 vector<Poly> tree; int m;
33 ProductTree(const vector<uint32_t>& xs){ m=xs.size(); tree.resize(4*m); build(1,0,m,xs); }
34 void build(int p,int l,int r,const vector<uint32_t>& xs){ if(r-l==1){ tree[p] = Poly{ntt.submod(0,xs[l]), 1}; return; } int mid=(l+r)/2; build(p<<1,l,mid,xs); build(p<<1|1,mid,r,xs); tree[p]=mul(tree[p<<1], tree[p<<1|1]); }
35};
36
37// Multipoint evaluation: compute P(x_i)
38void eval_down(int p,int l,int r,const ProductTree &PT,const Poly &rem, vector<uint32_t>& ans){ if(r-l==1){ // rem is constant remainder modulo (x-x_i)
39 ans[l] = rem.empty()?0:rem[0]; return; }
40 int mid=(l+r)/2; auto [ql, rl] = divmod(rem, PT.tree[p<<1]); auto [qr, rr] = divmod(rem, PT.tree[p<<1|1]); // We only need remainders
41 eval_down(p<<1, l, mid, PT, rl, ans);
42 eval_down(p<<1|1, mid, r, PT, rr, ans);
43}
44
45vector<uint32_t> multipoint_evaluate(const Poly &P, const vector<uint32_t>& xs){ ProductTree PT(xs); vector<uint32_t> ans(xs.size()); Poly R = P; // start with whole polynomial
46 // compute remainder modulo root polynomial to keep degree manageable
47 auto [q, r0] = divmod(P, PT.tree[1]); (void)q;
48 eval_down(1,0,xs.size(),PT,r0,ans);
49 return ans;
50}
51
52// Interpolation: given (x_i, y_i), return degree < m polynomial Q
53Poly interpolate(const vector<uint32_t>& xs, const vector<uint32_t>& ys){ int m=xs.size(); ProductTree PT(xs); // P = prod (x - x_i)
54 // Compute P'(x_i)
55 Poly P = PT.tree[1]; // product polynomial
56 // derivative
57 Poly Pd(max<int>(0,(int)P.size()-1)); for(size_t i=1;i<P.size();++i) Pd[i-1]= (uint64_t)P[i]*i % NTT::MOD;
58 // Evaluate Pd at xs
59 vector<uint32_t> dvals = multipoint_evaluate(Pd, xs);
60 // weights w_i = y_i / P'(x_i)
61 vector<uint32_t> w(m);
62 for(int i=0;i<m;i++){ uint32_t invd = ntt.modpow(dvals[i], NTT::MOD-2); w[i] = (uint64_t)ys[i]*invd % NTT::MOD; }
63 // Reconstruct via tree: bottom-up combine
64 function<Poly(int,int,int)> build = [&](int p,int l,int r)->Poly{
65 if(r-l==1){ return Poly{ w[l] }; }
66 int mid=(l+r)/2; Poly L = build(p<<1,l,mid); Poly R = build(p<<1|1,mid,r);
67 Poly resL = mul(L, PT.tree[p<<1|1]); // L * poly(right)
68 Poly resR = mul(R, PT.tree[p<<1]); // R * poly(left)
69 return add(resL, resR);
70 };
71 Poly Q = build(1,0,m); Q.resize(m); trim(Q); return Q;
72}
73
74int main(){
75 ios::sync_with_stdio(false); cin.tie(nullptr);
76 // Let's build a polynomial and evaluate/interpolate it
77 Poly P = {5, 2, 7, 1}; // 5 + 2x + 7x^2 + x^3
78 vector<uint32_t> xs = {1,2,3,4};
79 vector<uint32_t> ys = multipoint_evaluate(P, xs);
80 Poly Q = interpolate(xs, ys);
81 auto print=[&](const Poly &A){ for(size_t i=0;i<A.size();++i){ if(i) cout<<' '; cout<<A[i]; } cout<<"\n"; };
82 for(auto v: ys){ cout << v << ' '; } cout << "\n"; // values
83 print(Q); // should match P (up to degree < m)
84 return 0;
85}
86

We build a product tree of (x βˆ’ x_i), then compute remainders top-down to evaluate P at all x_i. For interpolation, we compute the product polynomial P(x), its derivative, the weights w_i = y_i / P'(x_i), and combine them via the tree: at each node, the left result is multiplied by the right polynomial and vice versa, then added. This yields the unique polynomial of degree < m passing through the given points.

Time: O(n log^2 n)Space: O(n log n)