
Convolution Theorem

Key Points

  • The Convolution Theorem says that convolving two signals in time (or space) equals multiplying their spectra in the frequency domain.
  • This equivalence lets us compute long convolutions very fast using the Fast Fourier Transform (FFT).
  • Linear convolution of lengths n and m has length n+m−1 and costs O(nm) naively but only O(N log N) with FFT after zero-padding to N.
  • Circular convolution appears naturally with the Discrete Fourier Transform (DFT), so zero-padding is required to get linear convolution from a DFT.
  • In practice, convolution via FFT trades sliding sums in time for pointwise multiplications in frequency, plus two FFTs and one inverse FFT.
  • The theorem underpins fast polynomial multiplication, digital filtering, image processing, and large-integer arithmetic.
  • Numerical issues include floating-point round-off, scaling factors in the inverse FFT, and choosing a sufficiently large transform size.
  • For integer polynomials, rounding the inverse FFT results to the nearest integer recovers exact coefficients when precision is adequate.

Prerequisites

  • →Complex Numbers and Euler's Formula — The DFT and FFT use complex exponentials e^{i\theta} to represent sinusoids.
  • →Fourier Transform Basics — Understanding time–frequency duality is essential to grasp why convolution becomes multiplication.
  • →Big-O Notation — Needed to compare O(n m) naive convolution with O(N log N) FFT-based methods.
  • →Discrete Signals and Sampling — The DFT assumes discrete, finite-length sequences and periodic extension.
  • →Polynomial Arithmetic — Recognizing that polynomial multiplication equals coefficient convolution motivates FFT usage.

Detailed Explanation


01 Overview

Convolution is a way to combine two functions or sequences by sliding one over the other and measuring how much they overlap. The Convolution Theorem is the powerful bridge connecting the time (or spatial) domain to the frequency domain: it states that convolution in time corresponds to multiplication in frequency, and vice versa. In practical terms, this lets us replace a potentially expensive sliding-sum computation with a few fast Fourier transforms and cheap, elementwise multiplications. This idea is fundamental in signal processing, image processing, data analysis, and algorithms for polynomial and large-integer multiplication. The theorem applies in both continuous and discrete settings, with appropriate Fourier transforms (Fourier transform for continuous functions; DFT/FFT for discrete sequences). To get linear convolution using a DFT, we must zero-pad to avoid wrap-around effects because the DFT naturally models periodic (circular) behavior. With well-chosen padding, the convolution can be computed in O(N log N) time using FFT instead of O(nm) time naively, where N is a convenient transform size at least as large as n+m−1. The Convolution Theorem not only accelerates computations but also provides conceptual clarity: filtering shapes a signal's spectrum via multiplication, so designing filters is often easier in the frequency domain.

02 Intuition & Analogies

Imagine placing a stencil (kernel) over a long strip of wallpaper (signal) and sliding it from left to right. At each position, you multiply each stencil hole’s weight by the wallpaper’s pattern directly beneath it, summing to get one output value. That sliding, weighing, and summing is convolution. Now switch perspectives. Instead of working in the wallpaper’s spatial pattern, consider its mix of repeating motifs—its frequencies. In this frequency viewpoint, complex interactions from sliding sums simplify dramatically: the combined effect of the stencil and wallpaper is just multiplying their frequency ingredients at each matching frequency. This is like mixing paint colors by multiplying each color channel component-wise rather than repeatedly dabbing and sliding the brush. Why does this help computationally? Sliding the stencil across every position is costly: for each output position we do many multiplications and additions. But if we can quickly convert both the wallpaper and stencil into their frequency palettes (via FFT), then merging them is just a fast, point-by-point multiplication, and converting back (inverse FFT) instantly reconstructs the full, detailed pattern. There is a catch: when we use the Discrete Fourier Transform, it assumes signals repeat periodically, so sliding wraps around (circular convolution). To get the true, non-wrapping slide (linear convolution), we add blank margins—zero-padding—wide enough so the stencil never falls off an edge and wraps. This small adjustment gives us the exact sliding-sum result while still enjoying the speed of frequency-domain multiplication.

03 Formal Definition

Let f and g be functions on \(\mathbb{R}\) with suitable integrability. Their (continuous) convolution is \((f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau\). If \(\mathcal{F}\{\cdot\}\) denotes the Fourier transform, the Convolution Theorem states \(\mathcal{F}\{f * g\}(\omega) = \mathcal{F}\{f\}(\omega)\, \mathcal{F}\{g\}(\omega)\). Conversely, multiplication in time corresponds to convolution in frequency: \(\mathcal{F}\{f\, g\} = \mathcal{F}\{f\} * \mathcal{F}\{g\}\) (up to a normalization factor that depends on the transform convention). For discrete sequences \(x[n]\) and \(h[n]\), the linear convolution is \((x * h)[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[n-k]\), which in finite-support cases reduces to a finite sum. The Discrete Fourier Transform (DFT) of length \(N\) is \(X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-2\pi i k n / N}\), with inverse \(x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{2\pi i k n / N}\). The DFT's natural convolution is circular convolution: \((x \circledast y)[n] = \sum_{k=0}^{N-1} x[k]\, y[(n-k) \bmod N]\). The discrete Convolution Theorem states that the DFT of a circular convolution equals the pointwise product of DFTs. Linear convolution is obtained by zero-padding both sequences to length \(N \ge n + m - 1\), computing the circular convolution via the DFT, and keeping the first \(n + m - 1\) outputs. In algorithmic contexts, the FFT computes the DFT and its inverse in \(O(N \log N)\) time, enabling fast convolution via two FFTs, one pointwise multiplication, and one inverse FFT.

04 When to Use

Use the Convolution Theorem when you need to compute long convolutions efficiently. Classic examples include: (1) Digital filtering: low-pass, high-pass, and smoothing filters applied to audio or sensor data; convolving with an impulse response is sped up dramatically by FFTs. (2) Image processing: blurring (Gaussian), sharpening, and edge detection use 2D convolutions that become elementwise multiplications in the 2D frequency domain. (3) Polynomial multiplication: coefficients of the product polynomial are the linear convolution of the input coefficient sequences; FFT-based convolution yields near-linear-time multiplication. (4) Large-integer multiplication: representing integers in a base and convolving digit arrays via FFT approximates the product, later corrected by rounding and carry propagation. (5) Correlation via convolution: cross-correlation can be computed using Fourier transforms by conjugating one sequence; this is useful for template matching and fast similarity searches. (6) Real-time or block processing: overlap-add and overlap-save methods combine the FFT approach with streaming to handle arbitrarily long signals using fixed-size FFTs. Avoid FFT-based convolution when sequences are very short (where O(nm) direct convolution may be faster due to constant factors) or when exact arithmetic is mandatory and floating-point round-off is unacceptable—unless you switch to integer-friendly transforms like the Number Theoretic Transform (NTT).

⚠️ Common Mistakes

  1. Forgetting zero-padding: Applying an N-point DFT to sequences and multiplying without padding yields circular convolution, not linear. Fix: pad to \(N \ge n + m - 1\) (often the next power of two for FFT).
  2. Wrong scaling in the inverse FFT: Some FFT libraries place the 1/N factor in the forward transform, others in the inverse. Fix: know your library's convention and ensure the final result is scaled correctly.
  3. Insufficient transform length: Choosing \(N < n + m - 1\) causes time-domain aliasing (wrap-around). Fix: always compute or bound the needed output length and pad accordingly.
  4. Floating-point round-off: FFT-based integer polynomial multiplication may give values like 6.999999 instead of 7. Fix: round to the nearest integer and consider long double or splitting tricks for very large inputs; for exactness, consider the NTT.
  5. Confusing correlation and convolution: Cross-correlation uses \(y[n+k]\) instead of \(y[n-k]\); in the frequency domain it is \(X(\omega)\,\overline{Y(\omega)}\), not \(X(\omega)\,Y(\omega)\). Fix: if computing correlation with the FFT, conjugate one spectrum.
  6. Ignoring complex conjugate symmetry for real signals: Real input sequences have Hermitian-symmetric spectra; failing to exploit this can double the work. Fix: use real-to-complex FFTs when available.
  7. Numerical overflow with large magnitudes: Even doubles can overflow with large unscaled values. Fix: normalize inputs or use higher-precision types and careful scaling.

Key Formulas

Continuous Convolution

\((f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau\)

Explanation: This defines convolution as a sliding integral of one function reversed and shifted over another. It accumulates overlap across all shifts.

Convolution Theorem (Continuous)

\(\mathcal{F}\{f * g\}(\omega) = \mathcal{F}\{f\}(\omega)\, \mathcal{F}\{g\}(\omega)\)

Explanation: The Fourier transform of a convolution equals the pointwise product of transforms. This identity enables fast computation of convolutions via FFTs.

Discrete Linear Convolution

\((x * h)[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[n-k]\)

Explanation: For finite sequences, only a finite range contributes to the sum. The output length is n+m−1 when inputs have lengths n and m.

Circular Convolution

\((x \circledast y)[n] = \sum_{k=0}^{N-1} x[k]\, y[(n-k) \bmod N]\)

Explanation: Convolution on sequences that are assumed periodic with period N. This is the convolution naturally computed by an N-point DFT.

DFT and Inverse DFT

\(X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-2\pi i k n / N}, \qquad x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{2\pi i k n / N}\)

Explanation: These transform a length-N sequence between time and frequency domains. The inverse includes a 1/N factor to recover the original sequence.

Convolution Theorem (Discrete, Circular)

\(\mathrm{DFT}\{x \circledast y\}[k] = X[k] \cdot Y[k]\)

Explanation: The DFT of the circular convolution equals the pointwise product of DFTs. This is the core identity used in FFT-based convolution.

Zero-Padding Rule

\(N \ge n + m - 1\)

Explanation: To compute linear convolution via DFT, pad each sequence so the DFT length N is at least n+m−1. This prevents time-domain aliasing.

Time Complexity

\(T_{\text{naive}}(n, m) = O(nm), \qquad T_{\text{FFT}}(N) = O(N \log N)\)

Explanation: Direct convolution scales with the product of input lengths, while FFT-based convolution scales near-linearly with the padded length N.

Product-to-Convolution Duality

\(f(t)\, g(t) \;\xleftrightarrow{\ \mathcal{F}\ }\; (\mathcal{F}\{f\} * \mathcal{F}\{g\})(\omega)\)

Explanation: Multiplication in time corresponds to convolution in frequency (with a \(1/2\pi\) factor under the angular-frequency convention). This dual view is often used in spectral analysis and filter design.

Parseval's Theorem (DFT form)

\(\sum_{n=0}^{N-1} |x[n]|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |X[k]|^2\)

Explanation: Energy is preserved between time and frequency domains up to the 1/N factor. Useful for reasoning about numerical stability and scaling.

Complexity Analysis

Let the input lengths be n and m, and let N be the chosen transform size for FFT-based convolution. Direct (naive) linear convolution computes each of the n+m−1 outputs as a dot product that touches up to min(n, m) terms, leading to a worst-case time of O(n m). Space usage is O(n+m) to store inputs and outputs, with O(1) extra working space beyond the output array. In contrast, FFT-based convolution proceeds by (1) zero-padding to N≥n+m−1 (often next power of two), (2) two forward FFTs of size N, (3) pointwise complex multiplication across N bins, and (4) one inverse FFT of size N. Each FFT costs O(N log N), so total time is O(N log N); the pointwise multiply is O(N). Space is O(N) to hold padded arrays and their spectra. In practice, constants matter: FFTs have larger constant factors than naive convolution, so for small n and m (dozens of samples), the direct method can be faster. For large inputs, the asymptotic advantage dominates. Numerical considerations include floating-point round-off in the FFT (especially when recovering integer results); using double or long double precision and careful rounding mitigates error. For 2D convolution of an H×W image with an h×w kernel, direct time is O(H W h w), while FFT-based time is approximately O(H W log(H W)) after padding to suitable sizes, often producing dramatic speedups for large kernels.

Code Examples

Naive linear convolution (O(nm))
#include <bits/stdc++.h>
using namespace std;

// Compute linear convolution of two real sequences a and b.
// Output length is a.size() + b.size() - 1.
vector<double> linear_convolution(const vector<double>& a, const vector<double>& b) {
    if (a.empty() || b.empty()) return {}; // guard against size underflow below
    size_t n = a.size();
    size_t m = b.size();
    vector<double> c(n + m - 1, 0.0);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < m; ++j) {
            c[i + j] += a[i] * b[j];
        }
    }
    return c;
}

int main() {
    // Example: smooth a short signal with a simple averaging kernel
    vector<double> signal = {1, 3, 2, 5, 4};
    vector<double> kernel = {1.0/3, 1.0/3, 1.0/3}; // 3-point moving average

    vector<double> out = linear_convolution(signal, kernel);

    cout << fixed << setprecision(4);
    cout << "Linear convolution result (naive):\n";
    for (double v : out) cout << v << " ";
    cout << "\n";
    return 0;
}

This program computes the standard (linear, non-wrapping) convolution by directly summing products for each output index. It demonstrates the definition of convolution as a sliding weighted sum.

Time: O(nm) | Space: O(n + m)
Fast linear convolution via FFT (real sequences)
#include <bits/stdc++.h>
using namespace std;

// Iterative Cooley-Tukey FFT (in-place). If invert=true, computes the inverse FFT.
void fft(vector<complex<double>>& a, bool invert) {
    int n = (int)a.size();
    // Bit-reversal permutation
    for (int i = 1, j = 0; i < n; ++i) {
        int bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) swap(a[i], a[j]);
    }
    // Butterfly passes over block sizes 2, 4, ..., n
    for (int len = 2; len <= n; len <<= 1) {
        double ang = 2 * M_PI / len * (invert ? -1 : 1);
        complex<double> wlen(cos(ang), sin(ang));
        for (int i = 0; i < n; i += len) {
            complex<double> w(1);
            for (int j = 0; j < len / 2; ++j) {
                complex<double> u = a[i + j];
                complex<double> v = a[i + j + len / 2] * w;
                a[i + j] = u + v;
                a[i + j + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
    // The 1/N scaling lives in the inverse transform
    if (invert) {
        for (int i = 0; i < n; ++i) a[i] /= n;
    }
}

// Convolution of two real sequences using FFT
vector<double> convolution_fft(const vector<double>& a, const vector<double>& b) {
    int n = (int)a.size();
    int m = (int)b.size();
    if (n == 0 || m == 0) return {};
    int need = n + m - 1;
    int N = 1;
    while (N < need) N <<= 1; // next power of two for speed

    // Zero-padded complex copies (remaining entries are already zero-initialized)
    vector<complex<double>> fa(N), fb(N);
    for (int i = 0; i < n; ++i) fa[i] = complex<double>(a[i], 0.0);
    for (int i = 0; i < m; ++i) fb[i] = complex<double>(b[i], 0.0);

    // Forward FFTs
    fft(fa, false);
    fft(fb, false);

    // Pointwise multiply
    for (int i = 0; i < N; ++i) fa[i] *= fb[i];

    // Inverse FFT
    fft(fa, true);

    vector<double> result(need);
    for (int i = 0; i < need; ++i) result[i] = fa[i].real();
    return result;
}

int main() {
    // Example: convolving a signal with a Gaussian-like kernel
    vector<double> signal = {1, 3, 2, 5, 4, 6, 2, 1, 0, 3};
    vector<double> kernel = {0.0625, 0.25, 0.375, 0.25, 0.0625}; // approximate 5-point Gaussian

    vector<double> out = convolution_fft(signal, kernel);

    cout << fixed << setprecision(6);
    cout << "Linear convolution via FFT (zero-padded):\n";
    for (double v : out) cout << v << ' ';
    cout << "\n";

    return 0;
}

This program implements an in-place iterative FFT and uses it to compute linear convolution by zero-padding, transforming both sequences, multiplying spectra elementwise, and applying the inverse FFT. The 1/N scaling is placed in the inverse transform.

Time: O(N log N) where N is the next power of two ≥ n+m−1 | Space: O(N)
Polynomial multiplication via FFT-based convolution (integer coefficients)
#include <bits/stdc++.h>
using namespace std;

// Iterative Cooley-Tukey FFT (in-place). If invert=true, computes the inverse FFT.
void fft(vector<complex<double>>& a, bool invert) {
    int n = (int)a.size();
    for (int i = 1, j = 0; i < n; ++i) {
        int bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) swap(a[i], a[j]);
    }
    for (int len = 2; len <= n; len <<= 1) {
        double ang = 2 * M_PI / len * (invert ? -1 : 1);
        complex<double> wlen(cos(ang), sin(ang));
        for (int i = 0; i < n; i += len) {
            complex<double> w(1);
            for (int j = 0; j < len / 2; ++j) {
                complex<double> u = a[i + j];
                complex<double> v = a[i + j + len / 2] * w;
                a[i + j] = u + v;
                a[i + j + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
    if (invert) for (int i = 0; i < n; ++i) a[i] /= n;
}

// Multiply integer polynomials A(x) and B(x), returning integer coefficients of C(x) = A(x)B(x)
vector<long long> multiply_polynomials(const vector<int>& A, const vector<int>& B) {
    int n = (int)A.size(), m = (int)B.size();
    if (n == 0 || m == 0) return {};
    int need = n + m - 1;
    int N = 1;
    while (N < need) N <<= 1;

    vector<complex<double>> fa(N), fb(N);
    for (int i = 0; i < n; ++i) fa[i] = complex<double>(A[i], 0);
    for (int i = 0; i < m; ++i) fb[i] = complex<double>(B[i], 0);

    fft(fa, false);
    fft(fb, false);
    for (int i = 0; i < N; ++i) fa[i] *= fb[i];
    fft(fa, true);

    vector<long long> C(need);
    // Round to the nearest integer to handle floating-point error
    for (int i = 0; i < need; ++i) C[i] = llround(fa[i].real());

    // Optional: handle carries if using large digit bases (not needed for plain coefficients)
    return C;
}

int main() {
    // Example: (3 + 2x + 5x^2) * (5 + 1x + 2x^2)
    vector<int> A = {3, 2, 5};
    vector<int> B = {5, 1, 2};

    vector<long long> C = multiply_polynomials(A, B);

    cout << "Product coefficients (lowest degree first):\n";
    for (auto v : C) cout << v << ' ';
    cout << "\n"; // Expected: 15 13 33 9 10
    return 0;
}

Polynomial multiplication is linear convolution of coefficient sequences. The FFT accelerates this by transforming, multiplying pointwise, and inverting; rounding recovers exact integer coefficients in typical cases.

Time: O(N log N) where N is the padded FFT length ≥ deg(A)+deg(B)+1 | Space: O(N)