Frequency-domain representations, spectral methods, and convolution theory underlying CNNs, audio models, and positional encodings.
9 concepts
A Fourier series rewrites any reasonable periodic function as a weighted sum of sines and cosines (or complex exponentials).
The Fourier Transform converts a signal from the time domain into the frequency domain, revealing which sine and cosine waves (frequencies) make up the signal.
The Discrete Fourier Transform (DFT) converts a length-N sequence from the time (or spatial) domain into N complex frequency coefficients that describe how much of each sinusoid is present.
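As a minimal sketch of that definition, the DFT can be computed directly from its sum. The sign and scaling conventions here (negative exponent, no normalization on the forward transform) are an assumption, chosen because they are the most common ones. The function name `dft` and the test signal are illustrative only.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT: X[k] = sum_n x[n] * exp(-2*pi*i*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

# A cosine that completes exactly 2 cycles over N = 8 samples.
N = 8
signal = [math.cos(2 * math.pi * 2 * n / N) for n in range(N)]

spectrum = dft(signal)                  # N complex coefficients
magnitudes = [abs(c) for c in spectrum]
# Energy appears only in bin 2 and its mirror bin N - 2 = 6 (each N/2 = 4);
# all other bins are zero up to floating-point error.
```

In practice a fast Fourier transform (FFT) produces the same coefficients in O(N log N) time; the direct sum is shown only because it mirrors the definition.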
The Convolution Theorem says that convolving two signals in time (or space) corresponds to pointwise multiplication of their spectra in the frequency domain.
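A small numerical check can make the theorem concrete. For finite sequences and the DFT, the convolution in question is circular (indices wrap around modulo N), which is the assumption made in this sketch. The helper names `dft` and `circular_convolve` and the sample values are illustrative.

```python
import cmath

def dft(x, inverse=False):
    """Naive DFT; with inverse=True, applies the inverse transform (1/N scaling)."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]
    return [v / N for v in out] if inverse else out

def circular_convolve(a, b):
    """Direct circular convolution: y[n] = sum_m a[m] * b[(n - m) mod N]."""
    N = len(a)
    return [sum(a[m] * b[(n - m) % N] for m in range(N)) for n in range(N)]

a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, -1.0, 0.0, 2.0]

# Route 1: convolve directly in the time domain.
direct = circular_convolve(a, b)

# Route 2: transform both, multiply the spectra pointwise, transform back.
product = [p * q for p, q in zip(dft(a), dft(b))]
via_spectra = [v.real for v in dft(product, inverse=True)]

# The two routes should agree up to floating-point error.
max_error = max(abs(d - s) for d, s in zip(direct, via_spectra))
```

To obtain ordinary (linear) convolution this way, both sequences would first be zero-padded to at least the combined length minus one.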
The Short-Time Fourier Transform (STFT) breaks a signal into small overlapping windows and computes a Fourier transform on each window to reveal how frequencies evolve over time.
The wavelet transform splits a signal into "coarse" trends and "fine" details at multiple scales, like zooming in and out with a smart magnifying glass.
Self-attention has no built-in notion of token order (it is permutation-equivariant), so Transformers need positional encodings to make use of word order in sequences.
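One widely used choice is the fixed sinusoidal encoding, in which each pair of dimensions is a sine and cosine at a different frequency. The sketch below assumes that scheme, with the conventional base of 10000. The function name and parameters are illustrative, and learned or rotary encodings are equally valid alternatives.

```python
import math

def sinusoidal_positional_encoding(num_positions, d_model, base=10000.0):
    """Return a num_positions x d_model table of sinusoidal encodings.

    Even columns hold sin(pos / base**(2i/d_model)); odd columns hold the
    matching cos, for pair index i = 0 .. d_model/2 - 1.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even for sin/cos pairs")
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model // 2):
            angle = pos / base ** (2 * i / d_model)
            row.extend([math.sin(angle), math.cos(angle)])
        table.append(row)
    return table

pe = sinusoidal_positional_encoding(num_positions=4, d_model=6)
# Position 0 encodes as alternating sin(0) = 0 and cos(0) = 1.
```

These rows are typically added to the token embeddings before the first attention layer, so that identical tokens at different positions receive different inputs.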
Spectral normalization rescales a weight matrix so its largest singular value (spectral norm) is at most a target value, typically 1.
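The largest singular value is usually estimated with power iteration rather than a full decomposition. The following is a rough sketch under that assumption, using plain nested lists. The helper names, the iteration count, and the choice to shrink only when the norm exceeds the target are illustrative; the variant common in GAN training divides by the estimated norm unconditionally.

```python
import math

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def spectral_norm(W, num_iters=100):
    """Estimate the largest singular value of W by power iteration on W^T W."""
    Wt = transpose(W)
    v = [1.0] * len(W[0])          # assumed not orthogonal to the top singular vector
    for _ in range(num_iters):
        v = matvec(Wt, matvec(W, v))
        n = norm(v)
        if n == 0.0:               # W maps v to zero; treat the norm as 0
            return 0.0
        v = [x / n for x in v]
    return norm(matvec(W, v))

def spectral_normalize(W, target=1.0):
    """Rescale W so its spectral norm is at most `target`."""
    sigma = spectral_norm(W)
    scale = target / sigma if sigma > target else 1.0
    return [[w * scale for w in row] for row in W]

W = [[3.0, 0.0], [0.0, 1.0]]       # singular values 3 and 1
W_sn = spectral_normalize(W)       # every entry scaled by 1/3
```

Deep-learning frameworks generally run only one or a few power-iteration steps per training update and reuse the vector between updates, trading accuracy of the estimate for speed.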