Mathematical foundations of sequence modeling: recurrence, state space models, and temporal convolutions.
7 concepts
A Recurrent Neural Network (RNN) processes sequences by carrying a hidden state that is updated at every time step using h_t = f(W_h h_{t-1} + W_x x_t + b).
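A minimal NumPy sketch of that update rule, with tanh as the (assumed) nonlinearity f and randomly initialized weights purely for illustration:

```python
import numpy as np

def rnn_step(h_prev, x_t, W_h, W_x, b, f=np.tanh):
    # One recurrence step: h_t = f(W_h h_{t-1} + W_x x_t + b)
    return f(W_h @ h_prev + W_x @ x_t + b)

rng = np.random.default_rng(0)
H, D = 4, 3                                   # hidden size, input size (arbitrary)
W_h = rng.normal(size=(H, H))
W_x = rng.normal(size=(H, D))
b = np.zeros(H)

h = np.zeros(H)                               # initial hidden state
for x_t in rng.normal(size=(5, D)):           # unroll over 5 time steps
    h = rnn_step(h, x_t, W_h, W_x, b)
```

The same weights are reused at every step; only the hidden state h changes as the sequence is consumed.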
Long Short-Term Memory (LSTM) networks use gates (forget, input, and output) to control what information to erase, write, and reveal at each time step.
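A sketch of one LSTM step with the parameters for the four gate pre-activations stacked into single matrices W, U (a common but not universal layout; names here are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(h_prev, c_prev, x_t, W, U, b):
    # W, U, b hold stacked parameters for forget (f), input (i),
    # output (o) gates and the candidate write (g).
    z = W @ x_t + U @ h_prev + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
    c = f * c_prev + i * np.tanh(g)   # erase old memory, write new content
    h = o * np.tanh(c)                # reveal a gated view of the cell state
    return h, c

rng = np.random.default_rng(0)
H, D = 4, 3
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):
    h, c = lstm_step(h, c, x_t, W, U, b)
```

The cell state c is the long-term memory; h is the exposed output, bounded because o is in (0, 1) and tanh(c) is in (-1, 1).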
A State Space Model (SSM) describes a dynamical system using a state vector x(t) that evolves via a first-order matrix differential equation, x'(t) = A x(t) + B u(t), and produces outputs y(t) = C x(t).
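One way to run such a system on a discrete input sequence is to discretize the differential equation; the sketch below uses a forward-Euler step of size dt (an illustrative choice, not the only discretization):

```python
import numpy as np

def ssm_scan(A, B, C, u_seq, dt=0.01):
    # Discretized x'(t) = A x(t) + B u(t), y(t) = C x(t)
    x = np.zeros(A.shape[0])
    ys = []
    for u in u_seq:
        x = x + dt * (A @ x + B * u)   # Euler update of the state
        ys.append(C @ x)               # linear readout
    return np.array(ys)

A = np.array([[-1.0, 0.0], [0.0, -2.0]])   # stable dynamics (illustrative)
B = np.array([1.0, 1.0])
C = np.array([1.0, 1.0])
y = ssm_scan(A, B, C, np.ones(500))         # constant input u(t) = 1
```

For this stable A and constant input, the state settles toward the fixed point A x + B = 0, so y approaches C (-A^{-1} B) = 1.5.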
Temporal (causal) convolution computes each output at time t using only the current and past inputs, ensuring no future information leakage.
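A direct sketch: left-padding the input by k-1 zeros makes each output depend only on current and past samples:

```python
import numpy as np

def causal_conv1d(x, kernel):
    # y[t] = sum_i kernel[i] * x[t - i]; left-padding enforces causality.
    k = len(kernel)
    x_padded = np.concatenate([np.zeros(k - 1), x])
    return np.array([x_padded[t:t + k] @ kernel[::-1] for t in range(len(x))])

# An impulse input shows the kernel echoed forward in time only.
y = causal_conv1d(np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.5, 0.25]))
```

Because no output index reads past position t, changing a future input never changes an earlier output.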
Sequence-to-sequence with attention lets a decoder focus on the most relevant parts of the input at each output step, rather than compressing everything into a single vector.
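A sketch of one attention lookup using simple dot-product scores (one of several common scoring functions; the toy vectors are illustrative):

```python
import numpy as np

def softmax(z):
    z = z - z.max()            # numerical stability
    e = np.exp(z)
    return e / e.sum()

def attend(decoder_state, encoder_states):
    scores = encoder_states @ decoder_state   # relevance of each input position
    weights = softmax(scores)                 # normalized to sum to 1
    context = weights @ encoder_states        # weighted mix of encoder states
    return context, weights

encoder_states = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
decoder_state = np.array([0.0, 2.0])
context, weights = attend(decoder_state, encoder_states)
```

The decoder receives a fresh context vector at every output step, so it never has to rely on a single fixed-size summary of the whole input.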
Dynamic Time Warping (DTW) aligns two time series that may vary in speed to find the minimum-cost correspondence between their elements.
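The minimum-cost correspondence can be computed with the classic dynamic program; this sketch uses absolute difference as the local cost:

```python
import numpy as np

def dtw(a, b):
    # D[i, j]: minimum alignment cost of a[:i] against b[:j]
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Each element may match, or either series may "wait" (stretch).
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

A series compared against a time-stretched copy of itself aligns with zero cost, which is exactly the speed-invariance DTW provides.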
A Hidden Markov Model (HMM) describes sequences where you cannot see the true state directly, but you can observe outputs generated by those hidden states.
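The probability of an observed sequence under an HMM can be computed with the forward algorithm, which sums over all hidden-state paths; the two-state parameters below are a toy example:

```python
import numpy as np

def forward(pi, A, B, obs):
    # pi: initial state distribution, A: transitions, B: emissions
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

pi = np.array([0.6, 0.4])                 # initial state probabilities
A  = np.array([[0.7, 0.3], [0.4, 0.6]])   # hidden-state transition matrix
B  = np.array([[0.9, 0.1], [0.2, 0.8]])   # P(observation | hidden state)
p = forward(pi, A, B, [0, 1, 0])
```

Summing the forward probability over every possible observation sequence of a fixed length gives 1, a quick sanity check that the model defines a proper distribution.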