Concepts18

Groups

Sequence-to-Sequence with Attention

Sequence-to-sequence with attention lets a decoder focus on the most relevant parts of the input at each output step, rather than compressing everything into a single vector.

#sequence-to-sequence#attention#encoder-decoder+12

📚TheoryIntermediate

Recurrent Neural Network Theory

A Recurrent Neural Network (RNN) processes sequences by carrying a hidden state that is updated at every time step using h_t = f(W_h h_{t-1} + W_x x_t + b).

1 2

Concepts18

Sequence-to-Sequence with Attention

Recurrent Neural Network Theory

Surrogate Loss Theory

Knowledge Distillation Loss

Focal Loss

Cross-Entropy Loss

Neural Collapse

t-SNE & UMAP

Maximum Likelihood & Generative Models

Label Smoothing

Cross-Entropy

KL Divergence