Concepts (12)

Sequence-to-Sequence with Attention

Sequence-to-sequence with attention lets a decoder focus on the most relevant parts of the input at each output step, rather than compressing everything into a single vector.
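The focusing step described above can be illustrated with a small sketch. It assumes dot-product scoring between the decoder state and each encoder state, which is only one common choice and is not specified here; the names `attend`, `encoder_states`, and `decoder_state` are hypothetical, and the vectors are toy values.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    # Assumption: dot-product scoring (additive scoring is another option).
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    # Attention weights: one non-negative weight per input position, summing to 1.
    weights = softmax(scores)
    # Context vector: weighted average of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context

# Toy example: three input positions with 2-dimensional encoder states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

In this toy case the first and third input positions score equally and receive most of the weight, so the context vector is dominated by them. A fresh context vector is computed at every output step instead of reusing one fixed summary of the input.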

#sequence-to-sequence #attention #encoder-decoder (+12 more)

📚 Theory · Intermediate

Recurrent Neural Network Theory

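A Recurrent Neural Network (RNN) processes a sequence by carrying a hidden state h_t that is updated at every time step: h_t = f(W_h h_{t-1} + W_x x_t + b), where x_t is the input at step t, W_h and W_x are weight matrices, b is a bias vector, and f is a nonlinearity.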
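As a minimal sketch of this update rule, the code below applies h_t = f(W_h h_{t-1} + W_x x_t + b) over a short sequence. It assumes f is tanh, a common choice that is not stated here; the helper names `matvec`, `rnn_step`, and `run_rnn` and all weight values are hypothetical.

```python
import math

def matvec(W, v):
    # Multiply matrix W (a list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(h_prev, x_t, W_h, W_x, b):
    # One update: h_t = tanh(W_h h_{t-1} + W_x x_t + b). tanh is an assumed f.
    rec = matvec(W_h, h_prev)
    inp = matvec(W_x, x_t)
    return [math.tanh(r + i + bias) for r, i, bias in zip(rec, inp, b)]

def run_rnn(xs, h0, W_h, W_x, b):
    # Carry the hidden state across the sequence, recording it at each step.
    h = h0
    states = []
    for x_t in xs:
        h = rnn_step(h, x_t, W_h, W_x, b)
        states.append(h)
    return states

# Toy example: 2-dimensional hidden state, 1-dimensional inputs.
W_h = [[0.5, 0.0], [0.0, 0.5]]
W_x = [[1.0], [-1.0]]
b = [0.0, 0.0]
xs = [[1.0], [0.0]]
states = run_rnn(xs, [0.0, 0.0], W_h, W_x, b)
```

The second input is zero, yet the second hidden state is nonzero because it still depends on the first through W_h h_{t-1}. That carried-over term is how the network retains information from earlier steps.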

Concepts (12)

Sequence-to-Sequence with Attention

Recurrent Neural Network Theory

Knowledge Distillation Loss

Focal Loss

Maximum Likelihood & Generative Models

Label Smoothing

Cross-Entropy

KL Divergence

Empirical Risk Minimization

KL Divergence (Kullback-Leibler Divergence)

Shannon Entropy

Information Theory