Concepts (13)

Sequence-to-Sequence with Attention

Sequence-to-sequence with attention lets a decoder focus on the most relevant parts of the input at each output step, rather than compressing everything into a single vector.

#sequence-to-sequence #attention #encoder-decoder (+12 more)
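
To make the description above concrete, here is a minimal sketch of a single attention step in plain Python. It assumes dot-product scoring between the decoder state and each encoder state, which is only one of several scoring functions used in practice (the original text does not specify one). The names `attend`, `softmax`, `encoder_states`, and `decoder_state` are illustrative and not taken from the source.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """One attention step (assumed dot-product scoring).

    Returns the attention weights over input positions and the
    context vector, a weighted average of the encoder states.
    """
    # Score each encoder state against the current decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: weighted sum of encoder states, one entry per dimension.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context

# Toy input: three encoder states of dimension 2 (hypothetical values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context = attend(decoder_state, encoder_states)
```

In this toy case the second and third encoder states score equally and higher than the first, so they receive most of the weight. The decoder would recompute `weights` and `context` at every output step instead of relying on one fixed summary vector.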

📚 Theory · Intermediate

Recurrent Neural Network Theory

A Recurrent Neural Network (RNN) processes sequences by carrying a hidden state that is updated at every time step using h_t = f(W_h h_{t-1} + W_x x_t + b).
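
The update rule above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming `f` is `tanh` (a common choice, but not stated in the text) and using small hand-picked weights; the helper names `rnn_step` and `run_rnn` and all numeric values are hypothetical.

```python
import math

def rnn_step(h_prev, x_t, W_h, W_x, b):
    """One update h_t = tanh(W_h h_{t-1} + W_x x_t + b), with f assumed to be tanh."""
    h_t = []
    for i in range(len(b)):
        pre = b[i]
        pre += sum(W_h[i][j] * h_prev[j] for j in range(len(h_prev)))  # recurrent term
        pre += sum(W_x[i][k] * x_t[k] for k in range(len(x_t)))        # input term
        h_t.append(math.tanh(pre))
    return h_t

def run_rnn(xs, h0, W_h, W_x, b):
    """Apply rnn_step along a sequence, collecting the hidden state at each time step."""
    h = h0
    states = []
    for x_t in xs:
        h = rnn_step(h, x_t, W_h, W_x, b)
        states.append(h)
    return states

# Hidden size 2, input size 1 (hypothetical toy parameters).
W_h = [[0.5, 0.0], [0.0, 0.5]]
W_x = [[1.0], [-1.0]]
b = [0.0, 0.0]
xs = [[1.0], [0.0], [-1.0]]
states = run_rnn(xs, [0.0, 0.0], W_h, W_x, b)
```

The same weights `W_h`, `W_x`, and `b` are reused at every time step; only the hidden state changes, which is how information from earlier inputs reaches later steps.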

Concepts (13)

Sequence-to-Sequence with Attention

Recurrent Neural Network Theory

Knowledge Distillation Loss

Focal Loss

Key-Value Memory Systems

Self-Attention as Graph Neural Network

Multi-Head Attention

Scaled Dot-Product Attention

Label Smoothing

Positional Encoding Theory

Message Passing Framework

Cross-Entropy