Groups
Sequence-to-sequence with attention lets a decoder focus on the most relevant parts of the input at each output step, rather than compressing everything into a single vector.
A Recurrent Neural Network (RNN) processes sequences by carrying a hidden state that is updated at every time step using h_t = f(W_h h_{t-1} + W_x x_t + b).