Groups
Sequence-to-sequence with attention lets a decoder focus on the most relevant parts of the input at each output step, rather than compressing everything into a single vector.
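As a minimal sketch of that idea, the snippet below computes attention for a single decoder step in plain Python. It assumes dot-product scoring and toy two-dimensional states. The names `softmax`, `attend`, `encoder_states`, and `decoder_state` are illustrative, not taken from any particular library, and real models typically use learned projections and batched tensor operations.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    Assumption: relevance is scored with a plain dot product between the
    decoder state and each encoder state.
    """
    scores = [
        sum(d * h_i for d, h_i in zip(decoder_state, h))
        for h in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the weighted average of the encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy input: three source positions with two-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
attn_weights, attn_context = attend(decoder_state, encoder_states)
```

Here the second and third input positions receive equal, larger weights than the first, because their states align better with the decoder state. The context vector is recomputed at every output step, so no single fixed vector has to summarize the whole input.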
Key-value memory systems store information as key-value pairs; a lookup compares a query against the keys by similarity rather than exact match and retrieves the values of the best-matching keys.
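A similarity-based lookup of this kind might be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: keys are small numeric vectors, similarity is cosine similarity, and a read returns the single best-matching value. Many systems instead return a similarity-weighted blend of several values. The class `KeyValueMemory` and its methods are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class KeyValueMemory:
    """Hypothetical store that retrieves values by key similarity."""

    def __init__(self):
        self.keys = []
        self.values = []

    def write(self, key, value):
        """Append one (key, value) pair to the memory."""
        self.keys.append(list(key))
        self.values.append(value)

    def read(self, query):
        """Return (value, score) for the key most similar to the query."""
        if not self.keys:
            raise LookupError("memory is empty")
        scores = [cosine(query, key) for key in self.keys]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.values[best], scores[best]


memory = KeyValueMemory()
memory.write([1.0, 0.0], "cat")
memory.write([0.0, 1.0], "dog")

# The query matches neither key exactly, but it is closest to the first.
retrieved_value, retrieved_score = memory.read([0.9, 0.2])
```

The query `[0.9, 0.2]` would miss in an ordinary dictionary, since it equals no stored key, yet the memory still retrieves `"cat"` because that key points in nearly the same direction.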