Groups
Key-Value memory systems store information as (key, value) pairs. A query is compared against the keys, and values are retrieved according to how similar their keys are to the query rather than by exact match.
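
To make similarity-based lookup concrete, here is a minimal sketch in plain Python. It assumes dot-product similarity and a softmax to turn scores into weights; both are common choices but are assumptions here, and the names `kv_lookup`, `keys`, and `values` are illustrative only.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def kv_lookup(query, keys, values):
    """Return (weights, output): a similarity-weighted blend of the values.

    Assumption: similarity is the dot product between query and key.
    """
    weights = softmax([dot(query, k) for k in keys])
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Three stored pairs; the query equals none of the keys exactly.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]
weights, output = kv_lookup([5.0, -5.0], keys, values)
```

The query points in nearly the same direction as the first key, so the first value dominates the result even though no key matches the query exactly.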
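Multi-Head Attention runs several attention mechanisms (heads) in parallel, each with its own learned projections, so each head can focus on different relationships in the data. The outputs of the heads are then concatenated and combined into a single result.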
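
The following is a rough sketch of the idea in plain Python, assuming scaled dot-product attention inside each head and random, untrained projection matrices. The sizes (`d_model = 4`, two heads) and all function names are hypothetical, and the heads are evaluated in a loop for readability, whereas practical implementations typically batch them into a single tensor operation.

```python
import math
import random

def matmul(a, b):
    # a: n x m, b: m x p, both as lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    # Scaled dot-product attention for a single head.
    scale = math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)

def random_matrix(rows, cols, rng):
    # Stand-in for learned weights (assumption: untrained, random values).
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def multi_head_attention(x, heads, w_out):
    # Each head has its own query, key, and value projections.
    head_outputs = [
        attention(matmul(x, wq), matmul(x, wk), matmul(x, wv))
        for wq, wk, wv in heads
    ]
    # Concatenate the head outputs position by position, then mix them.
    concat = [[val for h in head_outputs for val in h[i]] for i in range(len(x))]
    return matmul(concat, w_out)

rng = random.Random(0)
seq_len, d_model, num_heads = 3, 4, 2
d_head = d_model // num_heads

x = random_matrix(seq_len, d_model, rng)
heads = [
    tuple(random_matrix(d_model, d_head, rng) for _ in range(3))
    for _ in range(num_heads)
]
w_out = random_matrix(d_model, d_model, rng)
out = multi_head_attention(x, heads, w_out)
```

Because every head receives different projection matrices, each one scores the same positions differently, which is what lets the heads attend to different relationships. The output keeps the input shape: one vector of size `d_model` per position.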