Groups
Category
Neural Collapse describes what happens at the end of training: the penultimate-layer features of each class concentrate tightly around a class mean.
Transformers map sequences to sequences using layers of self-attention and feed-forward networks wrapped with residual connections and LayerNorm.