Transformer expressiveness studies what kinds of sequence-to-sequence mappings a Transformer can represent or approximate.
Self-attention is permutation-equivariant: permuting the input tokens permutes the output rows in the same way, so a Transformer without positional encodings cannot distinguish different orderings of the same tokens and needs positional encodings to represent word order.
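The permutation-equivariance claim can be checked directly. The sketch below (a minimal single-head attention in NumPy; the function names and the use of a sinusoidal encoding are illustrative assumptions, not from the source) shows that permuting the input rows permutes the output identically, and that adding position-dependent encodings breaks this symmetry:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention (no masking).
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

def sinusoidal_pe(n, d):
    # Standard sinusoidal positional encoding: sin on even dims, cos on odd.
    pos = np.arange(n)[:, None]
    i = np.arange(d)[None, :]
    angles = pos / 10000 ** ((i // 2 * 2) / d)
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
perm = rng.permutation(n)

# Without positional encodings: permuting inputs permutes outputs.
out = self_attention(X, Wq, Wk, Wv)
out_perm = self_attention(X[perm], Wq, Wk, Wv)
equivariant = np.allclose(out[perm], out_perm)

# With positional encodings added to the inputs, the symmetry is broken:
# token i now carries information about position i.
pe = sinusoidal_pe(n, d)
out_pe = self_attention(X + pe, Wq, Wk, Wv)
out_pe_perm = self_attention(X[perm] + pe, Wq, Wk, Wv)
still_equivariant = np.allclose(out_pe[perm], out_pe_perm)

print(equivariant, still_equivariant)
```

Running this prints `True False`: the bare attention layer treats the input as a set, and only the positional encoding lets it react to token order.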