Attention Is All You Need
Intermediate · Ashish Vaswani, Noam Shazeer et al. · Jun 12, 2017 · arXiv
The paper introduces the Transformer, a sequence transduction model (e.g., for translating sentences) built entirely on attention mechanisms, dispensing with the recurrent (RNN) and convolutional (CNN) layers used by earlier architectures.
#Transformer #Self-Attention #Multi-Head Attention
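As a quick illustration of the paper's core operation, here is a minimal NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V (Eq. 1 in the paper). This is an illustrative sketch, not the authors' code; the function name, array shapes, and toy data are assumptions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Eq. 1 in the paper)."""
    d_k = Q.shape[-1]
    # Similarity scores between each query and each key, scaled by sqrt(d_k)
    # to keep dot products from growing with dimension.
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)    # (..., seq_q, seq_k)
    # Numerically stable softmax over the key dimension.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted average of the value vectors.
    return weights @ V                                # (..., seq_q, d_v)

# Toy example: 4 positions, width 8 (sizes here are arbitrary assumptions).
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

Multi-head attention, also tagged above, runs h such attention functions in parallel over learned linear projections of Q, K, and V and concatenates the results; the paper's base model uses h = 8 heads with d_k = d_v = 64.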