This paper makes video editing easier by training an AI model to propagate edits made on the first frame to the rest of the video smoothly and accurately.
The paper introduces the Transformer, a model that understands and generates sequences (like sentences) using only attention, without RNNs or CNNs.
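
To make "using only attention" concrete, here is a minimal sketch of scaled dot-product attention, the core operation of the Transformer: each query is compared with every key, the scores are scaled by the square root of the key dimension and passed through a softmax, and the resulting weights mix the values. It is an illustration in plain Python rather than the paper's implementation. The function names are hypothetical, and the learned projections, multiple heads, masking, and positional encodings that the full model uses are left out.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of vectors.

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k (one per sequence position)
    values:  list of vectors (one per sequence position)
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    print(scaled_dot_product_attention(q, k, v))
```

Because every position attends to every other position in a single step, no recurrence or convolution is needed to relate distant tokens, which is the point the summary makes.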