
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, et al.
2020-10-22
vit, vision
Abstract
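This paper introduces the Vision Transformer (ViT), a model that splits an image into fixed-size 16x16 patches and processes the resulting sequence with a standard Transformer, much as a language model processes words. It evaluates the approach on image recognition at scale and reports empirical results that helped shape subsequent work on vision Transformers.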