Sorbonne University

Training data-efficient image transformers & distillation through attention

Produces a competitive convolution-free transformer by training only on ImageNet.