deep learning

Excessive Invariance Causes Adversarial Vulnerability

Uses bijective networks to identify large subspaces of invariance-based vulnerability and introduces the independence cross-entropy loss, which partially alleviates it.

On The Power of Curriculum Learning in Training Deep Networks

Demonstrates the benefit of curriculum learning with different scoring and pacing functions on various small datasets.
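A curriculum in this framework is defined by a scoring function (ranking examples by difficulty) and a pacing function (controlling how much of the data is exposed as training progresses). A minimal sketch, assuming a precomputed difficulty score per example (e.g. the loss of a reference model) and a linear pacing schedule; all names here are illustrative, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 100 examples with per-example difficulty scores.
# Assumed scoring function: simulated here with random values.
scores = rng.random(100)
order = np.argsort(scores)  # easiest (lowest score) first

def pacing(step, total_steps, n, start_frac=0.2):
    """Linear pacing: fraction of data exposed grows from start_frac to 1."""
    frac = start_frac + (1.0 - start_frac) * step / total_steps
    return max(1, int(frac * n))

total_steps = 10
for step in range(total_steps + 1):
    k = pacing(step, total_steps, len(order))
    pool = order[:k]                   # the k easiest examples so far
    batch = rng.choice(pool, size=8)   # sample a mini-batch from that pool
```

The paper compares several scoring functions (e.g. transfer-learning scores, self-taught scores) and pacing functions; the linear schedule above is just one simple choice.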

ConvNets and ImageNet Beyond Accuracy: Understanding Mistakes and Uncovering Biases

Demonstrates that explanations and model criticism are useful tools for improving the reliability of ImageNet-trained CNNs for end users.

Training data-efficient image transformers & distillation through attention

Produces a competitive convolution-free transformer trained only on ImageNet.

Learning the Predictability of the Future

Presents the idea of using hyperbolic embeddings for hierarchical representations and provides experiments classifying actions within a hierarchy of actions.
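Hyperbolic embeddings suit hierarchies because distances grow rapidly toward the boundary of the space, giving exponentially more "room" for deeper levels of a tree. A minimal sketch of the geodesic distance in the Poincaré ball model, the standard formulation for such embeddings (not the paper's own code):

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance between points u, v inside the unit Poincare ball:
    d(u, v) = arccosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))."""
    sq = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq / denom)
```

For a point v at Euclidean norm r from the origin, this reduces to 2 * artanh(r), which diverges as r approaches 1, so abstract (root-like) concepts sit near the origin and specific ones near the boundary.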

Scaling Laws for Neural Language Models

A large-scale empirical investigation of scaling laws shows that performance has a power-law relationship to model size, dataset size, and training compute, while architectural details have minimal effects.
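A power-law relationship L(N) = a * N^(-alpha) is linear in log-log space, so the scaling exponent can be recovered by a simple linear fit. A small illustration on synthetic data (the exponent 0.076 here is only a plausible placeholder, not a result reproduced from the paper):

```python
import numpy as np

# Synthetic illustration: loss follows L(N) = a * N^(-alpha) with small noise,
# mimicking the power-law form reported for model size N.
alpha_true, a_true = 0.076, 10.0
N = np.logspace(6, 9, 20)  # model sizes from 1e6 to 1e9 parameters
rng = np.random.default_rng(0)
loss = a_true * N ** (-alpha_true) * np.exp(rng.normal(0.0, 0.01, N.size))

# Power law is linear in log-log space: log L = log a - alpha * log N.
slope, intercept = np.polyfit(np.log(N), np.log(loss), 1)
alpha_hat, a_hat = -slope, np.exp(intercept)
```

The same log-log regression applies to the dataset-size and compute axes; the paper's point is that each axis obeys its own power law when the others are not bottlenecks.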

Why does deep and cheap learning work so well?

Success of reasonably sized neural networks hinges on symmetry, locality, and polynomial log-probability in data from the natural world.

VisualCOMET: Reasoning about the Dynamic Context of a Still Image

VisualCOMET, a single-stream vision-language transformer trained on a large-scale repository of Visual Commonsense Graphs, generates inferences about past and present events by integrating the image with textual descriptions of the present event and location.

Attention Is All You Need

The Transformer, a sequence transduction model that replaces recurrent layers and relies entirely on attention mechanisms, achieves new SotA on machine translation tasks while reducing training time significantly.
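The core operation of the Transformer is scaled dot-product attention: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. A minimal NumPy sketch of a single (unmasked, single-head) attention call, for illustration only:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
    Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over each query's scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a convex combination of V rows
```

The full model stacks multi-head versions of this (with learned projections of Q, K, V) plus position-wise feed-forward layers, residual connections, and positional encodings; the 1/sqrt(d_k) scaling keeps the dot products from saturating the softmax at large d_k.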

A critique of pure learning and what artificial neural networks can learn from animal brains

Development of artificial neural networks should leverage the insight that much of animal behavior is innate as a result of wiring rules encoded in the genome, learned through billions of years of evolution.