
Excessive Invariance Causes Adversarial Vulnerability

Uses bijective (invertible) networks to expose large input subspaces of invariance-based adversarial vulnerability and introduces an independence cross-entropy loss that partially alleviates it.
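A toy sketch of the idea (my own minimal NumPy illustration, not the paper's i-RevNet setup): with a bijective map, the latents the classifier reads can be held fixed while the remaining "nuisance" latents are swapped freely, yielding inputs the model provably cannot distinguish. The coupling layer, dimensions, and split into semantic vs. nuisance latents below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D, SEM = 8, 2   # input dim; first SEM latent dims act as the "semantic" part

# One affine coupling layer: bijective by construction (an illustrative
# stand-in for the paper's fully invertible classifier).
W = rng.normal(size=(D // 2, D // 2)) * 0.1

def forward(x):
    x1, x2 = x[: D // 2], x[D // 2:]
    return np.concatenate([x1, x2 + np.tanh(W @ x1)])   # z = f(x)

def inverse(z):
    z1, z2 = z[: D // 2], z[D // 2:]
    return np.concatenate([z1, z2 - np.tanh(W @ z1)])   # x = f^{-1}(z)

# Two inputs: the classifier reads z[:SEM]; everything else is "nuisance".
x_a, x_b = rng.normal(size=D), rng.normal(size=D)
z_a, z_b = forward(x_a), forward(x_b)

# Invariance-based attack: keep x_a's semantic latents, swap in x_b's
# nuisance latents, and invert. The result gets x_a's prediction even
# though most of its latent content comes from x_b.
z_mix = np.concatenate([z_a[:SEM], z_b[SEM:]])
x_adv = inverse(z_mix)

print(np.allclose(forward(x_adv)[:SEM], z_a[:SEM]))  # True: semantic latents unchanged
```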

Attention Is All You Need

The Transformer, a sequence transduction model that replaces recurrent layers and relies entirely on attention mechanisms, achieves a new SotA on machine translation tasks while significantly reducing training time.
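As a reminder of the core operation the architecture is built from, here is a minimal NumPy sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V; the toy shapes and random projections are illustrative only, and multi-head attention, masking, and positional encodings are omitted.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V over a single sequence."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over keys
    return weights @ V                                  # weighted sum of values

# Toy usage: 4 tokens, model dimension 8 (hypothetical sizes)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
print(out.shape)  # (4, 8): one output vector per token
```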