SimCLR, a simple unsupervised contrastive learning framework, uses data augmentation for positive pairs, a nonlinear projection head, normalized temperature-scaled cross-entropy loss, and large batch sizes to achieve state-of-the-art results in self-supervised, semi-supervised, and transfer learning.
The Object-centric perception, prediction, and planning (OP3) framework demonstrates strong generalization to novel configurations in block stacking tasks by symmetrically processing entity representations extracted from raw visual observations.
Adversarial examples generated against an ensemble of CNNs with a retinal preprocessing layer reduce the accuracy of time-limited humans in a two-alternative forced-choice task.
Physical Scene Graphs (PSGs) are a novel hierarchical representation of visual scenes; PSGNet, a network that learns them from RGB movies, outperforms other unsupervised methods in scene segmentation.
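
To make the normalized temperature-scaled cross-entropy (NT-Xent) loss named in the SimCLR summary concrete, here is a minimal, dependency-free sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `nt_xent_loss`, the default `temperature=0.5`, and the layout in which rows 2k and 2k+1 hold the two augmented views of example k are all hypothetical choices made for this example.

```python
import math


def nt_xent_loss(embeddings, temperature=0.5):
    """Average NT-Xent loss over a batch of projected embeddings.

    Assumed layout (hypothetical): `embeddings` is a list of 2N vectors,
    where rows 2k and 2k+1 are two augmented views of the same example.
    """
    # L2-normalize every vector so dot products become cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    def scaled_similarity(i, k):
        dot = sum(a * b for a, b in zip(normed[i], normed[k]))
        return dot / temperature

    n = len(normed)
    total = 0.0
    for i in range(n):
        positive = i ^ 1  # partner index: 0<->1, 2<->3, ...
        # Logits against every other sample; self-similarity is excluded.
        logits = [scaled_similarity(i, k) for k in range(n) if k != i]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the positive pair as the target class.
        total += log_denominator - scaled_similarity(i, positive)
    return total / n


# Two examples, each with two identical views, orthogonal across examples.
aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Same vectors, but paired so that each "positive" is an orthogonal vector.
misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

loss_aligned = nt_xent_loss(aligned)
loss_misaligned = nt_xent_loss(misaligned)
```

In this toy batch the loss is lower when the paired views point in the same direction than when they are orthogonal, which is the behavior the contrastive objective rewards. Lower temperatures sharpen the softmax and penalize hard negatives more strongly.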