Forward Prediction for Physical Reasoning

Demonstrates the potential of forward prediction for solving PHYRE physical reasoning tasks by investigating various combinations of object-based and pixel-based forward-prediction models and task-solution models.
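
As a rough illustration of how a forward-prediction model can feed a task-solution step, the sketch below rolls out a toy physics model for each candidate action and picks the action whose predicted outcome is closest to a goal. Everything here is an assumption made for illustration: the `rollout` and `choose_action` functions, the horizontally launched ball, and the Euler integration are hypothetical and are not the models or the PHYRE API used in the paper.

```python
def rollout(vx, x=0.0, y=1.0, dt=0.01, g=9.81):
    """Toy forward model (hypothetical): Euler-integrate a ball launched
    horizontally with speed vx from height y and return where it lands."""
    vy = 0.0
    while y > 0.0:
        x += vx * dt   # horizontal motion at constant speed
        vy -= g * dt   # gravity accelerates the ball downward
        y += vy * dt   # vertical position update
    return x


def choose_action(candidates, goal_x):
    """Toy task-solution step (hypothetical): pick the candidate action whose
    predicted landing point is closest to the goal position."""
    return min(candidates, key=lambda vx: abs(rollout(vx) - goal_x))


if __name__ == "__main__":
    actions = [0.5 * i for i in range(11)]  # candidate launch speeds
    goal = rollout(2.0)                     # goal reachable by one candidate
    print(choose_action(actions, goal))
```

In this toy setting the forward model is exact; a learned model would introduce prediction error, and the search would rank actions by predicted rather than true outcomes.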

IntPhys 2019: A Benchmark for Visual Intuitive Physics Understanding

IntPhys provides a well-designed benchmark for evaluating a system's understanding of a few core concepts about the physics of objects.

Occlusion resistant learning of intuitive physics from videos

Combines a compositional rendering network with a recurrent interaction network to learn dynamics in scenes with significant occlusion, but relies on ground-truth object positions and segmentations.

Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases

Analysis of invariances in representations from contrastive self-supervised models reveals that they leverage aggressive cropping on object-centric datasets to improve occlusion invariance at the expense of viewpoint and category instance invariance.
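
To make "aggressive cropping" concrete, the sketch below samples pairs of random crops from the same image and measures how much they overlap. It is a simplified stand-in for a random-resized-crop augmentation, not the paper's code: the helper names, the area range of 8% to 100% of the image, and the aspect-ratio range are assumptions chosen for illustration.

```python
import math
import random


def sample_crop(rng, width, height, scale=(0.08, 1.0), ratio=(3 / 4, 4 / 3)):
    """Sample a crop box (x0, y0, x1, y1) covering a random fraction of the
    image area. The scale and ratio defaults are illustrative assumptions."""
    for _ in range(10):
        area = rng.uniform(*scale) * width * height
        aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
        w = int(round(math.sqrt(area * aspect)))
        h = int(round(math.sqrt(area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            x0 = rng.randint(0, width - w)
            y0 = rng.randint(0, height - h)
            return (x0, y0, x0 + w, y0 + h)
    # Fall back to the full image if no valid box was sampled.
    return (0, 0, width, height)


def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def mean_view_overlap(n_pairs=1000, size=224, seed=0):
    """Average IoU between two independently sampled crops of one image."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        total += iou(sample_crop(rng, size, size), sample_crop(rng, size, size))
    return total / n_pairs


if __name__ == "__main__":
    print(mean_view_overlap())
```

When the two views of an image often share only part of its content, a model trained to match them is pushed toward recognizing an object from partial views, which is one way to read the occlusion invariance described above.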

Compositional Video Prediction

Presents a method for video prediction from a single frame that decomposes the scene into entities with location and appearance features and captures ambiguities with a global latent variable.

Embodied Multimodal Multitask Learning

Proposes a multitask model that jointly learns semantic goal navigation and embodied question answering.