Self-supervised learning

Contrasting quadratic assignments for set-based representation learning

The standard approach to contrastive learning is to maximize the agreement between different views of the data. The views are ordered in pairs, such that they are either positive, encoding different views of the same object, or negative, …

How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning?

Despite the recent success of video self-supervised learning models, there is much still to be understood about their generalization capability. In this paper, we investigate how sensitive video self-supervised learning is to the current conventional …

Self-Supervised Learning of Object Parts for Semantic Segmentation

Progress in self-supervised learning has brought strong general image representation learning methods. Yet so far, it has mostly focused on image-level learning. In turn, tasks such as unsupervised image segmentation have not benefited from this …

Measuring the Interpretability of Unsupervised Representations via Quantized Reversed Probing

Self-supervised visual representation learning has recently attracted significant research interest. While a common way to evaluate self-supervised representations is through transfer to various downstream tasks, we instead investigate the problem of …

PASS: An ImageNet replacement for self-supervised pretraining without humans

Computer vision has long relied on ImageNet and other large datasets of images sampled from the Internet for pretraining models. However, these datasets have ethical and technical shortcomings, such as containing personal information taken without …

Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning

The quality of the image representations obtained from self-supervised learning depends strongly on the type of data augmentations used in the learning formulation. Recent papers have ported these methods from still images to videos and found that …

On Compositions of Transformations in Contrastive Self-Supervised Learning

In the image domain, excellent representations can be learned by inducing invariance to content-preserving transformations via noise contrastive learning. In this paper, we generalize contrastive learning to a wider set of transformations, and their …