Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features

Thalles Santos Silva, Helio Pedrini, Adín Ramírez Rivera

The Forty-first International Conference on Machine Learning (ICML 2024)


Abstract

This paper introduces a novel approach to improving the training stability of self-supervised learning (SSL) methods by leveraging a non-parametric memory of seen concepts. The proposed method augments a neural network with a memory component that stochastically compares current image views with previously encountered concepts. Additionally, we introduce stochastic memory blocks to regularize training and enforce consistency between image views. We extensively benchmark our method across many datasets on a range of vision tasks, including linear probing, transfer learning, low-shot classification, and image retrieval. The experimental results confirm the effectiveness of the proposed approach in achieving stable SSL training without additional regularizers, while learning highly transferable representations and requiring less computing time and resources.
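The core idea described above can be illustrated with a short sketch. The snippet below is a hypothetical, simplified rendition (not the authors' released code): two view embeddings are each compared against a randomly sampled block of memory entries, and a symmetric cross-entropy between the resulting similarity distributions encourages the views to agree on how they relate to past concepts. The function names, temperature, and block size are illustrative assumptions.

```python
import numpy as np

def softmax(x, tau=0.1):
    # Temperature-scaled softmax over memory similarities (tau is illustrative).
    z = x / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def memory_consistency_loss(z1, z2, memory, block_size=4, rng=None):
    """Hypothetical sketch of memory-based view consistency.

    z1, z2: (batch, dim) embeddings of two augmented views of the same images.
    memory: (num_entries, dim) non-parametric memory of past representations.
    A random block of memory entries is sampled (the stochastic memory block),
    and the two views' similarity distributions over that block are pulled
    together via a symmetric cross-entropy.
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(memory.shape[0], size=block_size, replace=False)
    block = memory[idx]                 # (block_size, dim) sampled memory block
    p1 = softmax(z1 @ block.T)          # view-1 distribution over the block
    p2 = softmax(z2 @ block.T)          # view-2 distribution over the block
    eps = 1e-12                         # numerical safety for log
    ce = p1 * np.log(p2 + eps) + p2 * np.log(p1 + eps)
    return -0.5 * np.mean(np.sum(ce, axis=-1))
```

Because each training step samples a different memory block, the comparison target varies stochastically, which is what regularizes training in the description above.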

Pre-trained models

Models pre-trained on the ImageNet-1K dataset with transformer backbones.

Model Epochs URL
MaSSL (ViT-S/16) 800 Checkpoints
MaSSL (ViT-B/16) 400 Checkpoints

Important links

Code OpenReview

Video

Reference


@inproceedings{silva2024learning,
    title={Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features},
    author={Silva, Thalles and Pedrini, Helio and Ram{\'\i}rez, Ad{\'\i}n},
    booktitle={Forty-first International Conference on Machine Learning},
    year={2024},
    url={https://openreview.net/forum?id=Ed4KgHoKNe}
}
            

Authors

Thalles Santos Silva

Helio Pedrini

Adín Ramírez Rivera

Acknowledgements

The computations were performed in part on resources provided by Sigma2---the National Infrastructure for High Performance Computing and Data Storage in Norway---through Project NN8104K. This work was funded in part by the Research Council of Norway, through its Centre for Research-based Innovation funding scheme (grant no. 309439), and Consortium Partners. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior---Brasil (CAPES)---Finance Code 001.