Traces of Image Memorability in Vision Encoders: Activations, Attention Distributions and Autoencoder Losses

Ece Takmaz, Albert Gatt, Jakub Dotlacil

Published: September 1, 2025

Abstract

Images vary in how memorable they are to humans. Inspired by findings from cognitive science and computer vision, this paper explores the correlates of image memorability in pretrained vision encoders, focusing on latent activations, attention distributions, and the uniformity of image patches. We find that these features correlate with memorability to some extent. Additionally, we explore sparse autoencoder loss over the representations of vision transformers as a proxy for memorability, which yields results outperforming past methods using convolutional neural network representations. Our results shed light on the relationship between model-internal features and memorability. They show that some features are informative predictors of what makes images memorable to humans.
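To make the idea of an autoencoder loss serving as a memorability proxy concrete, here is a minimal sketch. The abstract does not specify the autoencoder architecture, the sparsity penalty, or the evaluation metric, so everything below is an assumption: a one-layer ReLU autoencoder with an L1 penalty, random placeholder data standing in for vision transformer embeddings and human memorability scores, and Spearman rank correlation as the comparison. The names `sae_loss` and `spearman` are hypothetical and not taken from the paper.

```python
import numpy as np


def sae_loss(x, w_enc, b_enc, w_dec, b_dec, l1_coeff=1e-3):
    """Per-image sparse autoencoder loss for a batch of embeddings x of shape (n, d).

    Assumed formulation (not from the paper): mean squared reconstruction
    error plus an L1 penalty on the ReLU latent code.
    """
    z = np.maximum(x @ w_enc + b_enc, 0.0)      # latent code, shape (n, k)
    x_hat = z @ w_dec + b_dec                   # reconstruction, shape (n, d)
    recon = np.mean((x - x_hat) ** 2, axis=1)   # per-image reconstruction error
    sparsity = np.mean(np.abs(z), axis=1)       # per-image L1 term
    return recon + l1_coeff * sparsity


def spearman(a, b):
    """Spearman rank correlation; this simple version assumes no tied values."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])


# Placeholder data: random vectors stand in for image embeddings and scores.
rng = np.random.default_rng(0)
n, d, k = 8, 16, 32
x = rng.normal(size=(n, d))                  # stand-in for encoder embeddings
w_enc = rng.normal(scale=0.1, size=(d, k))   # untrained weights, illustration only
w_dec = rng.normal(scale=0.1, size=(k, d))
losses = sae_loss(x, w_enc, np.zeros(k), w_dec, np.zeros(d))
mem_scores = rng.uniform(size=n)             # stand-in for human memorability scores
rho = spearman(losses, mem_scores)
```

In an actual experiment the embeddings would come from a pretrained vision encoder and the autoencoder would be trained before its per-image loss is compared with memorability scores. The random weights and scores here only keep the sketch self-contained, so the resulting `rho` carries no meaning.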

Read the full text (arXiv.org)