Evaluating Perceptual Distance Models by Fitting Binomial Distributions to Two-Alternative Forced Choice Data
Alexander Hepburn, Raul Santos-Rodriguez, Javier Portilla
Published: 2024/3/15
Abstract
The Two-Alternative Forced Choice (2AFC) paradigm offers advantages over the Mean Opinion Score (MOS) paradigm in psychophysics (PF), such as simplicity and robustness. However, when evaluating perceptual distance models, MOS enables direct correlation between model predictions and PF data. In contrast, 2AFC pairwise comparisons can be converted into a quality ranking similar to MOS only when the comparisons share images. In large datasets such as BAPPS, where image patches and distortions are combined randomly, deriving rankings from 2AFC PF data becomes infeasible, because the distorted images included in each comparison are independent. To address this, instead of relying on MOS correlation, researchers have trained ad-hoc neural networks to reproduce 2AFC PF data from pairs of model distances, a black-box approach with conceptual and operational limitations. This paper introduces a more robust distance-model evaluation method using a purely probabilistic approach, applying maximum likelihood estimation to a binomial decision model. Our method offers greater simplicity, interpretability, flexibility, and computational efficiency, as shown through evaluations of various visual distance models on two 2AFC PF datasets.
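To make the idea of fitting a binomial decision model by maximum likelihood concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the decision model, so the logistic link on the log-ratio of the two model distances, the single `slope` parameter, and the names `neg_log_likelihood` and `fit_slope` are all assumptions made for illustration. For each comparison, `d0` and `d1` are the model distances from the reference to the two distorted images, `n` is the number of observers, and `k` is the number who judged image 1 closer to the reference.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def neg_log_likelihood(slope, d0, d1, k, n):
    """Negative binomial log-likelihood, up to a constant in `slope`."""
    # Assumed link (not taken from the paper): the probability that an
    # observer judges image 1 closer to the reference is a logistic
    # function of the log-ratio of the two model distances.
    z = slope * (np.log(d0) - np.log(d1))
    log_p = -np.logaddexp(0.0, -z)  # log sigmoid(z), numerically stable
    log_q = -np.logaddexp(0.0, z)   # log (1 - sigmoid(z))
    # The binomial coefficient does not depend on `slope`, so it is omitted.
    return -np.sum(k * log_p + (n - k) * log_q)


def fit_slope(d0, d1, k, n, bounds=(1e-3, 50.0)):
    """Return (slope estimate, maximized log-likelihood up to a constant)."""
    d0, d1, k, n = (np.asarray(a, dtype=float) for a in (d0, d1, k, n))
    res = minimize_scalar(
        neg_log_likelihood,
        bounds=bounds,          # hypothetical search range for the slope
        method="bounded",
        args=(d0, d1, k, n),
    )
    return float(res.x), float(-res.fun)
```

Under these assumptions, the maximized log-likelihood returned by `fit_slope` could serve as a score for comparing distance models on the same 2AFC data: a model whose distances explain the observed choices better attains a higher value. Only one scalar parameter is fitted per model, in contrast to training a neural network on pairs of distances.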