Radio Galaxy Zoo: Morphological classification by Fanaroff-Riley designation using self-supervised pre-training

Nutthawara Buatthaisong, Inigo Val Slijepcevic, Anna M. M. Scaife, Micah Bowles, Andrew Hopkins, Devina Mohan, Stanislav S. Shabala, O. Ivy Wong

Published: 2025/9/15

Abstract

In this study, we examine over 14,000 radio galaxies carefully selected from the Radio Galaxy Zoo (RGZ) project and provide classifications for approximately 5,900 FRIs and 8,100 FRIIs. We present an analysis of these predicted radio galaxy morphologies for the RGZ catalogue, classified using a pre-trained radio galaxy foundation model that has been fine-tuned to predict Fanaroff-Riley (FR) morphology. As seen in previous studies, our results show overlap between the luminosity-size distributions of morphologically classified FRI and FRII sources, and we find that the model's confidence in its predictions is lowest in this overlap region, suggesting that source morphologies there are more ambiguous. We identify the presence of low-luminosity FRII sources, the proportion of which, with respect to the total number of FRIIs, is consistent with previous studies. However, a comparison of the low-luminosity FRII sources found in this work with those identified by previous studies reveals differences that may indicate their selection is influenced by the choice of classification methodology. We investigate the impacts of both pre-training and fine-tuning data selection on model performance for the downstream classification task, and show that while different pre-training data choices affect model confidence, they do not appear to cause systematic generalisation biases for the range of physical and observational characteristics considered in this work; however, we note that the same is not necessarily true for fine-tuning. As automated approaches to astronomical source identification and classification become increasingly prevalent, we highlight training data choices that can affect model outputs and propagate into downstream analyses.