Beyond Synthetic Replays: Turning Diffusion Features into Few-Shot Class-Incremental Learning Knowledge

Junsu Kim, Yunhoe Ku, Dongyoon Han, Seungryul Baek

公開日: 2025/3/30

Abstract

Few-shot class-incremental learning (FSCIL) is challenging due to extremely limited training data while requiring models to acquire new knowledge without catastrophic forgetting. Recent works have explored generative models, particularly Stable Diffusion (SD), to address these challenges. However, existing approaches use SD mainly as a replay generator, whereas we demonstrate that SD's rich multi-scale representations can serve as a unified backbone. Motivated by this observation, we introduce Diffusion-FSCIL, which extracts four synergistic feature types from SD by capturing real image characteristics through inversion, providing semantic diversity via class-conditioned synthesis, enhancing generalization through controlled noise injection, and enabling replay without image storage through generative features. Unlike conventional approaches requiring synthetic buffers and separate classification backbones, our unified framework operates entirely in the latent space with only lightweight networks ($\approx$6M parameters). Extensive experiments on CUB-200, miniImageNet, and CIFAR-100 demonstrate state-of-the-art performance, with comprehensive ablations confirming the necessity of each feature type. Furthermore, we confirm that our streamlined variant maintains competitive accuracy while substantially improving efficiency, establishing the viability of generative models as practical and effective backbones for FSCIL.

Beyond Synthetic Replays: Turning Diffusion Features into Few-Shot Class-Incremental Learning Knowledge | SummarXiv | SummarXiv