SPACE-iT: Spatial-Aware Curriculum Exploration and Feedback-Driven Adaptive Augmentation for Vision Transformer Distillation

Jihyeon Seong, Hyunkyung Han

公開日: 2025/6/12

Abstract

Knowledge distillation (KD) has proven to be a powerful technique for improving the performance of Vision Transformers (ViTs). However, traditional KD methods often treat all image patches uniformly, overlooking spatial variations in learning difficulty. To address this limitation, we propose SPACE-iT, a novel framework for Spatial-Aware Curriculum Exploration via Feedback-Driven Adaptive Augmentation. At its core, SPACE-iT computes spatial confidence scores at the attention, patch, and logit levels. This confidence map supports a two-fold strategy: (1) dynamically modulating the distillation loss, and (2) guiding an adaptive augmentation module that intensifies reverse curriculum learning. By establishing a feedback-driven reverse curriculum that initially exposes students to challenging regions-progressing from hard to easy-SPACE-iT enables more effective learning of complex spatial patterns and achieves superior performance over vanilla distillation, without introducing additional memory overhead.

全文を読む (arXiv.org)