Validation-Free Sparse Learning: A Phase Transition Approach to Feature Selection

Sylvain Sardy, Maxime van Cutsem, Xiaoyu Ma

Published: 2024/11/26

Abstract

The growing environmental footprint of artificial intelligence (AI), especially in terms of storage and computation, calls for more frugal and interpretable models. Sparse models (e.g., linear models, neural networks) offer a promising solution by selecting only the most relevant features, reducing complexity, preventing over-fitting, and enabling interpretation, a step towards truly intelligent AI. The notion of a right amount of sparsity (neither too many false positives nor too few true positives) is subjective, so we propose a new paradigm previously observed and mathematically studied only for compressed sensing (noiseless linear models): obtaining a phase transition in the probability of retrieving the relevant features. We show in practice how to obtain this phase transition for a class of sparse learners. Our approach is flexible and applicable to complex models ranging from linear models to shallow and deep artificial neural networks, while supporting various loss functions and sparsity-promoting penalties. It does not rely on cross-validation or on a validation set to select its single regularization parameter. For real-world data, it provides a good balance between predictive accuracy and feature sparsity. A Python package is available at https://github.com/VcMaxouuu/HarderLASSO containing all the simulations and ready-to-use models.
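To make the validation-free idea concrete, below is a minimal illustrative sketch in which the single regularization parameter of a lasso-type estimator is fixed in advance by a universal-threshold-style rule rather than tuned by cross-validation or on a validation set. This sketch uses scikit-learn's Lasso and assumes a known noise level sigma; it is only an illustration of the principle, not the authors' HarderLASSO implementation, whose actual API may differ.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Illustrative sketch of validation-free sparse learning: the single
# regularization parameter is fixed from theory (a universal-threshold-style
# rule, assuming a known noise level sigma) instead of being tuned by
# cross-validation. This is NOT the authors' HarderLASSO package.

rng = np.random.default_rng(0)
n, p, s = 200, 500, 5                       # samples, features, true support size
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 3.0                              # s relevant features, the rest are noise
sigma = 1.0
y = X @ beta + sigma * rng.standard_normal(n)

# Universal-threshold-style lambda. scikit-learn's Lasso minimizes
# (1/(2n))||y - Xw||^2 + alpha*||w||_1, hence the 1/sqrt(n) scaling.
lam = sigma * np.sqrt(2.0 * np.log(p) / n)

model = Lasso(alpha=lam, fit_intercept=False).fit(X, y)
selected = np.flatnonzero(model.coef_)
print("selected features:", selected)       # ideally the first s indices
```

In a phase-transition experiment of the kind described in the abstract, one would repeat such a fit over a grid of sparsity levels s and signal strengths, recording the empirical probability of exact support recovery.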