Gradient-Free Training of Quantized Neural Networks

Noa Cohen, Omkar Joglekar, Dotan Di Castro, Vladimir Tchuiev, Shir Kozlovsky, Michal Moshkovitz

公開日: 2024/10/13

Abstract

Training neural networks requires significant computational resources and energy. Methods like mixed-precision and quantization-aware training reduce bit usage, yet they still depend heavily on computationally expensive gradient-based optimization. In this work, we propose a paradigm shift: eliminate gradients altogether. One might hope that, in a finite quantized space, finding optimal weights with out gradients would be easier but we theoretically prove that this problem is NP-hard even in simple settings where the continuous case is efficiently solvable. To address this, we introduce a novel heuristic optimization framework that avoids full weight updates and significantly improves efficiency. Empirically, our method achieves performance comparable to that of full-precision gradient-based training on standard datasets and architectures, while using up to 3x less energy and requiring up to 5x fewer parameter updates.

全文を読む (arXiv.org)