LEMUR Neural Network Dataset: Towards Seamless AutoML
Arash Torabi Goodarzi, Roman Kochnev, Waleed Khalid, Hojjat Torabi Goudarzi, Furui Qin, Tolgay Atinc Uzun, Yashkumar Sanjaybhai Dhameliya, Yash Kanubhai Kathiriya, Zofia Antonina Bentyn, Dmitry Ignatov, Radu Timofte
公開日: 2025/4/14
Abstract
Neural networks have become the backbone of modern AI, yet designing, evaluating, and comparing them remains labor-intensive. While many datasets exist for training models, there are few standardized collections of the models themselves. We present LEMUR, an open-source dataset and framework that brings together a large collection of PyTorch-based neural networks across tasks such as classification, segmentation, detection, and natural language processing. Each model follows a common template, with configurations and results logged in a structured database to ensure consistency and reproducibility. LEMUR integrates Optuna for automated hyperparameter optimization, provides statistical analysis and visualization tools, and exposes an API for seamless access to performance data. The framework also supports extensibility, enabling researchers to add new models, datasets, or metrics without breaking compatibility. By standardizing implementations and unifying evaluation, LEMUR aims to accelerate AutoML research, facilitate fair benchmarking, and lower the barrier to large-scale neural network experimentation.