Deep Synthetic Cross-Project Approaches for Software Reliability Growth Modeling

Taehyoun Kim, Duksan Ryu, Jongmoon Baik

Published: 2025/9/21

Abstract

Software Reliability Growth Models (SRGMs) are widely used to predict software reliability based on defect discovery data collected during testing or operational phases. However, their predictive accuracy often degrades in data-scarce environments, such as early-stage testing or safety-critical systems. Although cross-project transfer learning has been explored to mitigate this issue by leveraging data from past projects, its applicability remains limited due to the scarcity and confidentiality of real-world datasets. To overcome these limitations, we propose Deep Synthetic Cross-project SRGM (DSC-SRGM), a novel approach that integrates synthetic data generation with cross-project transfer learning. Synthetic datasets are generated using traditional SRGMs to preserve the statistical characteristics of real-world defect discovery trends. A cross-correlation-based clustering method is applied to identify synthetic datasets with patterns similar to the target project. These datasets are then used to train a deep learning model for reliability prediction. The proposed method is evaluated on 60 real-world datasets, and its performance is compared with both traditional SRGMs and cross-project deep learning models trained on real-world datasets. DSC-SRGM achieves up to 23.3% improvement in predictive accuracy over traditional SRGMs and 32.2% over cross-project deep learning models trained on real-world datasets. However, excessive use of synthetic data or a naive combination of synthetic and real-world data may degrade prediction performance, highlighting the importance of maintaining an appropriate data balance. These findings indicate that DSC-SRGM is a promising approach for software reliability prediction in data-scarce environments.
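The data-generation and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single traditional SRGM (the Goel-Okumoto model, with mean value function m(t) = a(1 - e^{-bt})), hypothetical parameter ranges, Poisson-distributed defect counts per interval, and a simple ranking by peak normalized cross-correlation in place of the clustering method the abstract mentions. All function names are illustrative.

```python
import math
import random


def goel_okumoto(t, a, b):
    """Goel-Okumoto mean value function: expected cumulative defects by time t."""
    return a * (1.0 - math.exp(-b * t))


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the modest means used here."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def make_synthetic(n_series, n_steps, rng):
    """Generate cumulative defect curves from randomly drawn (a, b) pairs."""
    series = []
    for _ in range(n_series):
        a = rng.uniform(50, 500)    # assumed range: total expected defects
        b = rng.uniform(0.01, 0.3)  # assumed range: defect detection rate
        cumulative, total = [], 0
        for step in range(1, n_steps + 1):
            # Expected number of defects found in the interval (step-1, step].
            lam = goel_okumoto(step, a, b) - goel_okumoto(step - 1, a, b)
            total += sample_poisson(lam, rng)
            cumulative.append(total)
        series.append(cumulative)
    return series


def zscore(xs):
    """Standardize a series to zero mean and unit (population) variance."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / (std + 1e-12) for x in xs]


def peak_cross_correlation(x, y):
    """Largest normalized cross-correlation over all lags (equal-length inputs)."""
    zx, zy = zscore(x), zscore(y)
    n = len(zx)
    best = -math.inf
    for lag in range(-(n - 1), n):
        total = sum(zx[i] * zy[i + lag] for i in range(n) if 0 <= i + lag < n)
        best = max(best, total / n)
    return best


def select_similar(target, candidates, k):
    """Return (index, score) for the k candidates most similar to the target."""
    scores = [peak_cross_correlation(target, c) for c in candidates]
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:k]]
```

In this sketch, the series returned by `select_similar` would stand in for the synthetic datasets used to train the deep learning model; the choice of `k` corresponds loosely to the data balance the abstract warns about, since selecting too many synthetic series may degrade prediction performance.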
