Data-Driven Prediction of Maternal Nutritional Status in Ethiopia Using Ensemble Machine Learning Models
Amsalu Tessema, Tizazu Bayih, Kassahun Azezew, Ayenew Kassie
公開日: 2025/9/18
Abstract
Malnutrition among pregnant women is a major public health challenge in Ethiopia, increasing the risk of adverse maternal and neonatal outcomes. Traditional statistical approaches often fail to capture the complex and multidimensional determinants of nutritional status. This study develops a predictive model using ensemble machine learning techniques, leveraging data from the Ethiopian Demographic and Health Survey (2005-2020), comprising 18,108 records with 30 socio-demographic and health attributes. Data preprocessing included handling missing values, normalization, and balancing with SMOTE, followed by feature selection to identify key predictors. Several supervised ensemble algorithms including XGBoost, Random Forest, CatBoost, and AdaBoost were applied to classify nutritional status. Among them, the Random Forest model achieved the best performance, classifying women into four categories (normal, moderate malnutrition, severe malnutrition, and overnutrition) with 97.87% accuracy, 97.88% precision, 97.87% recall, 97.87% F1-score, and 99.86% ROC AUC. These findings demonstrate the effectiveness of ensemble learning in capturing hidden patterns from complex datasets and provide timely insights for early detection of nutritional risks. The results offer practical implications for healthcare providers, policymakers, and researchers, supporting data-driven strategies to improve maternal nutrition and health outcomes in Ethiopia.