Identifying the post-pandemic determinants of low performing students in Latin America through interpretable Machine Learning SHAP Values: Insights from PISA 2022
Marcos Delprato
Published: 2025/9/29
Abstract
The high prevalence of students not achieving basic competencies in Latin America is concerning, all the more so given the region's deep structural inequalities and its larger post-pandemic learning losses. Against this background, this paper contributes to the identification of the determinants of bottom and low performers (below level 2) using recent advances in explainable machine learning methods. In particular, relying on PISA 2022 data for 10 countries and using Shapley Additive Explanations (SHAP) analysis, I identify critical factors affecting student performance across low-performer groups. I find that the student with the highest probability of being a non-achiever speaks a minority language and has repeated a grade, has no digital devices at home, comes from a poor family and works for payment half of the week, and attends a school with broad disadvantages such as a bad school climate, weak ICT infrastructure and poor teaching quality (only a third of teachers being certified). Regarding country-level estimates, I find quite homogeneous patterns in the global average contribution of the top-ranked factors, with repetition at primary, household wealth, and educational ICT inputs among the ten top-ranked covariates in at least 8 of the 10 countries. The paper's findings contribute to the broad literature on strategies to identify and target those most left behind in Latin American education systems.
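To make the attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values for a single student under a toy risk score. It is not the paper's model, data, or estimation pipeline: the function `toy_risk`, the feature names, and all coefficients are hypothetical and chosen only for illustration. Applied work on large feature sets typically relies on approximations rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline instance."""
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                # Coalition members take the student's values; all other
                # features are held at their baseline values.
                without = {m: (x[m] if m in subset else baseline[m]) for m in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


def toy_risk(s):
    """Hypothetical risk score with one interaction term (illustrative only)."""
    score = 0.10
    score += 0.30 * s["repeated_grade"]
    score += 0.20 * s["no_device_at_home"]
    score += 0.10 * s["minority_language"]
    score += 0.15 * s["repeated_grade"] * s["no_device_at_home"]
    return score


# Hypothetical student with all three risk factors, compared against a
# baseline student with none of them.
student = {"repeated_grade": 1, "no_device_at_home": 1, "minority_language": 1}
baseline = {"repeated_grade": 0, "no_device_at_home": 0, "minority_language": 0}

phi = shapley_values(toy_risk, student, baseline)
for feature, value in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {value:.3f}")
```

The attributions sum to the gap between the student's score and the baseline score (the efficiency property of Shapley values), and the interaction term is split equally between the two features involved. Averaging the absolute attributions over many students gives the kind of global ranking of factors the abstract refers to.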