A review of topological data analysis and topological deep learning in molecular sciences
JunJie Wee, Jian Jiang
Published: 2025/9/21
Abstract
Topological Data Analysis (TDA) has emerged as a powerful framework for extracting robust, multiscale, and interpretable features from complex molecular data for artificial intelligence (AI) modeling and topological deep learning (TDL). This review provides a comprehensive overview of the development, methodologies, and applications of TDA in molecular sciences. We trace the evolution of TDA from early qualitative tools to advanced quantitative and predictive models, highlighting innovations such as persistent homology, persistent Laplacians, and topological machine learning. We examine TDA's transformative impact across diverse domains, including biomolecular stability, protein-ligand interactions, drug discovery, materials science, and viral evolution. Special attention is given to recent advances in integrating TDA with machine learning and AI, which have enabled breakthroughs in protein engineering, solubility and toxicity prediction, and the discovery of novel materials and therapeutics. We also discuss the limitations of current TDA approaches and outline future directions, including the integration of TDA with advanced AI models and the development of new topological invariants. This review aims to serve as a foundational reference for researchers seeking to harness the power of topology in molecular sciences.