Predicting cell-specific gene expression profile and knockout impact through deep learning

Yongjian He, Vered Klein, Orr Levy, Xu-Wen Wang

Published: 2025/10/3

Abstract

Gene expression data is essential for understanding how genes are regulated and interact within biological systems, providing insights into disease pathways and potential therapeutic targets. Gene knockout is a fundamental technique in molecular biology that allows the function of specific genes to be investigated in an organism as well as in specific cell types. However, even in a uniform environment, gene expression patterns in single-cell transcriptional data are highly heterogeneous, reflecting different cell states, so the impact of a gene knockout differs by cell type and by individual cell. A computational method that can predict knockout impact at single-cell resolution is still lacking. Here, we present a data-driven framework for learning the mapping between gene expression profiles derived from gene assemblages, enabling the accurate prediction of perturbed expression profiles following knockout (KO) for any cell, without relying on prior perturbed data. We systematically validated our framework using synthetic data generated from gene regulatory dynamics models, two mouse knockout single-cell datasets, and high-throughput in vitro CRISPRi Perturb-seq data. Our results demonstrate that the framework can accurately predict both expression profiles and KO effects at the single-cell level. Our approach provides a generalizable tool for inferring gene function at single-cell resolution, offering new opportunities to study genetic perturbations in contexts where large-scale experimental screens are infeasible.
