Accelerating Crystal Structure Prediction Using Data-Derived Potentials: High-Pressure Binary Hydrides

Lewis J. Conway, Chris J. Pickard

Published: 2025/9/27

Abstract

Crystal structures can be predicted from first principles using ab initio random structure searching (AIRSS) and density functional theory (DFT). AIRSS provides a method to sample the potential energy landscape, and DFT provides a robust and accurate description of that landscape. Classical interatomic potentials can describe energy landscapes at a significantly lower computational cost, typically at the expense of robustness and accuracy. Modern machine-learning interatomic potentials (MLIPs) offer a compromise, with greater robustness and accuracy than classical potentials at a fraction of the computational cost of DFT. In this work, we use Ephemeral Data-Derived Potentials (EDDPs) to perform accelerated AIRSS calculations for the binary hydrides at 100 GPa. Since the training data is generated iteratively using AIRSS, the searches can be performed with no prior knowledge of hydrides. These potentials allow for more diverse searches, sampling a wider range of compositions, larger unit cells, and orders of magnitude more structures. In addition to recovering many of the known structures, the searches reveal structures such as the hydrogen-rich phases of H$_{22}$(BrH), H$_{23}$Pb, and H$_{32}$Mg, supermolecular phases of H$_{25}$Cs and H$_{26}$Rn, and many substoichiometric variants of known hydrides. Our results indicate that using the current generation of pretrained universal MLIPs to search for novel high-pressure hydrides is less effective, owing to model instabilities or markedly slower inference speeds, and highlight the necessity of generating new, targeted data to drive further discoveries.