RAVE: End-to-end Hierarchical Visual Localization with Rasterized and Vectorized HD map
Jinyu Miao, Tuopu Wen, Kun Jiang, Kangan Qian, Zheng Fu, Yunlong Wang, Zhihuang Zhang, Mengmeng Yang, Jin Huang, Zhihua Zhong, Diange Yang
Published: 2025/3/2
Abstract
Accurate localization is an important component of autonomous driving systems. Traditional rule-based localization involves many standalone modules, which makes it theoretically fragile and requires costly hyperparameter tuning, thereby sacrificing accuracy and generalization. In this paper, we propose an end-to-end visual localization system, RAVE, in which the surrounding images are associated with HD map data to estimate the pose. To ensure high-quality observations for localization, a low-rank flow-based prior fusion module (FLORA) is developed to incorporate the misaligned map prior into the perceived BEV features. To balance efficiency, interpretability, and accuracy, a hierarchical localization module is proposed, which first estimates the pose efficiently through a decoupled BEV neural matching-based pose solver (DEMA) using the rasterized HD map, and then refines the estimate through a Transformer-based pose regressor (POET) using the vectorized HD map. Experimental results demonstrate that our method performs robust and accurate localization under varying environmental conditions while running efficiently.
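
To make the idea of a decoupled matching-based solver concrete, here is a minimal toy sketch. It is not the DEMA module itself: it assumes that "decoupled" can be read as searching each pose component separately rather than jointly, it uses hand-made binary grids in place of learned BEV features, and it ignores yaw. The function names (`match_score`, `decoupled_offset_search`) and the `max_shift` parameter are hypothetical.

```python
import numpy as np

def match_score(bev, raster):
    """Overlap between perceived BEV and rasterized map (toy similarity)."""
    return float(np.sum(bev * raster))

def decoupled_offset_search(bev, raster, max_shift=5):
    """Estimate the (dx, dy) cell shift aligning `raster` to `bev`.

    Assumption: each axis is searched on its own, so the cost is
    2 * (2 * max_shift + 1) evaluations instead of (2 * max_shift + 1) ** 2
    for a joint grid search.
    """
    shifts = range(-max_shift, max_shift + 1)
    # Lateral search: lane lines constrain the column shift regardless of dy.
    best_dx = max(shifts, key=lambda dx: match_score(bev, np.roll(raster, dx, axis=1)))
    raster_x = np.roll(raster, best_dx, axis=1)
    # Longitudinal search: the stop line constrains the row shift.
    best_dy = max(shifts, key=lambda dy: match_score(bev, np.roll(raster_x, dy, axis=0)))
    return best_dx, best_dy

# Toy rasterized map: two lane lines (columns) and one stop line (row).
raster = np.zeros((32, 32))
raster[:, 10] = 1.0
raster[:, 20] = 1.0
raster[8, :] = 1.0

# Simulated perception: the same scene seen from a pose offset by (3, -2) cells.
bev = np.roll(np.roll(raster, 3, axis=1), -2, axis=0)

dx, dy = decoupled_offset_search(bev, raster)
print(dx, dy)  # 3 -2
```

In this toy setting the per-axis search recovers the offset because lane lines and the stop line each constrain one axis independently. A coarse estimate of this kind could then be handed to a learned refinement stage, which the abstract attributes to POET and which is not sketched here.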