BERTector: An Intrusion Detection Framework Constructed via Joint-dataset Learning Based on Language Model

Haoyang Hu, Xun Huang, Chenyu Wu, Shiwen Liu, Zhichao Lian, Shuangquan Zhang

Published: 2025/8/14

Abstract

Intrusion detection systems (IDS) are widely used to maintain the stability of network environments, but still face restrictions in generalizability due to the heterogeneity of network traffics. In this work, we propose BERTector, a new framework of joint-dataset learning for IDS based on BERT. BERTector integrates three key components: NSS-Tokenizer for traffic-aware semantic tokenization, supervised fine-tuning with a hybrid dataset, and low-rank adaptation for efficient fine-tuning. Experiments show that BERTector achieves state-of-the-art detection accuracy, strong generalizability, and excellent robustness. BERTector achieves the highest accuracy of 99.28% on NSL-KDD and reaches the average 80% detection success rate against four perturbations. These results establish a unified and efficient solution for modern IDS in complex and dynamic network environments.

Read Full Paper (arXiv.org)