Smart Contract Intent Detection with Pre-trained Programming Language Model
Youwei Huang, Jianwen Li, Sen Fang, Yao Li, Peng Yang, Bin Hu
Published: 2025/8/27
Abstract
Malicious developer intents embedded in smart contracts constitute a significant security threat to decentralized applications (DApps), leading to substantial economic losses. To address this, SmartIntentNN was previously introduced as a deep learning model for detecting unsafe developer intents. It integrates the Universal Sentence Encoder, K-means clustering-based intent highlighting, and a Bidirectional Long Short-Term Memory (BiLSTM) network for multi-label classification, achieving an F1 score of 0.8633. In this study, we present an enhanced version of this model, SmartIntentNN2 (Smart Contract Intent Neural Network V2). The primary enhancement is the integration of a BERT-based pre-trained programming language model, which we domain-adaptively pre-train on a dataset of 16,000 real-world smart contracts using a Masked Language Modeling (MLM) objective. SmartIntentNN2 retains the BiLSTM-based multi-label classification network for the downstream task. Experimental results demonstrate that SmartIntentNN2 achieves superior overall performance, with an accuracy of 0.9789, precision of 0.9090, recall of 0.9476, and an F1 score of 0.9279, substantially outperforming its predecessor and other baseline models. Notably, SmartIntentNN2 also shows significant advantages over large language models (LLMs), achieving a 65.5% relative improvement in F1 score over GPT-4.1 on this specialized task. These results establish SmartIntentNN2 as the new state-of-the-art model for smart contract intent detection.
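The MLM objective mentioned above can be illustrated concretely. The sketch below is a minimal, hypothetical illustration of BERT-style input masking (assuming the standard 80/10/10 scheme: of the ~15% of selected positions, 80% become [MASK], 10% a random vocabulary token, 10% stay unchanged); it is not the paper's implementation, and the function and token names are assumptions for illustration only.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical mask token name, per BERT convention

def mlm_mask(tokens, vocab, mask_prob=0.15, seed=None):
    """Apply BERT-style MLM masking to a token sequence.

    Roughly `mask_prob` of positions are selected as prediction targets.
    Of those: 80% are replaced by [MASK], 10% by a random vocab token,
    10% are left unchanged. Returns (masked_tokens, labels), where
    labels[i] holds the original token at selected positions, else None.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok  # this position becomes a prediction target
            r = rng.random()
            if r < 0.8:
                masked[i] = MASK_TOKEN
            elif r < 0.9:
                masked[i] = rng.choice(vocab)
            # else: keep the original token (the 10% "unchanged" case)
    return masked, labels

# Example on a toy Solidity-like token stream (tokens are illustrative):
code_tokens = ["function", "transfer", "(", "uint256", "amount", ")",
               "public", "{", "require", "(", "owner", ")", "}"]
masked, labels = mlm_mask(code_tokens, vocab=code_tokens, seed=0)
```

During domain-adaptive pre-training, the model would be trained to recover `labels` at the selected positions from the `masked` sequence; unselected positions contribute no loss.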