Scalable Thread-Safety Analysis of Java Classes with CodeQL

Bjørnar Haugstad Jåtten, Simon Boye Jørgensen, Rasmus Petersen, Raúl Pardo

Published: 2025/9/2

Abstract

In object-oriented languages, software developers rely on thread-safe classes to implement concurrent applications. However, determining whether a class is thread-safe is a challenging task. This paper presents a highly scalable method to analyze thread-safety in Java classes. We provide a definition of thread-safety for Java classes founded on the correctness principle of the Java memory model: data race freedom. We devise a set of properties for Java classes that are proven to ensure thread-safety. We encode these properties in the static analysis tool CodeQL to automatically analyze Java source code. We perform an evaluation on the top 1000 GitHub repositories. The evaluation comprises 3,632,865 Java classes, of which 1,992 classes from 71 repositories are annotated as @ThreadSafe. These repositories include highly popular software such as Apache Flink (24.6k stars), Facebook Fresco (17.1k stars), PrestoDB (16.2k stars), and gRPC (11.6k stars). Our queries detected thousands of thread-safety errors. The running time of our queries is below 2 minutes for repositories of up to 200k lines of code, 20k methods, 6,000 fields, and 1,200 classes. We have submitted a selection of the detected concurrency errors as pull requests (PRs), and developers reacted positively to them. We have also submitted our CodeQL queries to the main CodeQL repository, and they are currently in the process of becoming available as part of GitHub Actions. The results demonstrate the applicability and scalability of our method to analyze thread-safety in real-world code bases.