A Systematic Review of FAIR-compliant Big Data Software Reference Architectures
João Pedro de Carvalho Castro, Maria Júlia Soares De Grandi, Cristina Dutra de Aguiar
Published: 2025/9/17
Abstract
To meet the standards of the Open Science movement, the FAIR Principles emphasize the importance of making scientific data Findable, Accessible, Interoperable, and Reusable. Yet, creating a repository that adheres to these principles presents significant challenges. Managing large volumes of diverse research data and metadata, often generated rapidly, requires a precise approach. This necessity has led to the development of Software Reference Architectures (SRAs) to guide the implementation process for FAIR-compliant repositories. This article conducts a systematic review of research efforts focused on architectural solutions for such repositories. We detail our methodology, covering all activities undertaken in the planning and execution phases of the review. We analyze 323 references from reputable sources and expert recommendations, identifying 7 studies on general-purpose big data SRAs, 13 pipelines implementing FAIR Principles in specific contexts, and 3 FAIR-compliant big data SRAs. We provide a thorough description of their key features and assess whether the research questions posed in the planning phase were adequately addressed. Additionally, we discuss the limitations of the retrieved studies and identify tendencies and opportunities for further research.