Sanitization of Multimedia Content: A Survey of Techniques, Attacks, and Future Directions

Andrea Ciccotelli, Hanaa Abbas, Roberto Di Pietro

Published: 2022/7/5

Abstract

The rapidly growing rate of data publishing in our networked society has magnified the risk of sensitive information leakage and misuse, increasing the need to secure multimedia content from unintended exposure to potentially untrusted third parties. Data sanitization -- the process of securing multimedia by removing or obfuscating sensitive information such as personally identifiable or confidential data -- helps to mitigate the severe impact of security risks and privacy violations related to published data. In this paper, we make several contributions. First, we classify data sanitization methods along two main dimensions: the media type (images, audio, text, and video) and the techniques used to sanitize sensitive regions, which we group into obfuscation-based (e.g., distortion, replacement) and removal-based approaches. Building on this categorization, we present a comprehensive review of technologies designed to protect multimedia content. We then broaden the scope by introducing the attacks that specifically target these technologies, followed by a discussion of potential countermeasures. Each aspect is complemented by critical discussion and lessons learned. Finally, we identify and elaborate on open research challenges in the crucial domain of multimodal multimedia sanitization. We argue that the systematization provided in this work -- together with the highlighted challenges and research directions -- offers a valuable blueprint for practitioners in industry and academia alike, while paving the way for novel research avenues in the field.
