

Poster

EntropyMark: Towards More Harmless Backdoor Watermark via Entropy-based Constraint for Open-source Dataset Copyright Protection

Ming Sun · Rui Wang · Zixuan Zhu · Lihua Jing · Yuanfang Guo


Abstract:

High-quality open-source datasets are essential for advancing deep neural networks. However, the unauthorized commercial use of these datasets has raised significant concerns about copyright protection. One promising approach is backdoor watermark-based dataset ownership verification (BW-DOV), in which dataset protectors implant specific backdoors into illicit models through dataset watermarking, enabling these models to be traced via their abnormal prediction behaviors. Unfortunately, the targeted nature of these BW-DOV methods can be maliciously exploited, potentially leading to harmful side effects. While existing harmless methods attempt to mitigate these risks, watermarked datasets can still negatively affect prediction results, partially compromising dataset functionality. In this paper, we propose a more harmless backdoor watermark, called EntropyMark, which improves prediction confidence without altering the final prediction results. To this end, an entropy-based constraint is introduced to regulate the probability distribution. Specifically, we design an iterative clean-label dataset watermarking framework that employs gradient matching and adaptive data selection to optimize backdoor injection. In parallel, we introduce a hypothesis testing method grounded in entropy inconsistency to verify dataset ownership. Extensive experiments on benchmark datasets demonstrate the effectiveness, transferability, and defense resistance of our approach.
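The two technical ingredients named in the abstract, an entropy-based constraint on the output distribution and a hypothesis test on entropy inconsistency, can be illustrated with a minimal sketch. The snippet below assumes a PyTorch classifier that outputs logits and, at verification time, paired arrays of per-sample entropies on watermarked and benign inputs; the function names (`prediction_entropy`, `entropy_constraint_loss`, `verify_ownership`) and the choice of a one-sided paired t-test are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F
from scipy import stats


def prediction_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy of the softmax distribution, computed per sample."""
    log_probs = F.log_softmax(logits, dim=-1)
    return -(log_probs.exp() * log_probs).sum(dim=-1)


def entropy_constraint_loss(logits_watermarked: torch.Tensor) -> torch.Tensor:
    """Illustrative constraint (assumed form): push watermarked samples toward
    low-entropy, high-confidence predictions without touching the argmax class."""
    return prediction_entropy(logits_watermarked).mean()


def verify_ownership(entropy_watermarked, entropy_benign, alpha=0.05) -> bool:
    """Toy verification (assumed test): one-sided paired t-test asking whether
    entropy on watermarked inputs is significantly lower than on benign inputs."""
    _, p_value = stats.ttest_rel(
        entropy_watermarked, entropy_benign, alternative="less"
    )
    return p_value < alpha


# Example: compute the entropy penalty on a random batch of logits.
logits = torch.randn(8, 10)  # 8 samples, 10 classes
loss = entropy_constraint_loss(logits)
```

This only shows the general shape of an entropy-regularized objective and an entropy-based ownership test; the paper's iterative framework, gradient matching, and adaptive data selection are not reproduced here.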
