Skip to yearly menu bar Skip to main content


Efficient Model Stealing Defense with Noise Transition Matrix

Dong-Dong Wu · Chilin Fu · Weichang Wu · Wenwen Xia · Xiaolu Zhang · JUN ZHOU · Min-Ling Zhang

Arch 4A-E Poster #8
[ ]
Fri 21 Jun 5 p.m. PDT — 6:30 p.m. PDT


With the escalating complexity and investment cost of training deep neural networks, safeguarding them from unauthorized usage and intellectual property theft has become imperative. Especially the rampant misuse of prediction APIs to replicate models without access to the original data or architecture poses grave security threats. Diverse defense strategies have emerged to address these vulnerabilities, yet these defenses either incur heavy inference overheads or assume idealized attack scenarios. To address these challenges, we revisit the utilization of noise transition matrix as an efficient perturbation technique, which injects noise into predicted posteriors in a linear manner and integrates seamlessly into existing systems with minimal overhead, for model stealing defense. Provably, with such perturbed posteriors, the attacker's cloning process degrades into learning from noisy data. Toward optimizing the noise transition matrix, we proposed a novel bi-level optimization training framework, which performs fidelity on the victim model while the surrogate model adversarially. Comprehensive experimental results demonstrate that our method effectively thwarts model stealing attacks and achieves minimal utility trade-offs, outperforming existing state-of-the-art defenses.

Live content is unavailable. Log in and register to view live content