INTEGRITY

SPEECH & AUDIO

Targeted Data Poisoning for Black-Box Audio Datasets Ownership Verification

March 25, 2025

Abstract

Protecting the use of audio datasets is a major concern for data owners, particularly with the recent rise of audio deep learning models. While watermarks can be used to protect the data itself, they do not make it possible to identify a deep learning model trained on a protected dataset. In this paper, we adapt the recently introduced data taggants approach to audio data. Data taggants is a method for verifying whether a neural network was trained on a protected image dataset, using only top-k prediction access to the model. The method relies on a targeted data poisoning scheme that discreetly alters a small fraction (1%) of the dataset so as to induce a harmless behavior on out-of-distribution data called keys. We evaluate our method on the Speechcommands and ESC50 datasets with state-of-the-art transformer models, and show that we can detect the use of the dataset with high confidence and without loss of performance. We also show that our method is robust to common data augmentation techniques, making it a practical way to protect audio datasets.
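To make the black-box verification step more concrete, here is a minimal sketch of how top-k access could be turned into a confidence statement. It is not the authors' implementation. It assumes a one-sided binomial test: if the model never saw the protected data, each key's label is assumed to land in the top-k predictions with probability k divided by the number of classes. The function names `binomial_tail` and `verify_ownership`, and the toy predictions, are hypothetical and chosen for illustration only.

```python
from math import comb


def binomial_tail(n, successes, p):
    """Return P[X >= successes] for X ~ Binomial(n, p)."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(successes, n + 1)
    )


def verify_ownership(top_k_predictions, key_labels, num_classes, k):
    """Count keys whose label appears in the top-k output and test against chance.

    Assumption (not from the abstract): under the null hypothesis that the
    model was not trained on the protected dataset, a key's label falls in
    the top-k predictions with probability k / num_classes.
    """
    hits = sum(
        label in preds for preds, label in zip(top_k_predictions, key_labels)
    )
    chance = k / num_classes
    p_value = binomial_tail(len(key_labels), hits, chance)
    return hits, p_value


# Toy example: 4 keys, 10 classes, top-3 access to the suspect model.
top_k = [[3, 7, 1], [4, 0, 9], [2, 5, 8], [6, 1, 3]]
labels = [7, 4, 0, 6]
hits, p_value = verify_ownership(top_k, labels, num_classes=10, k=3)
print(f"{hits} of {len(labels)} keys matched, p-value = {p_value:.4f}")
```

A small p-value would suggest that the keys are recognized far more often than chance allows, which is the kind of evidence a data owner could use to argue that the protected dataset was used for training.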

AUTHORS

Written by

Wassim (Wes) Bouaziz

El Mahdi El Mhamdi

Nicolas Usunier

Publisher

ICASSP
