April 25, 2022
The vulnerability of machine learning models to membership inference attacks has received much attention in recent years. However, existing attacks remain mostly impractical because of their high false positive rates: non-member samples are often erroneously predicted as members. This type of error makes the predicted membership signal unreliable, especially since most samples are non-members in real-world applications. In this work, we argue that membership inference attacks can benefit drastically from difficulty calibration, where an attack’s predicted membership score is adjusted to the difficulty of correctly classifying the target sample. We show that difficulty calibration can significantly reduce the false positive rate of a variety of existing attacks without a loss in accuracy.
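To make the idea of difficulty calibration concrete, here is a minimal sketch in Python. The abstract does not say how difficulty is estimated, so this example assumes one plausible approach: subtract from the target model's score the average score that a few reference models (assumed to be trained without the target sample) assign to the same sample. The function names `loss_score` and `calibrated_score`, the log-likelihood score, and all numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def loss_score(probs, labels):
    """Log-probability of the true label; higher looks more 'member-like'."""
    p_true = probs[np.arange(len(labels)), labels]
    return np.log(np.clip(p_true, 1e-12, 1.0))

def calibrated_score(target_probs, reference_probs, labels):
    """Assumed calibration: target score minus mean reference-model score."""
    target = loss_score(target_probs, labels)
    reference = np.mean([loss_score(p, labels) for p in reference_probs], axis=0)
    return target - reference

# Toy data (made up): sample 0 is an easy non-member, sample 1 a hard member.
labels = np.array([0, 0])
target_probs = np.array([[0.95, 0.05], [0.90, 0.10]])
reference_probs = [
    np.array([[0.94, 0.06], [0.40, 0.60]]),
    np.array([[0.96, 0.04], [0.30, 0.70]]),
]

raw = loss_score(target_probs, labels)
calibrated = calibrated_score(target_probs, reference_probs, labels)
print("raw:", raw)
print("calibrated:", calibrated)
```

In this toy setup the raw score ranks the easy non-member above the hard member, which is the kind of false positive the abstract describes. After calibration the easy sample's score falls to roughly zero, because the reference models classify it just as confidently, while the hard sample stands out.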
Publisher
ICLR
Research Topics
Core Machine Learning
Foundational models