Posted by Shuang Song and David Marn
Overview of a membership inference attack. An attacker tries to figure out whether certain examples were part of the training data.
Today, we’re excited to announce a new experimental module in TensorFlow Privacy (GitHub) that allows developers to assess the privacy properties of their classification models.
Privacy is an emerging topic in the machine learning community, and there are no canonical guidelines for producing a private model. A growing body of research shows that a machine learning model can leak sensitive information about its training dataset, creating a privacy risk for the users whose data is in that set.
Last year, we launched TensorFlow Privacy, enabling developers to train their models with differential privacy. Differential privacy adds noise to hide individual examples in the training dataset. However, this noise is designed for academic worst-case scenarios and can significantly affect model accuracy.
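To make "adds noise" concrete, here is a minimal sketch of the clip-and-noise step commonly used in DP-SGD. It is an illustration in plain Python, not the TensorFlow Privacy implementation; the function names and the parameters `l2_norm_clip` and `noise_multiplier` are chosen for this example.

```python
import math
import random


def clip_gradient(grad, l2_norm_clip):
    """Scale one example's gradient so its L2 norm is at most l2_norm_clip."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, l2_norm_clip / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, l2_norm_clip, noise_multiplier, rng):
    """Clip each example's gradient, sum them, add Gaussian noise, then average."""
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_gradient(grad, l2_norm_clip)
        for i in range(dim):
            summed[i] += clipped[i]
    # The noise scale is tied to the clipping norm, so no single example
    # can dominate the update.
    stddev = noise_multiplier * l2_norm_clip
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    return [n / len(per_example_grads) for n in noisy]


# Hypothetical per-example gradients for a two-parameter model.
grads = [[3.0, 4.0], [0.3, 0.4]]
no_noise = dp_average_gradient(grads, 1.0, 0.0, random.Random(0))
with_noise = dp_average_gradient(grads, 1.0, 1.1, random.Random(0))
```

A larger `noise_multiplier` hides each example better but perturbs the update more, which is the accuracy cost described above.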
These challenges led us to tackle privacy from a different perspective. A few years ago, research around the privacy properties of machine learning models started to emerge. Cost-efficient “membership inference attacks” predict whether a specific piece of data was used during training. If an attacker can make this prediction with high accuracy, the model is likely leaking information about which examples were in its training set. The biggest advantage of a membership inference attack is that it is easy to perform: it does not require any re-training.
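One simple form of such an attack is a loss threshold: models tend to have lower loss on examples they were trained on, so an attacker guesses "member" whenever the loss is small. The sketch below assumes the attacker can obtain per-example losses. The function names and loss values are made up for illustration and are not taken from the new module.

```python
def threshold_attack(losses, threshold):
    """Guess 'member' (True) when the model's loss on an example is below threshold."""
    return [loss < threshold for loss in losses]


def attack_accuracy(member_losses, nonmember_losses, threshold):
    """Fraction of examples whose membership the attack guesses correctly."""
    member_guesses = threshold_attack(member_losses, threshold)
    nonmember_guesses = threshold_attack(nonmember_losses, threshold)
    correct = sum(member_guesses) + sum(not g for g in nonmember_guesses)
    return correct / (len(member_losses) + len(nonmember_losses))


# Hypothetical per-example losses from an already-trained classifier.
train_losses = [0.05, 0.10, 0.20, 0.90]  # examples seen during training
test_losses = [0.30, 0.70, 1.20, 0.15]   # held-out examples
accuracy = attack_accuracy(train_losses, test_losses, threshold=0.25)
```

On this toy data the attack is right for 6 of the 8 examples. An accuracy near 0.5 on balanced data would mean the attacker does no better than guessing. Note that the attack only needs the model's outputs on the examples, which is why no re-training is involved.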
A test produces a vulnerability score that indicates how much the model leaks information from the training set. We found that this vulnerability score often decreases when heuristics such as early stopping are applied, or when DP-SGD is used for training.
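This post does not spell out how the score is computed. As an assumption for illustration, two common ways to summarize an attack are the area under its ROC curve (AUC) and the attacker's advantage, the largest gap between true positive rate and false positive rate over all thresholds. The sketch below computes both from hypothetical membership scores, where a higher score means "more likely a training example".

```python
def attack_auc(member_scores, nonmember_scores):
    """Chance that a random member outscores a random non-member (ties count half)."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def attacker_advantage(member_scores, nonmember_scores):
    """Largest |TPR - FPR| over every threshold taken from the observed scores."""
    best = 0.0
    for t in sorted(set(member_scores) | set(nonmember_scores)):
        tpr = sum(s >= t for s in member_scores) / len(member_scores)
        fpr = sum(s >= t for s in nonmember_scores) / len(nonmember_scores)
        best = max(best, abs(tpr - fpr))
    return best


# Hypothetical scores: the negated per-example loss of a trained model.
member_scores = [-0.05, -0.10, -0.20, -0.90]
nonmember_scores = [-0.30, -0.70, -1.20, -0.15]
auc = attack_auc(member_scores, nonmember_scores)
advantage = attacker_advantage(member_scores, nonmember_scores)
```

Under these definitions, an AUC of 0.5 and an advantage of 0 mean the attacker cannot tell members from non-members. Values closer to 1 mean the model is more vulnerable.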
Unsurprisingly, differential privacy helps in reducing these vulnerability scores. Even with very small amounts of noise, the vulnerability score decreased.
After using membership inference tests internally, we’re sharing them with developers to help them build more private models, explore better architecture choices, use regularization techniques such as early stopping, dropout, weight decay, and input augmentation, or collect more data. Ultimately, these tests can help the developer community identify more architectures that incorporate privacy design principles and data processing choices.
We hope this library will be the starting point of a robust privacy testing suite that can be used by any machine learning developer around the world. Moving forward, we’ll explore the feasibility of extending membership inference attacks beyond classifiers and develop new tests. We’ll also explore adding this test to the TensorFlow ecosystem by integrating with TFX.
Reach out to tf-privacy@google.com and let us know how you’re using this new module. We’re keen on hearing your stories, feedback, and suggestions!
Acknowledgments: Yurii Sushko, Andreas Terzis, Miguel Guevara, Niki Kilbertus, Vadym Doroshenko, Borja De Balle Pigem, Ananth Raghunathan.