When I attended the Data Privacy seminar this term, I presented the paper "Privacy Risks of Securing Machine Learning Models against Adversarial Examples" by Liwei Song et al., published at ACM CCS '19.
In summary, they make the following contributions in this paper:
They propose two new membership inference attacks specific to adversarially robust models, which exploit the predictions on adversarial examples and the verified worst-case predictions.
They perform membership inference attacks on models trained with six state-of-the-art adversarial defense methods and demonstrate that all six indeed increase the model's membership inference risk.
They further explore the factors that influence membership inference performance against adversarially robust models.
Finally, they experimentally evaluate the effect of the adversary's prior knowledge and of countermeasures such as temperature scaling and regularization, and they discuss the relationship between training data privacy and model robustness.
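To make the idea of a membership inference attack more concrete, here is a minimal sketch of a confidence-thresholding attack: the attacker guesses "member" whenever the model's confidence on the true label is high enough. This is my own illustration, not the authors' code. The function names and the confidence values are made up, and I assume the attacker already holds one confidence score per sample. In the paper's setting, that score could come from the benign input, from an adversarial example, or from a verified worst-case prediction.

```python
def attack_accuracy(member_conf, nonmember_conf, threshold):
    """Fraction of samples classified correctly when we guess 'member'
    for every confidence score at or above the threshold."""
    hits = sum(c >= threshold for c in member_conf)
    hits += sum(c < threshold for c in nonmember_conf)
    return hits / (len(member_conf) + len(nonmember_conf))


def best_threshold(member_conf, nonmember_conf):
    """Try every observed score as a threshold and keep the most accurate one.
    On ties, the smallest threshold wins."""
    candidates = sorted(set(member_conf) | set(nonmember_conf))
    return max(
        candidates,
        key=lambda t: attack_accuracy(member_conf, nonmember_conf, t),
    )


# Hypothetical confidence scores: training samples (members) tend to get
# higher confidence than unseen test samples (non-members).
member_conf = [0.99, 0.97, 0.95, 0.80]
nonmember_conf = [0.90, 0.70, 0.55, 0.40]

threshold = best_threshold(member_conf, nonmember_conf)
accuracy = attack_accuracy(member_conf, nonmember_conf, threshold)
print(threshold, accuracy)
```

With equally many members and non-members, random guessing gives an accuracy of 0.5, so anything above that indicates membership leakage. The larger the confidence gap between training and test samples, the better such a threshold works.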
What I find interesting is the paper's conclusion that the more robust a model is, the more membership information it leaks under membership inference attacks. The authors tested six robust machine learning models, trained either on adversarial examples or with verifiable defenses. Compared with the naturally trained model, all of them leak more membership information. The reason is fairly intuitive: robust training lets the training samples affect the model more strongly than natural training does, so these samples become much easier to identify in a membership inference attack. However, this is an empirical analysis, not a proof; the paper does not formally prove the conclusion.
This looks like a paradox: when you improve a model's security to defend against adversarial attacks, its privacy leakage increases.
However, is this a real paradox? Maybe not. Finding a balance between security and privacy in these kinds of models may be difficult, but we can add another layer of defense against membership inference attacks, such as differential privacy. That is to say, adversarial training covers only one aspect of security. Congzheng Song et al. demonstrated that differentially private models defend better against membership inference. So combining adversarial examples and differential privacy when training the model could be a solution.
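One common way to train with differential privacy is DP-SGD: clip each per-example gradient to a maximum norm, add Gaussian noise to the sum, and then average. The sketch below shows only that single aggregation step in plain Python. It is my own illustration and is not taken from either paper. The names `clip_gradient` and `dp_average_gradient` are hypothetical, and I simply assume that the per-example gradients were computed on adversarial examples, which is where adversarial training would plug in.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale the gradient down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip every per-example gradient, add Gaussian noise to the sum,
    and return the noisy average."""
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


# Hypothetical per-example gradients, assumed to come from adversarial examples.
grads = [[3.0, 4.0], [0.3, 0.4]]
rng = random.Random(0)

no_noise = dp_average_gradient(grads, max_norm=1.0, noise_multiplier=0.0, rng=rng)
with_noise = dp_average_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
print(no_noise, with_noise)
```

Clipping bounds how much any single training sample can change the model, and the noise hides what remains of that influence. This targets exactly the effect that makes robust models easier to attack. The privacy accounting (the resulting epsilon) is left out here, and whether robustness survives the added noise would have to be checked experimentally.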