Roberto Santana and Unai Garciarena
Department of Computer Science and Artificial Intelligence
University of the Basque Country
R. Reza-Wiyatno, A. Xu, O. Dia, and A. de Berker. Adversarial Examples in Modern Machine Learning: A Review. arXiv preprint arXiv:1911.05268. 2019.
H. Yakura, Y. Akimoto, and J. Sakuma. Generate (non-software) Bugs to Fool Classifiers. arXiv preprint arXiv:1911.08644. 2019.
N. Papernot, P. McDaniel, X. Wu, S. Jha and A. Swami. Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks. Proceedings of the 2016 IEEE Symposium on Security and Privacy. Pp. 582-597. 2016.
A. Karpathy. Breaking Linear Classifiers on ImageNet. Hacker's guide to Neural Networks. 2015.
Given a classifier \( f \) and an original input \( x \), an (untargeted) adversarial example is a perturbed input \( x^{\prime} \) that stays close to \( x \) but receives a different label,
\[ \min_{x^{\prime}} \; L(x, x^{\prime}) \quad \text{s.t.} \quad f(x^{\prime}) \neq f(x), \]
where \(L(\cdot)\) is a distance metric between examples from the input space.
S. Baluja and I. Fischer. Adversarial Transformation Networks: Learning to Generate Adversarial Examples. arXiv preprint arXiv:1703.09387. 2017.
In a targeted attack, the adversary instead requires the perturbed input to be assigned a specific class,
\[ \min_{x^{\prime}} \; L(x, x^{\prime}) \quad \text{s.t.} \quad f(x^{\prime}) = y_t, \]
where \( y_t \in \mathcal{Y} \) is some target label chosen by the attacker.
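As a concrete illustration, the sketch below checks whether a candidate perturbed input satisfies the untargeted or targeted condition above. It instantiates \(L\) as the \( \ell_\infty \) distance with a fixed budget `epsilon`; the `model` callable, the array inputs, and the budget are assumptions made for the example.

```python
import numpy as np

def is_adversarial(model, x, x_adv, y_true, epsilon, y_target=None):
    """Check whether x_adv is an adversarial example for x.

    `model` is assumed to be a callable that returns a predicted label for a
    single input; L is instantiated here as the L-infinity distance.
    """
    # Distance constraint: L(x, x') <= epsilon.
    within_budget = np.max(np.abs(x_adv - x)) <= epsilon
    pred = model(x_adv)
    if y_target is None:
        fooled = pred != y_true        # untargeted: any wrong label counts
    else:
        fooled = pred == y_target      # targeted: the attacker's chosen label
    return bool(within_budget and fooled)
```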
Attack methods can be broadly grouped into gradient-based optimization and constrained-optimization approaches. Beyond these, attacks have also been built on a variety of other optimization methods, such as evolutionary algorithms and other search-based heuristics that only query the model (a toy sketch of such a search-based attack follows below).
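The following sketch illustrates the search-based idea with a simple (1+1)-style hill climber over bounded perturbations. The `predict_proba` callable, the \( \ell_\infty \) budget `epsilon`, and the mutation scale `sigma` are assumptions made for the example; published evolutionary attacks are considerably more elaborate.

```python
import numpy as np

def evolutionary_attack(predict_proba, x, y_true, epsilon=0.05,
                        sigma=0.01, steps=2000, seed=0):
    """(1+1)-style black-box attack sketch: mutate an L-infinity-bounded
    perturbation and keep the mutant whenever it lowers the model's
    confidence in the true class. Uses only forward queries, no gradients.
    """
    rng = np.random.default_rng(seed)
    delta = np.zeros_like(x)                       # current perturbation
    best_conf = predict_proba(x)[y_true]           # confidence to minimise
    for _ in range(steps):
        # Gaussian mutation, projected back into the L-infinity ball.
        mutant = np.clip(delta + sigma * rng.standard_normal(x.shape),
                         -epsilon, epsilon)
        x_adv = np.clip(x + mutant, 0.0, 1.0)
        probs = predict_proba(x_adv)
        if probs.argmax() != y_true:               # success: label changed
            return x_adv
        if probs[y_true] < best_conf:              # accept improving mutants
            delta, best_conf = mutant, probs[y_true]
    return None                                    # query budget exhausted
```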
A classical constrained-optimization formulation casts the targeted attack as the box-constrained problem
\[ \min_{x^{\prime}} \; c \, \| x - x^{\prime} \|_2 + \mathcal{L}(x^{\prime}, t) \quad \text{s.t.} \quad x^{\prime} \in [0,1]^n, \]
where \(x^{\prime}, x \in [0,1]^n\), \( \mathcal{L}(x^{\prime},t) \) is the true loss function of the model, and \(t\) is the target misclassification label.
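A minimal PyTorch sketch of this penalized objective is given below; it assumes `model` returns class logits, fixes the trade-off constant `c`, and enforces the box constraint by clamping rather than by box-constrained L-BFGS, so it only approximates the original attack.

```python
import torch
import torch.nn.functional as F

def constrained_opt_attack(model, x, target, c=0.1, steps=200, lr=0.01):
    """Minimise c * ||x - x'||_2 + L(x', t) subject to x' in [0, 1]^n.

    `model` is assumed to map a batch of inputs to class logits; `x` is a
    single unbatched example and `target` is the misclassification label t.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    t = torch.tensor([target])
    optimizer = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        dist = torch.norm(x_adv - x, p=2)                       # ||x - x'||_2
        loss = c * dist + F.cross_entropy(model(x_adv.unsqueeze(0)), t)
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            x_adv.clamp_(0.0, 1.0)                              # x' in [0, 1]^n
    return x_adv.detach()
```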
The Fast Gradient Sign Method (FGSM) instead builds an adversarial example in a single gradient step,
\[ x^{\prime} = x + \epsilon \, \mathrm{sign}\big( \nabla_x \mathcal{L}(x, y) \big), \]
where \( \nabla_x \mathcal{L}(x, y) \) is the first derivative of the loss function with respect to the input \(x\).
I. Goodfellow, J. Shlens, and C. Szegedy. Explaining and Harnessing Adversarial Examples. arXiv preprint arXiv:1412.6572. 2014.
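A minimal PyTorch sketch of FGSM follows; it assumes `model` returns logits for a batch of inputs in \([0,1]\) and that `y` holds the true labels, and it clamps the result back into the valid input range.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon=0.03):
    """Fast Gradient Sign Method: x' = x + epsilon * sign(grad_x L(x, y))."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)        # L(x, y)
    loss.backward()                            # fills x.grad with grad_x L(x, y)
    x_adv = x + epsilon * x.grad.sign()        # single-step perturbation
    return x_adv.clamp(0.0, 1.0).detach()      # keep inputs in [0, 1]
```

Larger values of `epsilon` make the attack more likely to succeed, at the cost of a more visible perturbation.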
N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami. The Limitations of Deep Learning in Adversarial Settings. arXiv preprint arXiv:1511.07528. 2015.
S. M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard. Universal Adversarial Perturbations. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Pp. 1765-1773. 2017.
J. Vadillo and R. Santana. Universal Adversarial Examples in Speech Command Classification. arXiv preprint arXiv:1911.0000. 2019.
W. Xu, D. Evans, and Y. Qi. Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks. arXiv preprint arXiv:1704.01155. 2017.
Y. Song, T. Kim, S. Nowozin, S. Ermon, and N. Kushman. PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples. arXiv preprint arXiv:1710.10766. 2017.