Abstract
Deep neural networks are well known to be vulnerable to adversarial samples in the white-box setting. As research progressed, however, it became clear that adversarial samples can also mount black-box attacks: samples generated on a source model can cause models with architectures different from the source model to misclassify. Many methods have recently been proposed to improve the transferability of adversarial samples, but the transferability they achieve remains limited. In this paper, we propose an intermediate-feature-based attack algorithm that further improves the transferability of adversarial samples. Rather than generating adversarial samples directly from the original samples, we continue to optimize existing adversarial samples to improve attack transferability. First, we calculate the feature importance of the original samples using existing adversarial samples. Next, we analyze which features are more likely to yield adversarial samples with high transferability. Finally, we optimize those features to improve the attack transferability of the adversarial samples. Furthermore, rather than using the model's logit output, we generate adversarial samples from the model's intermediate-layer output. Extensive experiments on the standard ImageNet dataset show that our method improves transferability and outperforms state-of-the-art methods.
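The abstract describes the pipeline only at a high level. The following is a minimal PyTorch sketch of the general idea, assuming a ResNet-50 surrogate, a hooked `layer3` as the intermediate layer, a simple channel-wise importance weight derived from an existing adversarial sample, and the helper names `mid_features` and `refine`; all of these are illustrative assumptions, not the paper's exact algorithm or published code.

```python
# Minimal sketch of intermediate-feature refinement of an existing
# adversarial sample. Layer choice, importance weighting, loss, and
# hyperparameters are illustrative assumptions, not the authors' method.
import torch
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).to(device).eval()

feats = {}
def hook(module, inputs, output):
    feats["mid"] = output  # cache the intermediate activation

# Attack an intermediate layer instead of the logits (layer3 is an assumption).
model.layer3.register_forward_hook(hook)

def mid_features(x):
    model(x)
    return feats["mid"]

def refine(x_clean, x_adv0, eps=16 / 255, alpha=2 / 255, steps=10):
    """Refine an existing adversarial sample x_adv0 for transferability.

    Inputs are assumed to be batches in [0, 1]; normalization is omitted.
    """
    with torch.no_grad():
        f_clean = mid_features(x_clean)
        f_adv0 = mid_features(x_adv0)
        # Feature importance (assumed scheme): channels the existing attack
        # already changed most are taken to matter most for transfer.
        w = (f_adv0 - f_clean).abs().mean(dim=(2, 3), keepdim=True)
        w = w / (w.sum() + 1e-12)

    x_adv = x_adv0.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        # Push the important intermediate features away from the clean ones.
        loss = (w * (mid_features(x_adv) - f_clean) ** 2).sum()
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                    # ascend feature loss
            x_adv = x_clean + (x_adv - x_clean).clamp(-eps, eps)   # L-inf budget
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

In this sketch, transferability is sought by attacking features that a prior attack already found influential, rather than re-optimizing the surrogate's logits from scratch; the refined sample would then be evaluated against held-out target models.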
Acknowledgements
This work is supported by the National Key R&D Program of China (No. 2021YFB3100600).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
He, C. et al. (2023). Boosting Adversarial Transferability Through Intermediate Feature. In: Iliadis, L., Papaleonidas, A., Angelov, P., Jayne, C. (eds) Artificial Neural Networks and Machine Learning – ICANN 2023. ICANN 2023. Lecture Notes in Computer Science, vol 14258. Springer, Cham. https://doi.org/10.1007/978-3-031-44192-9_3
DOI: https://doi.org/10.1007/978-3-031-44192-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44191-2
Online ISBN: 978-3-031-44192-9