
Computer Technology, 2022, Issue 17

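Research on Adversarial Text Generation Based on Word Replacement
WANG Xiaojuan
(School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001)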

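Abstract: Adversarial examples pose a security threat to many applications in natural language processing, and research on adversarial attack methods helps evaluate and even improve the robustness of deep neural network models. Existing word-level textual adversarial attacks rely on scoring and ranking word importance when generating adversarial examples, but they are inefficient because they must query the target model frequently to obtain the importance scores. To address this problem, this paper proposes computing word importance scores with a trained substitute model and combining them with the differences in the target model's decision probabilities obtained after stratified sampling by semantic similarity, in order to rank the words of the original input. Experimental results on text classification tasks demonstrate the effectiveness of the method.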


Keywords: textual adversarial attack; black-box attack; deep neural network; natural language processing



DOI:10.19850/j.cnki.2096-4706.2022.17.020


CLC number: TP391    Document code: A    Article ID: 2096-4706(2022)17-0078-04


Research on Adversarial Text Generation Based on Word Replacement

WANG Xiaojuan

(School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232001, China)

Abstract: Adversarial examples pose a security threat to many applications in the field of natural language processing. Research on adversarial attack methods helps to evaluate and even improve the robustness of deep neural network models. Existing word-level textual adversarial attacks rely on word importance scoring and ranking when generating adversarial examples, but they are inefficient because they require frequent access to the target model to obtain the importance scores. To address this problem, this paper proposes to rank the words in the original input by computing word importance scores with a trained substitute model and combining them with the decision probability differences of the target model obtained after stratified sampling by semantic similarity. Experimental results on text classification tasks demonstrate the effectiveness of the method.

Keywords: text adversarial attack; black box attack; deep neural network; natural language processing
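
The ranking idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify how the two signals are combined, so the leave-one-out scoring, the shortlist size `top_k`, the mixing weight `alpha`, the deterministic per-stratum pick, and the toy lexicon models are all assumptions made here for illustration. The point it shows is that the substitute model is queried freely, while the target model is queried only for a few shortlisted words and a few sampled synonyms.

```python
from typing import Callable, Dict, List, Tuple

# A model maps a token list to P(true label | tokens). Both models here are toys.
Model = Callable[[List[str]], float]


def make_lexicon_model(weights: Dict[str, float]) -> Model:
    """Hypothetical stand-in classifier: 0.5 plus per-word weights, clipped to [0, 1]."""
    def model(tokens: List[str]) -> float:
        score = 0.5 + sum(weights.get(t, 0.0) for t in tokens)
        return min(1.0, max(0.0, score))
    return model


def substitute_importance(tokens: List[str], substitute: Model) -> List[float]:
    """Leave-one-out importance computed on the local substitute model only."""
    base = substitute(tokens)
    return [base - substitute(tokens[:i] + tokens[i + 1:]) for i in range(len(tokens))]


def stratified_sample(cands: List[Tuple[str, float]], n_strata: int) -> List[str]:
    """Sort (word, similarity) pairs by similarity, cut them into contiguous
    strata, and keep the most similar word of each stratum (assumed scheme)."""
    ordered = sorted(cands, key=lambda c: c[1], reverse=True)
    if not ordered:
        return []
    n = min(n_strata, len(ordered))
    size = len(ordered) / n
    return [ordered[int(k * size)][0] for k in range(n)]


def target_prob_drop(tokens: List[str], i: int,
                     synonyms: Dict[str, List[Tuple[str, float]]],
                     target: Model, n_strata: int = 2) -> float:
    """Largest drop in the target model's probability over the sampled synonyms."""
    base = target(tokens)
    picks = stratified_sample(synonyms.get(tokens[i], []), n_strata)
    drops = [base - target(tokens[:i] + [w] + tokens[i + 1:]) for w in picks]
    return max(drops, default=0.0)


def rank_words(tokens: List[str], substitute: Model, target: Model,
               synonyms: Dict[str, List[Tuple[str, float]]],
               top_k: int = 2, alpha: float = 0.5) -> List[int]:
    """Return shortlisted word positions, most important first.
    alpha and top_k are illustrative assumptions, not values from the paper."""
    sub = substitute_importance(tokens, substitute)
    shortlist = sorted(range(len(tokens)), key=lambda i: sub[i], reverse=True)[:top_k]
    combined = {
        i: alpha * sub[i] + (1 - alpha) * target_prob_drop(tokens, i, synonyms, target)
        for i in shortlist
    }
    return sorted(combined, key=lambda i: combined[i], reverse=True)


# Toy data: every word, weight, and similarity value below is made up.
tokens = ["the", "movie", "was", "great", "fun"]
substitute = make_lexicon_model({"great": 0.3, "fun": 0.1})
target = make_lexicon_model({"great": 0.25, "fun": 0.15, "good": 0.1, "amusing": 0.12})
synonyms = {"great": [("good", 0.9), ("fine", 0.6)], "fun": [("amusing", 0.8)]}

order = rank_words(tokens, substitute, target, synonyms)
print([tokens[i] for i in order])
```

In this toy setup the target model is called only for the two shortlisted positions, rather than once per word as in a plain leave-one-out ranking against the target model.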




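About the author: WANG Xiaojuan (born 1997), female, Han ethnicity, from Huangshan, Anhui; master's student; research interest: network and information security.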