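Abstract: With the rise of online social platforms in recent years, people increasingly post their feelings about daily life online, and analyzing these texts can reveal their emotional state. Based on comment data on COVID-19 topics from Sina Weibo during the early stage of the COVID-19 outbreak, this paper builds an emotion classification model by combining an emotion dictionary with a Support Vector Machine (SVM), and then uses emotion temporal sequence analysis and an LDA topic model to examine the emotional trends and characteristics of Weibo users during the epidemic. The experimental analysis shows that users' emotions during the epidemic were predominantly positive, reflecting considerable public confidence in overcoming the epidemic.
Keywords: COVID-19 epidemic; emotion dictionary; Support Vector Machine; emotion temporal sequence analysis; LDA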
DOI:10.19850/j.cnki.2096-4706.2021.24.007
CLC number: TP391.1; TP181  Document code: A  Article ID: 2096-4706(2021)24-0024-05
Emotion Analysis of Micro-blog Netizens Based on Emotion Dictionary and SVM
WANG Wentao, ZHANG Shibao
(Nanjing University of Information Science & Technology, Nanjing 210044, China)
Abstract: In recent years, with the rise of online social platforms, the public has tended to post their feelings about daily life on the Internet. Analyzing these texts makes it possible to mine people's emotional information. Based on comments on COVID-19 topics posted on Sina Weibo (micro-blog) in the early stage of the COVID-19 outbreak, this paper constructs an emotion classification model by combining an emotion dictionary with a Support Vector Machine (SVM). Emotion temporal sequence analysis and an LDA (Latent Dirichlet Allocation) topic model are then used to explore the emotional trends and characteristics of micro-blog netizens during the epidemic. According to the experimental analysis, the emotions of netizens during the COVID-19 epidemic were mainly positive, which shows that the public had considerable confidence in overcoming the epidemic.
Keywords: COVID-19 epidemic; emotion dictionary; Support Vector Machine; emotion temporal sequence analysis; LDA
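About the authors: WANG Wentao (b. 1997), male, Han ethnicity, from Suzhou, Jiangsu, is a master's student whose research interest is big data analysis; ZHANG Shibao (b. 1996), male, Han ethnicity, from Chuzhou, Anhui, is a master's student whose research interest is image processing.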