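Abstract: As the volume of comment data on social media grows rapidly, mining the sentiment information contained in large numbers of comments has increasing commercial value. This paper proposes a social media sentiment classification model based on a convolutional neural network. Word vectors are first initialized from a corpus that includes hotel reviews; the comment data are then classified by sentiment through a convolutional layer, a hidden layer, an embedding layer, and a classification layer. Experimental results show that the model achieves high classification accuracy both across word vector models of different dimensions and across different test-set proportions.
Keywords: sentiment analysis; convolutional neural network; word vector
CLC number: TP391.41; TP183  Document code: A  Article ID: 2096-4706(2018)02-0089-04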
Social Media Text Sentiment Analysis Based on Convolutional Neural Network
LU Zhengqiu, WANG Linge, ZHOU Chunliang
(School of Information Engineering, Ningbo Dahongying University, Ningbo 315175, China)
Abstract: As the number of comments on social media increases dramatically, mining the sentiment information contained in large volumes of comments is of growing commercial value. This paper puts forward a social media sentiment classification model based on a convolutional neural network. The model first initializes its word vectors from a corpus that includes hotel reviews, and then performs sentiment classification through a convolutional layer, a hidden layer, an embedding layer, and a classification layer. Experimental results show that the model achieves high classification accuracy with word vector models of different dimensions and with test sets of different proportions.
Keywords: sentiment analysis; convolutional neural network; word vector
Funding: Scientific Research Project of the Education Department of Zhejiang Province (Project No. Y201738610)
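About the authors:
LU Zhengqiu (b. 1982), male, from Ningbo, Zhejiang; master's degree; lecturer; research interests: data analysis, computer networks, etc.
WANG Linge (b. 1979), male, from Shuangliao, Jilin; master's degree; senior engineer; research interests: Internet of Things, big data.
ZHOU Chunliang (b. 1982), male, from Ningbo, Zhejiang; master's degree; lecturer; research interest: data mining.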