Research Progress of Cross Modal Sentiment Analysis of Speech-fused Texts - Modern Information Technology (现代信息科技)


Computer Technology, 2022, Issue 11

Research Progress of Cross Modal Sentiment Analysis of Speech-fused Texts

PEI Hongli (裴洪丽)

(School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, China)

Abstract: Sentiment analysis is the analysis of people's opinions, sentiments, evaluations, and attitudes toward events, items, and their attributes. In recent years, with the continued growth of self-media, people no longer express their opinions and attitudes through text alone but also through images, audio, and video. As a result, multimodal sentiment analysis has become a topic of great interest in both academia and industry. After introducing the basic concepts of sentiment analysis, this paper reviews research progress in text sentiment analysis and speech sentiment analysis, and analyzes the related research problems in multimodal sentiment analysis.

Keywords: sentiment analysis; audio; multimodal

DOI: 10.19850/j.cnki.2096-4706.2022.011.029

CLC number: TP391 Document code: A Article ID: 2096-4706(2022)11-0113-04


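About the author: PEI Hongli (born 1993), female, Han ethnicity, from Qufu, Shandong; teaching assistant, master's degree. Research interests: natural language processing and artificial intelligence.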
