Thesis Information

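Chinese title:	基于深度网络结构的情感分析方法研究 (Research on Sentiment Analysis Methods Based on Deep Network Structures)
Name:	Luo Huaishao (罗怀芍)
Campus card number:	0000310168
Thesis language:	Chinese
Discipline:	Engineering - Computer Science and Technology (degrees in Engineering or Science may be conferred)
Release time:	Public
Student type:	Doctoral
Degree:	Doctor of Engineering
University:	Southwest Jiaotong University
School:	School of Computing and Artificial Intelligence
Major:	Computer Science and Technology
First supervisor:	Li Tianrui (李天瑞)
First supervisor's affiliation:	Southwest Jiaotong University
Completion date:	2021-10-15
Defense date:	2021-12-31
Foreign-language title:	THE RESEARCH OF SENTIMENT ANALYSIS BASED ON DEEP NETWORK STRUCTURE
Chinese keywords:	Sentiment analysis; aspect term extraction; collaborative extraction; multimodal sentiment; deep learning
Foreign-language keywords:	Sentiment Analysis; Aspect Term Extraction; Collaborative Extraction; Multimodal Sentiment; Deep Learning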
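Chinese abstract:	︿With the rapid development of Internet information technologies such as e-commerce, online catering, and social media, more and more users tend to express online their opinions and attitudes toward products, services, issues, and events. Mining the text, images, video, and audio that these users produce can support individual decision-making, help enterprises and merchants improve their products and services, and assist governments in analyzing and guiding public opinion. Sentiment analysis, also called opinion mining, is a research direction that mines users' intentions and sentiment tendencies. Starting roughly in the 1990s, more and more researchers have devoted themselves to it. After two to three decades of development, sentiment analysis has become one of the hot topics in data mining, machine learning, and artificial intelligence. Its scope covers multiple granularities of text (document level, sentence level, and word level) as well as multiple modalities such as images, video, and audio. This dissertation studies models for extraction and classification tasks in sentiment analysis, covering two scenarios: aspect-based fine-grained sentiment analysis and multimodal sentiment analysis. The specific tasks are aspect term extraction, aspect term-polarity co-extraction, and multimodal sentiment classification. The main work and results are summarized as follows: (1) Aspect term extraction based on a dependency-tree recursive neural network: For aspect term extraction in fine-grained text sentiment mining, previous work lacked a fusion of syntactic features and word-order features. Building on the topology of the dependency tree, this dissertation first designs BiDTree, a recursive neural network that extracts bidirectional dependency-syntax representations by modeling the tree recursively, layer by layer, in both bottom-up and top-down directions. By using the syntactic dependency relations in the dependency tree, the model can capture long-distance dependencies between words. Deeply mining the forward and reverse dependencies between words, and then fusing the extracted dependency-syntax features with word-order features, strengthens the representation of natural-language sentences and improves aspect term extraction. (2) Aspect term and polarity extraction based on a dual cross-shared recurrent neural network: Previous work mainly modeled aspect term extraction and aspect polarity classification separately, and models for these two subtasks are hard to train jointly because the tasks differ in nature. To perform both at once, this dissertation casts the two subtasks as two sequence labeling problems and proposes DOER, a dual cross-shared recurrent neural network. Its core idea is to share, across the two paths, the representations used for aspect term labeling and those used for aspect polarity labeling, so that the two network paths reinforce each other. Two auxiliary training tasks are also designed to strengthen the feature representation. One is aspect term length prediction, which alleviates the difficulty of extracting dependencies when aspect terms are long; the other is sentiment word discrimination, which strengthens polarity labeling by judging whether an input word is a sentiment word. (3) Aspect term and polarity extraction based on gradient harmonization and cascaded labeling: To further account for the relations among aspect terms during polarity labeling, the imbalance of labels, and the role of pre-trained models in aspect term-polarity co-extraction, this dissertation proposes GRACE, a deep learning model with cascaded labels, and introduces into it a gradient-harmonized weighting strategy for cross-entropy. The main idea is to feed the generated aspect term label sequence as input when generating the aspect polarity label sequence, and then to learn the interactions among aspect terms through the Transformer's self-attention mechanism to obtain better polarity labeling. The gradient-harmonized cross-entropy weighting strategy effectively alleviates label imbalance. In addition, the model is trained with virtual adversarial training to improve its robustness and accuracy. (4) Multimodal sentiment analysis based on shared multi-scale fusion of locally aggregated descriptors: To address the ambiguity and missing sentiment that can arise when the text modality is the only representation used for sentiment analysis, this dissertation brings in video and speech signals and fuses them with the text modality to capture more sentiment features and improve performance. It proposes ScaleVLAD, a multi-scale feature fusion method based on the vector of locally aggregated descriptors, which fuses features of different granularities by aligning the representations of different modalities into a shared vector space; this spatial alignment effectively alleviates the unclear semantic boundaries between modalities. The dissertation also proposes a self-supervised shifted clustering loss that, at each iteration, gathers the fused features under different clusters so as to learn more label-discriminative features and improve the model's classification and regression performance.﹀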
Foreign-language abstract:	︿ With the rapid development of Internet information technologies such as e-commerce, online catering, and social media, more and more users express their views and attitudes on goods, services, problems, and events online. Mining the text, pictures, video, and audio generated by users can support individual decision-making, help enterprises and businesses improve their products and services, and assist the government in analyzing and guiding public opinion. Sentiment analysis, also known as opinion mining, is a research direction that aims to discover users' intentions and emotional tendencies. Since the 1990s, more and more researchers have devoted themselves to this direction. After two or three decades of development, sentiment analysis has become one of the hot topics in data mining, machine learning, and artificial intelligence. The research covers different textual granularities (document, sentence, and word) and, besides text, multiple modalities such as pictures, video, and audio. This dissertation mainly studies extraction and classification tasks in sentiment analysis, covering aspect-based sentiment analysis and multimodal sentiment analysis. Specifically, the research tasks are aspect term extraction, aspect term-polarity co-extraction, and multimodal sentiment analysis. The main research work and results are summarized as follows: (1) Dependency tree based recursive neural network for aspect term extraction: Previous studies ignored the fusion of grammatical and word-order features, which is essential in aspect term extraction. The dissertation therefore designs BiDTree, a recursive neural network that builds on the topology of the dependency tree of a text to extract bidirectional dependency-structure features. The network models the syntax tree recursively, layer by layer, in both bottom-up and top-down directions. Using the syntactic relationships in the dependency tree, the model can capture long-range dependencies between words. By deeply mining the forward and reverse dependencies between words and then fusing the extracted dependency-grammar features with word-order features, the representation of natural language sentences is enhanced and aspect term extraction is improved. (2) Dual cross-shared recurrent neural network for aspect term-polarity co-extraction: Previous research mainly trained the models for aspect term extraction and sentiment classification separately because the two tasks are of different types. The dissertation unifies the two tasks as two sequence labeling problems and proposes DOER, a Dual crOss-sharEd recurrent neural network, to perform aspect term extraction and aspect sentiment classification simultaneously. The core idea is to share representations across the two recurrent neural networks that label aspect terms and polarities, respectively, so that the two networks reinforce each other. In addition, two auxiliary training tasks are designed to facilitate feature extraction. One predicts the aspect term length, which alleviates the difficulty of capturing dependencies in long aspect terms. The other is sentiment lexicon enhancement, which promotes polarity labeling by classifying whether a word is a sentiment word. (3) Gradient harmonized and cascaded labeling for aspect term-polarity co-extraction: To further consider the relationships between aspect terms in polarity labeling, the imbalance of labels, and the effectiveness of pre-trained models in the aspect term-polarity co-extraction task, the dissertation proposes GRACE, a deep learning model based on cascaded labeling, and introduces a gradient harmonized cross-entropy to train it. The main idea of the network is to take the generated aspect term label sequence as input when generating the polarity label sequence and to learn the interactions between aspect terms through the self-attention mechanism of the Transformer, thus obtaining better polarity labeling results. The gradient harmonized cross-entropy can effectively alleviate label imbalance. The model is also trained with virtual adversarial training to improve its robustness and accuracy. (4) Multi-scale fusion of locally aggregated descriptors for multimodal sentiment analysis: To address the shortcomings of relying on the text modality as the single representation, the dissertation integrates video and speech signals with the text modality to capture more sentiment features and improve the performance of sentiment analysis. It proposes ScaleVLAD, a multi-scale feature fusion method based on the vector of locally aggregated descriptors, which fuses features of different granularities by aligning the representations of different modalities into a shared vector space. This spatial alignment can effectively alleviate the unclear semantic boundaries between modalities. A self-supervised shifted clustering loss is also proposed; at each iteration it gathers the fused features under different clusters so as to learn more discriminative features, thus improving classification and regression performance. ﹀
Classification number:	TP183
Total pages:	114
Total number of references:	210
Library location:	TP183 B 2021
Open date:	2022-06-09
