[1] HAO Jie, XIE Jun, SU Jingqiong, et al. An unsupervised approach for sentiment classification based on weighted latent dirichlet allocation[J]. CAAI Transactions on Intelligent Systems, 2016, 11(4): 539-545. [doi:10.11992/tis.201606007]
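CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 11
Issue: No. 4, 2016
Pages: 539-545
Column: Academic Papers, Natural Language Processing and Understanding
Publication date: 2016-07-25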
- Title: An unsupervised approach for sentiment classification based on weighted latent dirichlet allocation
- Author(s): HAO Jie (郝洁), XIE Jun (谢珺), SU Jingqiong (苏婧琼), XU Xinying (续欣莹), HAN Xiaoxia (韩晓霞)
- Affiliation: Information Engineering College, Taiyuan University of Technology, Jinzhong 030600, Shanxi, China
- Keywords: sentiment classification; topic and sentiment unification model; topic model; LDA; weighting algorithm
- CLC number: TP391
- DOI: 10.11992/tis.201606007
- Abstract: The topic and sentiment unification model can effectively extract both the topics and the sentiment orientation of a corpus. To address the low discriminability between topics in existing topic/sentiment analysis methods, this paper proposes a word-weighted LDA algorithm, weighted latent Dirichlet allocation (WLDA), which performs topic extraction and sentiment analysis without supervision. The algorithm computes the distance between each term in the corpus and a set of sentiment seed words, and assigns different weights to different terms during Gibbs sampling. It then uses the key words of each topic to determine that topic's sentiment orientation, from which the sentiment distribution of each document is obtained. This weighting strengthens the influence of sentiment-bearing words during sampling and thereby improves the discriminability between topics. Experiments show that, compared with the joint sentiment/topic (JST) model, WLDA not only iterates faster during sampling but also gives better results for topic extraction and sentiment classification.
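The pipeline the abstract describes (weight terms by their closeness to seed words, run Gibbs sampling with weighted counts, label each topic's sentiment, then aggregate per document) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the distance measure or the weight formula, so the document co-occurrence weighting in `seed_weights`, the seed-mass rule in `topic_polarity`, and all function names and parameters (`boost`, `alpha`, `beta`) are assumptions made for illustration only.

```python
import random

def seed_weights(docs, seeds, boost=2.0):
    """ASSUMED weighting: 1 + boost * (share of a word's documents that contain a seed word)."""
    seeds = set(seeds)
    total, near = {}, {}
    for doc in docs:
        has_seed = any(w in seeds for w in doc)
        for w in set(doc):
            total[w] = total.get(w, 0) + 1
            near[w] = near.get(w, 0) + (1 if has_seed else 0)
    return {w: 1.0 + boost * near[w] / total[w] for w in total}

def weighted_lda(docs, weights, n_topics=2, alpha=0.5, beta=0.1, iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA where each token adds its word weight instead of 1."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    n_dk = [[0.0] * n_topics for _ in docs]                      # document-topic mass
    n_kw = [dict.fromkeys(vocab, 0.0) for _ in range(n_topics)]  # topic-word mass
    n_k = [0.0] * n_topics                                       # total mass per topic
    z = []                                                       # topic assignment per token
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            wt = weights.get(w, 1.0)
            n_dk[d][k] += wt; n_kw[k][w] += wt; n_k[k] += wt
            zd.append(k)
        z.append(zd)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wt = weights.get(w, 1.0)
                k = z[d][i]
                # Remove the current token's weighted contribution before resampling.
                n_dk[d][k] -= wt; n_kw[k][w] -= wt; n_k[k] -= wt
                probs = [(n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + V * beta)
                         for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=probs)[0]
                z[d][i] = k
                n_dk[d][k] += wt; n_kw[k][w] += wt; n_k[k] += wt
    return n_dk, n_kw

def topic_polarity(n_kw, pos_seeds, neg_seeds):
    """ASSUMED rule: a topic is positive if it holds at least as much positive as negative seed mass."""
    labels = []
    for counts in n_kw:
        pos = sum(counts.get(w, 0.0) for w in pos_seeds)
        neg = sum(counts.get(w, 0.0) for w in neg_seeds)
        labels.append("pos" if pos >= neg else "neg")
    return labels

def doc_sentiment(n_dk, labels):
    """Share of each document's topic mass that falls in positive vs. negative topics."""
    out = []
    for row in n_dk:
        row = [max(c, 0.0) for c in row]  # guard against tiny negative float residue
        total = sum(row) or 1.0
        pos = sum(c for c, lab in zip(row, labels) if lab == "pos") / total
        out.append({"pos": pos, "neg": 1.0 - pos})
    return out

# Toy corpus, invented purely for illustration.
docs = [["good", "great", "screen"], ["bad", "poor", "battery"],
        ["good", "screen", "bright"], ["bad", "battery", "slow"]]
w = seed_weights(docs, ["good", "bad"])
n_dk, n_kw = weighted_lda(docs, w, n_topics=2, iters=30)
print(doc_sentiment(n_dk, topic_polarity(n_kw, ["good"], ["bad"])))
```

The only change from a standard collapsed Gibbs sampler is that each token contributes its word weight to the counts rather than 1, which is one plausible way to let sentiment-bearing words influence sampling more strongly. On a corpus this small the resulting topics are not meaningful; the sketch only shows the data flow.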
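Memo
Received: 2016-06-02.
Funding: Shanxi Province Research Project for Returned Overseas Scholars (2015-045, 2013-033); Shanxi Province Selective Funding Program for Science and Technology Activities of Returned Overseas Scholars (2013); Natural Science Foundation of Shanxi Province (2014011018-2).
About the authors: HAO Jie, female, born in 1992, master's student; her main research interests are natural language processing and rough sets. XIE Jun, female, born in 1979, associate professor; her main research interests are granular computing, rough sets, data mining, and intelligent information processing. SU Jingqiong, female, born in 1991, master's student; her main research interests are natural language processing and granular computing.
Corresponding author: XIE Jun. E-mail: xiejun@tyut.edu.cn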