[1] LI Lingxiao, LI Shaozi, CAO Donglin. Emotional multi-source correlation model for Chinese micro-blog sentiment analysis[J]. CAAI Transactions on Intelligent Systems, 2016, 11(4): 546-553. [doi:10.11992/tis.201605019]

Emotional multi-source correlation model for Chinese micro-blog sentiment analysis

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
11
Issue:
No. 4, 2016
Pages:
546-553
Column:
Publication date:
2016-07-25

Article Info

Title:
Emotional multi-source correlation model for Chinese micro-blog sentiment analysis
Author(s):
LI Lingxiao (1, 2), LI Shaozi (1, 2), CAO Donglin (1, 2)
1. Cognitive Science Department, Xiamen University, Xiamen 361005, Fujian, China;
2. Fujian Key Laboratory of the Brain-like Intelligent Systems, Xiamen University, Xiamen 361005, Fujian, China
Keywords:
multi-modal sentiment analysis; emotional multi-sources; social media; correlation
CLC number:
TP391
DOI:
10.11992/tis.201605019
Abstract:
The explosive growth of social media content has drawn increasing attention to sentiment analysis of public opinion. Unlike traditional text, Sina micro-blog posts carry multiple emotional sources, including sentiment words, emoticons, pictures, and videos. To address the poor timeliness of sentiment lexicons for short Chinese social media texts and to exploit the correlation between emotional sources, this paper proposes an emotional multi-source correlation model (EMCM). The model considers the sentiment words and emoticons in a micro-blog post together with the correlation between them. Building on the classic lexicon-rule voting method, it introduces multiple emotional sources and correlation probabilities, and uses probabilistic modeling to build a correlation model over the two sources, sentiment words and emoticons, in order to classify the sentiment of a post. Experiments on a data set of 6,171 micro-blog posts show that the model reaches a classification accuracy of 85.3%, higher than both the traditional voting model that uses sentiment words and emoticons (83.4%) and an SVM method that uses the same kinds of features (82.9%).
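
The idea summarized in the abstract can be illustrated with a small sketch. This is not the paper's EMCM, whose formulas are not given on this page; it is a simplified stand-in under stated assumptions. It contrasts plain lexicon voting with a variant that estimates the polarity of out-of-lexicon words from the emoticons they co-occur with, then sums log-odds. The toy lexicons, the constant `CUE_PROB`, and the function names `vote`, `estimate_word_probs`, and `correlated_vote` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy lexicons; the paper's real lexicon and emoticon set are not given here.
POSITIVE_WORDS = {"good", "happy", "love"}
NEGATIVE_WORDS = {"bad", "sad", "hate"}
POSITIVE_EMOTICONS = {"[smile]", "[heart]"}
NEGATIVE_EMOTICONS = {"[cry]", "[angry]"}
CUE_PROB = 0.9  # assumed P(positive | known positive cue); 1 - CUE_PROB for negative cues


def label_from_score(score):
    """Map a signed score to a sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def vote(tokens):
    """Baseline lexicon voting: every known sentiment word or emoticon casts one vote."""
    score = 0
    for tok in tokens:
        if tok in POSITIVE_WORDS or tok in POSITIVE_EMOTICONS:
            score += 1
        elif tok in NEGATIVE_WORDS or tok in NEGATIVE_EMOTICONS:
            score -= 1
    return label_from_score(score)


def estimate_word_probs(posts, smoothing=1.0):
    """Estimate P(positive | word) from co-occurrence with emoticons of known polarity.

    Posts with no emoticon, or with both polarities, are skipped as uninformative.
    """
    pos, neg = Counter(), Counter()
    emoticons = POSITIVE_EMOTICONS | NEGATIVE_EMOTICONS
    for tokens in posts:
        has_pos = any(t in POSITIVE_EMOTICONS for t in tokens)
        has_neg = any(t in NEGATIVE_EMOTICONS for t in tokens)
        if has_pos == has_neg:
            continue
        for word in set(tokens) - emoticons:
            (pos if has_pos else neg)[word] += 1
    return {
        w: (pos[w] + smoothing) / (pos[w] + neg[w] + 2 * smoothing)
        for w in set(pos) | set(neg)
    }


def correlated_vote(tokens, word_probs):
    """Sum log-odds over lexicon cues and over words whose polarity was learned from emoticons."""
    log_odds = 0.0
    for tok in tokens:
        if tok in POSITIVE_WORDS or tok in POSITIVE_EMOTICONS:
            p = CUE_PROB
        elif tok in NEGATIVE_WORDS or tok in NEGATIVE_EMOTICONS:
            p = 1.0 - CUE_PROB
        else:
            p = word_probs.get(tok)  # None for words never seen with an emoticon
        if p is not None:
            log_odds += math.log(p / (1.0 - p))
    return label_from_score(log_odds)


# Usage: "yyds" is recent slang missing from the toy lexicon.
posts = [["yyds", "[heart]"], ["yyds", "[smile]"], ["emo", "[cry]"]]
word_probs = estimate_word_probs(posts)
baseline_label = vote(["yyds"])
correlated_label = correlated_vote(["yyds"], word_probs)
```

In this toy run the baseline has no vote to count for the new slang word, while the co-occurrence estimate lets the emoticons stand in for the missing lexicon entry. That is one plausible reading of how emoticon correlation could ease the lexicon timeliness problem; the reported accuracy figures belong to the paper's own model, not to this sketch.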

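Memo

Received: 2016-05-19.
Funding: National Natural Science Foundation of China (61202143, 61305061, 61402386, 61572409); Natural Science Foundation of Fujian Province (2013J05100).
Author biographies: LI Lingxiao, male, born in 1990, master's student; his main research interest is cross-media public opinion analysis. CAO Donglin, male, born in 1977, Ph.D., assistant professor in the Cognitive Science Department, Xiamen University; his main research interests are natural language processing, information retrieval, cross-media public opinion analysis, computer vision, and pattern recognition. LI Shaozi, male, born in 1963, Ph.D., professor and doctoral supervisor; his main research interests include artificial intelligence and its applications, computer vision and machine learning, moving object detection and recognition, and cross-media public opinion analysis. He has led a number of national, provincial, and municipal research projects, received two provincial third-class science and technology awards, and published more than 200 academic papers, of which 27 are indexed by SCI and 171 by EI.
Corresponding author: CAO Donglin. E-mail: another@xmu.edu.cn.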
Last Update: 1900-01-01