[1] WU Zhongqiang, ZHANG Yaowen, SHANG Lin. Multi-view sentiment classification of microblogs based on semantic features[J]. CAAI Transactions on Intelligent Systems, 2017, 12(05): 745-751. [doi:10.11992/tis.201706026]

Multi-view sentiment classification of microblogs based on semantic features

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Vol. 12
Issue:
No. 05, 2017
Pages:
745-751
Column:
Publication date:
2017-10-25

Article Info

Title:
Multi-view sentiment classification of microblogs based on semantic features
Author(s):
WU Zhongqiang 1,2; ZHANG Yaowen 1,2; SHANG Lin 1,2
1. State Key Laboratory of Novel Software Technology, Nanjing University, Nanjing 210046, China;
2. Department of Computer Science and Technology, Nanjing University, Nanjing 210046, China
Keywords:
sentiment analysis; text mining; latent semantic analysis; multi-view; semantic features; feature fusion; feature extraction
CLC number:
TP181
DOI:
10.11992/tis.201706026
Abstract:
Sentiment analysis, also known as opinion mining, is the task of analyzing the sentiment orientation expressed in text. Most existing sentiment analysis work relies on text alone. On microblogs, however, the many images posted alongside the text also carry rich sentiment information. In this paper, we propose a multi-view classification method based on text and images. Using latent semantic analysis, we map the text features and the image features separately into semantic spaces of the same dimensionality to obtain their respective semantic features, and then classify them with SVM-2K. We ran experiments on microblog posts, each with text and accompanying images, crawled from the popular-posts section of Sina Weibo. The results show that fusing the semantic features of text and images yields better sentiment classification than using text features or image features alone.
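
The mapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature matrices are random toy data, the helper name `lsa_project`, the feature dimensions, and the target dimensionality `k` are all assumptions made for illustration, and latent semantic analysis is approximated here by a plain truncated SVD in NumPy. The SVM-2K classifier itself is not implemented; the two outputs only show what the two equally sized views passed to such a classifier might look like.

```python
import numpy as np


def lsa_project(X, k):
    """Map the rows of X (samples x features) into a k-dimensional
    latent space using a truncated singular value decomposition.

    Hypothetical helper for illustration; requires k <= min(X.shape).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # U[:, :k] * s[:k] is the same as X @ Vt[:k].T, i.e. the samples
    # expressed in the basis of the top-k right singular vectors.
    return U[:, :k] * s[:k]


# Toy data (assumed shapes): 6 posts, each with a 20-dim text feature
# vector and a 50-dim image feature vector.
rng = np.random.default_rng(0)
text_feats = rng.random((6, 20))
image_feats = rng.random((6, 50))

k = 3  # shared dimensionality of both semantic spaces (assumed value)
text_sem = lsa_project(text_feats, k)    # text view
image_sem = lsa_project(image_feats, k)  # image view

print(text_sem.shape, image_sem.shape)  # (6, 3) (6, 3)
```

Both views end up with the same number of columns even though the raw text and image features differ in size, which is the property the abstract relies on before the two views are handed to a two-view classifier.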



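Memo

Received: 2017-06-08.
Funding: National Natural Science Foundation of China (61672276); Natural Science Foundation of Jiangsu Province (20161406).
Author biographies: WU Zhongqiang, male, born in 1992, master's student; main research interests are text mining and sentiment analysis. ZHANG Yaowen, male, born in 1989, master's student; main research interests are text mining and sentiment analysis. SHANG Lin, female, born in 1973, associate professor, Ph.D.; main research interests include computational intelligence, machine learning, and text mining.
Corresponding author: WU Zhongqiang. E-mail: wuzqchom@163.com
Last Update: 2017-10-25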