[1] ZHAO Jun, WANG Hong. Detection of fake reviews based on emotional orientation and logistic regression[J]. CAAI Transactions on Intelligent Systems, 2016, 11(3): 336-342. [doi:10.11992/tis.201603027]

Detection of fake reviews based on emotional orientation and logistic regression

Editorial Office of CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
11
Issue:
2016, No. 3
Pages:
336-342
Publication date:
2016-06-25

Article Info

Title:
Detection of fake reviews based on emotional orientation and logistic regression
Author(s):
ZHAO Jun (1, 2), WANG Hong (1, 2)
1. School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China;
2. Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan 250014, China
Keywords:
electronic commerce; fake review; shopping behavior; emotional polarity; logistic regression
CLC number:
TP39
DOI:
10.11992/tis.201603027
Abstract:
Online shopping reviews give consumers useful information for comparing product quality and other purchase-related characteristics. However, many spammers, driven by profit, write fake or unfair reviews to mislead consumers. Previous studies have generally detected fake reviews through text similarity and rating patterns. These algorithms can identify certain types of attackers, but in real-world settings many spammers deliberately imitate normal users when reviewing products, so the earlier algorithms perform poorly against this kind of attack. In this paper, we analyze the sentiment polarity of review text, extract several features, and use a logistic regression model to detect fake reviews. First, we apply natural language processing techniques to analyze the sentiment polarity of each review and measure how far each user's sentiment deviates from that of the general public. The greater the deviation, the greater the probability that the user is a spammer. Then, we select several other important features and combine them with the logistic regression model to detect fake reviews. Comparative experiments show that the proposed method achieves good detection results.
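The two steps described in the abstract (a sentiment-deviation feature, then logistic regression) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy word lexicon, the per-review (rather than per-user) deviation, the function names, and the gradient-descent settings are all hypothetical, and the "other important features" that the abstract mentions are not specified there, so only the deviation feature is used here.

```python
import math
from collections import defaultdict

# Hypothetical toy lexicon; a real system would use a proper sentiment analyzer.
LEXICON = {"good": 1.0, "great": 1.0, "excellent": 1.0,
           "bad": -1.0, "poor": -1.0, "terrible": -1.0}


def polarity(text):
    """Mean lexicon score of the matched words, in [-1, 1]; 0.0 if none match."""
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def sentiment_deviation(reviews):
    """For each (product, text) pair, return |polarity - mean polarity of that product|."""
    by_product = defaultdict(list)
    for product, text in reviews:
        by_product[product].append(polarity(text))
    means = {p: sum(v) / len(v) for p, v in by_product.items()}
    return [abs(polarity(text) - means[product]) for product, text in reviews]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Estimated probability that feature vector x belongs to a fake review."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Made-up example data: label 1 marks the review that contradicts the crowd.
reviews = [
    ("A", "great phone good battery"),
    ("A", "good screen excellent camera"),
    ("A", "great value"),
    ("A", "terrible poor bad"),
    ("B", "bad battery poor screen"),
    ("B", "poor build terrible support"),
    ("B", "bad value"),
    ("B", "excellent great good"),
]
labels = [0, 0, 0, 1, 0, 0, 0, 1]

deviations = sentiment_deviation(reviews)
X = [[d] for d in deviations]  # further features would be appended as extra columns
w, b = fit_logistic(X, labels)
for (product, text), d, x in zip(reviews, deviations, X):
    print(product, f"deviation={d:.2f}", f"p_fake={predict_proba(w, b, x):.2f}")
```

On this toy data, reviews whose polarity lies far from their product's average receive a higher estimated probability of being fake, which mirrors the abstract's claim that larger deviation implies a higher spammer probability. In practice a tested library implementation of logistic regression would normally replace the hand-written gradient descent.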



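Memo

Received: 2016-03-17; revised: (not given).
Funding: National Natural Science Foundation of China (61373149, 61472233); Shandong Province Science and Technology Program (2012GGX10118, 2014GGX101026); Shandong Province Education Science Planning Project (ZK1437B010).
About the authors: ZHAO Jun, male, born in 1989, master's student. His main research interests are big data, data mining, and machine learning. WANG Hong, female, born in 1966, professor and doctoral supervisor. Her main research interests are big data, complex networks, and data mining. She has led 1 National Natural Science Foundation project, participated in 3 National Natural Science Foundation projects, led 6 provincial-level funded projects, and published 43 academic papers.
Corresponding author: WANG Hong. E-mail: wanghong106@163.com.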