[1] ZHAO Jun, WANG Hong. Detection of fake reviews based on emotional orientation and logistic regression[J]. CAAI Transactions on Intelligent Systems, 2016, 11(3): 336-342. [doi:10.11992/tis.201603027]
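CAAI Transactions on Intelligent Systems, Editorial Office [ISSN 1673-4785/CN 23-1538/TP] Volume:
11
Issue:
No. 3, 2016
Pages:
336-342
Section:
Academic Papers: Intelligent Systems
Publication date:
2016-06-25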
- Title:
-
Detection of fake reviews based on emotional orientation and logistic regression
- Author(s):
-
ZHAO Jun1,2, WANG Hong1,2
-
1. School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China;
2. Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan 250014, China
-
- Keywords:
-
electronic commerce; fake review; shopping behavior; emotional polarity; logistic regression
- CLC number:
-
TP39
- DOI:
-
10.11992/tis.201603027
- Abstract:
-
Online shopping reviews give consumers useful information for comparing product quality and other purchase-related characteristics. However, many spammers, driven by profit, write fake or unfair reviews to mislead consumers. Earlier studies generally detected fake reviews by examining text similarity and rating patterns. These algorithms can identify certain types of attackers, but in real-world settings many spammers deliberately imitate normal users when reviewing products, so the earlier algorithms detect this kind of attack poorly. In this paper, we analyze the sentiment polarity of review text, extract several features, and use a logistic regression model to detect fake reviews. First, we use natural language processing techniques to analyze the sentiment polarity of each review and measure how far each user's sentiment departs from that of the general public. The greater the deviation, the higher the probability that the user is a spammer. Then, we select several other important features and combine them with the logistic regression model to identify fake reviews. Comparative experiments show that the proposed method achieves good detection performance.
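The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the polarity scores are assumed to be already available in [-1, 1], the second feature (`burstiness`) and all toy values and labels are hypothetical, and a plain gradient-descent logistic regression stands in for whatever training procedure the paper uses.

```python
import math
from collections import defaultdict


def sentiment_deviation(reviews):
    """Absolute deviation of each review's polarity from its product's mean.

    `reviews` is a list of (product_id, polarity) pairs; polarity values are
    assumed to come from some upstream sentiment analyzer (not shown here).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for product_id, polarity in reviews:
        totals[product_id][0] += polarity
        totals[product_id][1] += 1
    means = {p: s / n for p, (s, n) in totals.items()}
    return [abs(polarity - means[p]) for p, polarity in reviews]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, w, b):
    """Estimated probability that feature vector x belongs to a fake review."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical toy data: (product, polarity) pairs, one extra made-up
# behavioral feature per review, and hand-assigned labels (1 = fake).
reviews = [("p1", 0.8), ("p1", 0.7), ("p1", -0.9),
           ("p2", 0.6), ("p2", 0.5), ("p2", -0.8)]
burstiness = [0.1, 0.2, 0.9, 0.1, 0.1, 0.8]
labels = [0, 0, 1, 0, 0, 1]

deviations = sentiment_deviation(reviews)
X = [[dev, burst] for dev, burst in zip(deviations, burstiness)]
w, b = train_logistic_regression(X, labels)

for (product, polarity), x in zip(reviews, X):
    print(product, polarity, round(predict_proba(x, w, b), 3))
```

In this sketch the deviation is computed per product, so a review counts as unusual only relative to other reviews of the same item. Whether the paper aggregates per product or per user is not stated in the abstract, so that choice is an assumption here.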
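Memo
Received: 2016-03-17; Revised: .
Funding: National Natural Science Foundation of China (61373149, 61472233); Shandong Provincial Science and Technology Program (2012GGX10118, 2014GGX101026); Shandong Provincial Education Science Planning Project (ZK1437B010).
About the authors: ZHAO Jun, male, born in 1989, master's student. His main research interests are big data, data mining, and machine learning. WANG Hong, female, born in 1966, professor and doctoral supervisor. Her main research interests are big data, complex networks, and data mining. She has led 1 National Natural Science Foundation project, participated in 3 National Natural Science Foundation projects, led 6 provincial-level funded projects, and published 43 academic papers.
Corresponding author: WANG Hong. E-mail: wanghong106@163.com.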
Last Update:
1900-01-01