[1] QIN Haifei, DU Junping. Feature mining based on online hotel review[J]. CAAI Transactions on Intelligent Systems, 2018, 13(6): 1006-1014. [doi:10.11992/tis.201806016]
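CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 13
Issue: No. 6, 2018
Pages: 1006-1014
Column: Academic Papers, Foundations of Artificial Intelligence
Publication date: 2018-10-25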
- Title:
-
Feature mining based on online hotel review
- Author(s):
-
QIN Haifei (秦海菲)1, DU Junping (杜军平)2
-
1. School of Information Science and Technology, Chuxiong Normal University, Chuxiong 675000, Yunnan, China;
2. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
-
- Keywords:
-
hotel; online review; data capture; feature extraction; feature mining; cluster analysis; classification; intelligent recommendation
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.201806016
- Abstract:
-
This paper studies feature mining from online hotel review data. Starting from the acquisition of online hotel reviews, the work proceeds through data cleaning, part-of-speech analysis, feature extraction, index determination, feature selection, feature determination, and feature verification to mine features from the review data. Building on word frequency and combining it with part-of-speech analysis and cluster analysis, the paper applies six indices to reduce the dimensionality of the candidate feature words: word frequency count (TF), word frequency rate (TF1), word frequency weight (TTW), comment frequency (DF), inverse document frequency (IDF), and TF1-IDF. This yields the features of the online hotel review data, which are then verified, completing the feature mining process. The results lay a foundation for review-based customer classification, hotel classification, and intelligent recommendation.
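The abstract names the indices but does not give their formulas, so the following is only a minimal sketch under assumed, textbook-style definitions: TF is taken as a word's total count over all reviews, TF1 as that count divided by the total number of tokens, DF as the fraction of reviews containing the word, IDF as log(N / number of reviews containing the word), and TF1-IDF as their product. TTW is left out because the abstract does not define it. The function name `score_candidates` and the toy reviews are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def score_candidates(reviews):
    """Score candidate feature words.

    reviews: list of token lists, assumed already cleaned and
    filtered by part of speech (hypothetical input format).
    All formulas below are assumed definitions, not the paper's.
    """
    n_docs = len(reviews)
    tf = Counter()        # TF: total occurrences across all reviews
    doc_count = Counter()  # number of reviews that contain the word
    for tokens in reviews:
        tf.update(tokens)
        doc_count.update(set(tokens))  # count each word once per review

    total_tokens = sum(tf.values())
    scores = {}
    for word, count in tf.items():
        tf1 = count / total_tokens                 # TF1: relative frequency
        idf = math.log(n_docs / doc_count[word])   # IDF: natural log, unsmoothed
        scores[word] = {
            "TF": count,
            "TF1": tf1,
            "DF": doc_count[word] / n_docs,
            "IDF": idf,
            "TF1-IDF": tf1 * idf,
        }
    return scores


# Toy data for illustration only.
reviews = [
    ["room", "clean", "service"],
    ["room", "breakfast"],
    ["service", "room", "room"],
]
scores = score_candidates(reviews)
for word, s in sorted(scores.items(), key=lambda kv: -kv[1]["TF"]):
    print(word, s)
```

With these assumed definitions, a word that appears in every review (here "room") gets an IDF of zero. No single index ranks candidates well on its own, so several of them are typically inspected together when pruning the candidate list.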
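Memo
Received: 2018-06-05.
Funding: National Natural Science Foundation of China (61320106006, 61532006, 61772083).
About the authors: QIN Haifei, female, born in 1980, associate professor. Her main research interests are databases, data warehouses, and data mining. DU Junping, female, born in 1963, professor and doctoral supervisor. Her main research interests are artificial intelligence, social network analysis, data mining, and motion image processing. She has led a number of projects, including national "863" and "973" program projects, a key project of the National Natural Science Foundation of China, a major international cooperation project of the National Natural Science Foundation of China, and a key project of the Beijing Natural Science Foundation, and has published many academic papers.
Corresponding author: DU Junping. E-mail: junpingdu@126.com
Last Update: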
2018-12-25