[1] WANG Xueping, LIN Jiaxiang, WU Jianwei, et al. Adaptive-association-rule mining algorithm based on determination coefficient[J]. CAAI Transactions on Intelligent Systems, 2020, 15(2): 352-359. [doi:10.11992/tis.201809030]
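CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
15
Issue:
No. 2, 2020
Pages:
352-359
Section:
Academic Papers: Fundamentals of Artificial Intelligence
Publication date:
2020-03-05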
- Title:
-
Adaptive-association-rule mining algorithm based on determination coefficient
- Author(s):
-
WANG Xueping¹, LIN Jiaxiang¹, WU Jianwei², GAO Minjie¹
-
1. College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China;
2. Third Institute of Oceanography, Ministry of Natural Resources, Xiamen 361001, China
- Keywords:
-
association rule; order; adaptive; coefficient of determination; rule; support; confidence; curve fitting; polynomial; data mining
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.201809030
- Abstract:
-
The two-stage approach to association rule mining, built around frequent itemset generation followed by rule generation, requires the user to specify the minimum support and minimum confidence thresholds manually from prior knowledge. To overcome this defect, this paper proposes AARM_BR (Adaptation Association Rule Mining Based on Determination Coefficient R²), an algorithm that applies curve fitting to support counts and confidence values and uses the coefficient of determination to determine automatically the order of the curve and the corresponding polynomial, from which the support and confidence thresholds are obtained. Because AARM_BR is driven by the data, these thresholds do not have to be supplied by the user. Experiments on two standard datasets, Trolley and Groceries, show that, compared with a recently published method, the proposed method is more data-dependent and requires no manually specified polynomial order, support threshold, or confidence threshold when the user has no prior knowledge.
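The abstract's central idea, letting the coefficient of determination pick the polynomial order, can be illustrated with a minimal sketch. This is not the authors' AARM_BR implementation: the function names (`polyfit`, `r_squared`, `choose_degree`), the R² target of 0.99, the maximum degree, and the toy data are all assumptions made for illustration. The abstract does not say how the thresholds are read off the fitted curve, so that step is left out.

```python
def polyfit(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... + c[degree]*x**degree.

    Solves the normal equations by Gaussian elimination with partial
    pivoting (adequate for the low degrees used in this sketch).
    """
    n = degree + 1
    a = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def polyval(coeffs, x):
    """Evaluate the polynomial at x (coefficients in ascending order)."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def r_squared(ys, fitted):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot  # assumes ys is not constant


def choose_degree(xs, ys, r2_target=0.99, max_degree=6):
    """Return (degree, coeffs, r2) for the lowest degree reaching r2_target.

    r2_target and max_degree are assumed stopping rules, not taken
    from the paper. If no degree reaches the target, the result for
    max_degree is returned.
    """
    result = None
    for degree in range(1, max_degree + 1):
        coeffs = polyfit(xs, ys, degree)
        r2 = r_squared(ys, [polyval(coeffs, x) for x in xs])
        result = (degree, coeffs, r2)
        if r2 >= r2_target:
            break
    return result


# Toy data standing in for a ranked support-count curve (hypothetical).
xs = list(range(11))
ys = [(x - 5) ** 2 for x in xs]
degree, coeffs, r2 = choose_degree(xs, ys)
print(degree, round(r2, 4))
```

On this toy data a straight line explains none of the variance, so the loop moves on and stops at degree 2. With NumPy available, `numpy.polyfit` could replace the hand-written solver.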
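Memo
Received: 2018-09-15.
Funding: National Natural Science Foundation of China (41401458); Natural Science Foundation of Fujian Province (2018J01644, 2018J01645, 2016J01753); China-ASEAN Maritime Cooperation Fund (2020399); Project of the Third Institute of Oceanography, State Oceanic Administration (2016020); Fujian Province Education and Research Project for Young and Middle-aged Teachers (JT180129)
About the authors: WANG Xueping, lecturer. Main research interests: data mining and pattern recognition. Has led 1 provincial research project and participated in more than 10 provincial research projects. LIN Jiaxiang, Ph.D. Main research interests: spatial data mining, artificial intelligence, and big data. Has led 4 national and provincial/ministerial research projects and participated in more than 20 provincial/ministerial research projects; received 1 second prize of the Fujian Provincial Science and Technology Award, 2 granted national invention patents, and 5 national computer software copyright registrations. Has published more than 40 academic papers. WU Jianwei, engineer, Ph.D. Main research interests: marine environmental management information systems, spatial data mining, and marine big data analysis. Has led or participated in more than 10 national and provincial/ministerial research projects. Has published more than 10 academic papers.
Corresponding author: WANG Xueping, E-mail: gggfvgu@163.com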
Last Update:
1900-01-01