[1] HU Xue-gang, ZHANG Yuan-yuan. An improved algorithm for mining sequential patterns with time constraints[J]. CAAI Transactions on Intelligent Systems, 2007, 2(02): 89-93.

An improved algorithm for mining sequential patterns with time constraints

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Vol. 2
Issue:
2007, No. 02
Pages:
89-93
Column:
Publication date:
2007-04-25

Article Info

Title:
An improved algorithm for mining sequential patterns with time constraints
Article ID:
1673-4785(2007)02-0089-05
Author(s):
HU Xue-gang (胡学钢), ZHANG Yuan-yuan (张圆圆)
School of Computer and Information, Hefei University of Technology, Hefei 230009, China
Keywords:
data mining; sequential pattern; time constraint
CLC number:
TP182
Document code:
A
Abstract:
An improved algorithm, TSPM, is proposed for mining sequential patterns with time constraints. It addresses the shortcomings of traditional sequential pattern mining methods: high time and space overhead, and a very large number of results that lack pertinence. The algorithm introduces a graph structure to represent the frequent 2-sequences. It needs to scan the database only once to map the information relevant to the mining task into the graph, and the graph representation lets the mining process make full use of the order relations between items, which improves the efficiency of generating frequent sequences. In addition, the algorithm uses the positional information of sequences to compute support, which reduces the complexity of handling time constraints and avoids repeatedly testing whether a candidate sequence is contained in a data sequence. Experimental results show that the algorithm outperforms traditional sequential pattern discovery algorithms in both time and space.
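
The abstract gives no pseudocode, so the following is only a minimal Python sketch of the two ideas it names, not the authors' TSPM. It assumes each data sequence is a list of (time, item) events and that the time constraint is a maximum gap between consecutive items; the names `build_pair_graph`, `support_of_path`, `min_support`, and `max_gap` are hypothetical. One pass over the database records, for every ordered item pair, the positions at which it occurs; longer sequences are then counted by joining those positions instead of re-testing containment against the data.

```python
from collections import defaultdict


def build_pair_graph(db, min_support, max_gap):
    """Scan the database once and return the frequent 2-sequence graph.

    db maps a sequence id to a list of (time, item) events.
    The result maps an edge (a, b) to {sid: [(time_of_a, time_of_b), ...]}
    and keeps only edges found in at least min_support sequences.
    """
    occurrences = defaultdict(lambda: defaultdict(list))
    for sid, events in db.items():
        events = sorted(events)  # order events by time
        for i, (t_a, a) in enumerate(events):
            for t_b, b in events[i + 1:]:
                gap = t_b - t_a
                if gap > max_gap:
                    break  # later events can only be further away
                if gap > 0:  # b must occur strictly after a
                    occurrences[(a, b)][sid].append((t_a, t_b))
    return {edge: occ for edge, occ in occurrences.items()
            if len(occ) >= min_support}


def support_of_path(graph, path):
    """Count the sequences supporting path (three or more items also work).

    Works purely on the stored positions: an occurrence of (a, b) is
    extended by an occurrence of (b, c) that starts at the same time of b.
    A path that uses an edge missing from the graph gets support 0.
    """
    first = graph.get((path[0], path[1]), {})
    frontier = {sid: {t_b for _, t_b in pairs} for sid, pairs in first.items()}
    for a, b in zip(path[1:], path[2:]):
        edge = graph.get((a, b), {})
        extended = {}
        for sid, ends in frontier.items():
            new_ends = {t_b for t_a, t_b in edge.get(sid, []) if t_a in ends}
            if new_ends:
                extended[sid] = new_ends
        frontier = extended
    return len(frontier)


# Toy database (made up for illustration): sequence id -> (time, item) events.
db = {
    1: [(1, "a"), (2, "b"), (4, "c")],
    2: [(1, "a"), (3, "b"), (9, "c")],
    3: [(1, "b"), (2, "a"), (3, "b"), (5, "c")],
}
graph = build_pair_graph(db, min_support=2, max_gap=3)
abc_support = support_of_path(graph, ["a", "b", "c"])
```

In this toy run, sequence 2 does not support a, b, c because its gap between b and c exceeds `max_gap`. Storing positions per edge costs memory in exchange for not rescanning the database, which is the kind of trade-off the abstract describes.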


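Memo

Received: 2006-06-20.
Funding: Supported by the Natural Science Foundation of Anhui Province (050420207).
About the authors:
HU Xue-gang, male, born in 1961, professor. His main research interests are knowledge engineering, data mining, and data structures. He has led or participated in a number of projects funded by the National Natural Science Foundation, the State Education Commission doctoral program special fund, the Anhui Provincial Natural Science Foundation, and the Anhui Provincial Education Commission fund. He has published more than 20 papers and several books. E-mail: jsjxhuxg@hfut.edu.cn.
ZHANG Yuan-yuan, female, born in 1982, master's student. Her main research interests are knowledge engineering and data mining. E-mail: zhangyuanyuan0401@sina.com.
Last Update: 2009-05-06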