ZHU Lin, WANG Shi-tong, XIU Yu. A robust clustering algorithm with fuzzy directional similarity[J]. CAAI Transactions on Intelligent Systems, 2008, 3(01): 43-50.

A robust clustering algorithm with fuzzy directional similarity

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
3
Issue:
2008, No. 01
Pages:
43-50
Publication date:
2008-02-25

Article Info

Title:
A robust clustering algorithm with fuzzy directional similarity
Article ID:
1673-4785(2008)01-0043-08
Author(s):
ZHU Lin, WANG Shi-tong, XIU Yu
School of Information Engineering, Jiangnan University, Wuxi 214122, Jiangsu, China
Keywords:
clustering algorithm; directional similarity; robustness; competitive learning
CLC number:
TP391.41
Document code:
A
Abstract:
Text data have the characteristics of directional data: each cluster center in a text dataset has a direction that differs from that of every other cluster center, so knowledge about directional data can be used to cluster text. On this basis, a fuzzy directional similarity clustering algorithm (FDSC) is proposed. Then, from the viewpoint of competitive learning, a membership constraint function is introduced into the objective function, and a robust fuzzy directional similarity clustering algorithm (RFDSC) is derived using Lagrangian optimization theory. The robustness and convergence of RFDSC are analyzed from the same competitive learning viewpoint. Experimental results on text datasets show that RFDSC clusters text data quickly and effectively.
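The abstract does not give the FDSC or RFDSC update equations, so the sketch below is not the authors' algorithm. It only illustrates the general idea of fuzzy clustering on directional data, assuming a standard fuzzy c-means scheme in which vectors are scaled to unit length and the dissimilarity is 1 minus cosine similarity. The function names, the fuzzifier `m`, the iteration count, and the sample vectors are all hypothetical choices made for illustration; the membership constraint term that makes RFDSC robust is not modeled.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean length so only its direction matters."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def compute_memberships(points, centers, m, eps=1e-9):
    """Fuzzy c-means memberships using the dissimilarity d = 1 - cosine."""
    exponent = 1.0 / (m - 1.0)
    rows = []
    for p in points:
        # Floor d at eps so a point lying exactly on a center does not divide by zero.
        weights = [(1.0 / max(1.0 - cosine(p, c), eps)) ** exponent for c in centers]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def fuzzy_directional_clustering(data, init_centers, m=2.0, iters=50):
    """Illustrative fuzzy clustering on the unit sphere (assumption, not FDSC/RFDSC)."""
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    points = [normalize(p) for p in data]
    centers = [normalize(c) for c in init_centers]
    for _ in range(iters):
        memberships = compute_memberships(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            # Membership-weighted sum of the points, projected back onto the sphere.
            acc = [0.0] * len(points[0])
            for p, u in zip(points, memberships):
                w = u[k] ** m
                for j, x in enumerate(p):
                    acc[j] += w * x
            new_centers.append(normalize(acc))
        centers = new_centers
    # Recompute memberships so they match the centers that are returned.
    return centers, compute_memberships(points, centers, m)


# Toy term-frequency vectors: the first three point in one direction, the last three in another.
docs = [[5, 1, 0], [10, 2, 0], [4, 1, 0], [0, 1, 5], [0, 2, 9], [1, 0, 6]]
centers, memberships = fuzzy_directional_clustering(docs, [[1, 0, 0], [0, 0, 1]])
for doc, row in zip(docs, memberships):
    print(doc, [round(u, 3) for u in row])
```

Because every vector is normalized first, `[5, 1, 0]` and `[10, 2, 0]` receive the same memberships: document length is ignored and only direction counts, which is the property of text data that the abstract relies on.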



Memo
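Received: 2007-05-14.
Funding:
National "863" Program of China (2006AA10Z313);
National Natural Science Foundation of China (60773206; 60704047);
National Defense Applied Basic Research Foundation (A142046 1266);
Key Scientific Research Foundation of the Ministry of Education (105087)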
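About the authors:
ZHU Lin, male, born in 1983, master's student. His main research interests are image processing and pattern recognition.
WANG Shi-tong, male, born in 1964, professor and doctoral supervisor, senior member of the China Computer Federation. His main research interests are artificial intelligence, pattern recognition, data mining, neural networks, and bioinformatics.
XIU Yu, male, born in 1976, master's student. His main research interests are pattern recognition and data mining.
Corresponding author: WANG Shi-tong. E-mail: wxwangst@yahoo.com.cn.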
Last Update: 2009-05-10