[1] HUANG Dong, WANG Changdong, LAI Jianhuang, et al. Clustering ensemble by decision weighting[J]. CAAI Transactions on Intelligent Systems, 2016, 11(3): 418-425. [doi:10.11992/tis.201603030]

Clustering ensemble by decision weighting

Editorial Office of CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
11
Issue:
No. 3, 2016
Pages:
418-425
Publication date:
2016-06-25

Article Info

Title:
Clustering ensemble by decision weighting
Author(s):
HUANG Dong (1), WANG Changdong (2, 3), LAI Jianhuang (2, 3), LIANG Yun (1), BIAN Shan (1), CHEN Yu (1)
1. College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510640, China;
2. School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China;
3. Guangdong Key Laboratory of Information Security Technology, Guangzhou 510006, China
Keywords:
clustering; clustering ensemble; decision weighting; bipartite graph formulation; graph partitioning; base clustering; credit sharing; weighted clustering ensemble
CLC number:
TP18
DOI:
10.11992/tis.201603030
Abstract:
Clustering ensemble techniques aim to combine multiple base clusterings into a better and more robust clustering result. To estimate the reliability of the base clusterings and weight them accordingly, we propose a clustering ensemble approach based on a bipartite graph formulation and a decision weighting strategy. Each base clustering is treated as a bag of link decisions and is assigned one unit of credit, which is shared (divided) among all the decisions in that clustering. Using this credit sharing concept, the decisions in each base clustering are weighted according to the credit they hold. The decision weights are then incorporated into a unified bipartite graph model, and the final clustering is obtained by partitioning the graph with a fast bipartite graph partitioning algorithm. Experimental results show that the proposed algorithm outperforms the compared methods in both clustering quality and computational efficiency.
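The credit sharing idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the following are assumptions made only for illustration: a "decision" is taken to be an unordered pair of objects placed in the same cluster, the unit of credit is split equally among those pairs, and each object-to-cluster edge of the bipartite graph carries that per-decision weight. The function names `decision_weight` and `weighted_bipartite_edges` are hypothetical, and the graph partitioning step is not shown.

```python
from collections import Counter


def decision_weight(labels):
    """Per-decision weight for one base clustering.

    Assumption (not from the source): a decision is an unordered pair of
    objects in the same cluster, and the clustering's one unit of credit
    is divided equally among all such pairs.
    """
    sizes = Counter(labels).values()
    n_decisions = sum(s * (s - 1) // 2 for s in sizes)
    return 1.0 / n_decisions if n_decisions else 0.0


def weighted_bipartite_edges(base_clusterings):
    """Build (object, cluster_node, weight) edges of an object-cluster graph.

    A cluster node is identified by (clustering index, cluster label).
    Using the decision weight as the edge weight is an illustrative choice.
    """
    edges = []
    for m, labels in enumerate(base_clusterings):
        w = decision_weight(labels)
        for obj, lab in enumerate(labels):
            edges.append((obj, (m, lab), w))
    return edges


base_clusterings = [
    [0, 0, 0, 1, 1, 1],  # two clusters of three objects: 3 + 3 = 6 pairs
    [0, 0, 0, 0, 0, 1],  # one cluster of five objects and a singleton: 10 pairs
]
weights = [decision_weight(c) for c in base_clusterings]
edges = weighted_bipartite_edges(base_clusterings)
```

Under these assumptions, a clustering that makes many link decisions (for example, one dominated by a single large cluster) spreads its credit thinly, so each of its decisions contributes less to the bipartite graph than a decision from a clustering that makes fewer of them.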

Memo
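Received: 2016-03-18.
Funding: National Natural Science Foundation of China (61573387, 61502543); Natural Science Foundation of Guangdong Province, Distinguished Young Scholars program (16050000051); Natural Science Foundation of Guangdong Province, Doctoral Start-up program (2016A030310457, 2015A030310450, 2014A030310180); Science and Technology Planning Project of Guangdong Province (2015A020209124, 2015B010108001); Science and Technology Planning Project of Guangzhou (201508010032); Fundamental Research Funds for the Central Universities (16lgzd15); Special Fund for Cultivating Young Scientific and Technological Talents of South China Agricultural University.
About the authors: HUANG Dong, male, born in 1987, lecturer. His main research interests are data mining and pattern recognition, and he has published more than 10 academic papers. WANG Changdong, male, born in 1984, lecturer. His main research interests are nonlinear clustering, social networks, and big data analysis, and he has published more than 40 academic papers. LAI Jianhuang, male, born in 1964, professor, doctoral supervisor, Ph.D., president of the Guangdong Society of Image and Graphics and standing director of the China Society of Image and Graphics. His main research interests are biometric recognition, digital image processing, pattern recognition, and machine learning. He has led one key project of the joint fund of the National Natural Science Foundation of China and Guangdong Province, one science and technology support project of the Ministry of Science and Technology, and four National Natural Science Foundation of China projects, and has published nearly 200 academic papers.
Corresponding author: WANG Changdong. E-mail: changdongwang@hotmail.com.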