[1]赵晓晓,周治平.结合稀疏表示与约束传递的半监督谱聚类算法[J].智能系统学报,2018,13(05):855-863.[doi:10.11992/tis.201703013]
 ZHAO Xiaoxiao,ZHOU Zhiping.A semi-supervised spectral clustering algorithm combined with sparse representation and constraint propagation[J].CAAI Transactions on Intelligent Systems,2018,13(05):855-863.[doi:10.11992/tis.201703013]
点击复制

结合稀疏表示与约束传递的半监督谱聚类算法(/HTML)
分享到:

《智能系统学报》[ISSN:1673-4785/CN:23-1538/TP]

卷:
第13卷
期数:
2018年05期
页码:
855-863
栏目:
出版日期:
2018-09-05

文章信息/Info

Title:
A semi-supervised spectral clustering algorithm combined with sparse representation and constraint propagation
作者:
赵晓晓 周治平
江南大学 物联网技术应用教育部工程研究中心, 江苏 无锡 214122
Author(s):
ZHAO Xiaoxiao ZHOU Zhiping
Engineering Research Center of Internet of Things Technology Applications Ministry of Education, Jiangnan University, Wuxi 214122, China
关键词:
数据挖掘聚类分析谱聚类半监督学习稀疏表示约束传递
Keywords:
data miningcluster analysisspectral clusteringsemi-supervised learningsparse representationconstraint propagation
分类号:
TP18
DOI:
10.11992/tis.201703013
摘要:
针对半监督谱聚类不能有效处理大规模数据,没有考虑约束传递不能充分利用有限约束信息的问题,提出一种结合稀疏表示和约束传递的半监督谱聚类算法。首先,根据约束信息生成约束矩阵,将其引入到谱聚类中;然后,将约束集合中的数据作为地标点构造稀疏表示矩阵,近似获得图相似度矩阵,从而改进约束谱聚类模型;同时,根据地标点的相似度矩阵生成连通区域,在每个连通区域内动态调整近邻点,利用约束传递进一步提高聚类准确率。实验表明,所提算法和约束谱聚类相比,在算法效率方面具有明显优势,且准确率没有明显下降;和快速谱聚类方法相比,在聚类准确率上有所提升。
Abstract:
The semi-supervised spectral clustering algorithm does not deal with large-scale datasets effectively and does not fully utilize the constraint information because it does not consider the constraint propagation. To address these drawbacks, this paper proposes a semi-supervised spectral clustering algorithm that combines sparse representation and constraint propagation. The algorithm first generates the constraint matrix according to the constraint information, introduces it into the spectral clustering, and then constructs a sparse representation matrix by taking the data points in the constrained sets as the landmarks to approximate the graph similarity matrix, thereby revising the constrained spectral clustering model. Meanwhile, the connected region is generated according to the similarity matrix of the landmark data points, and the neighboring nodes are dynamically adjusted in each connected region. The clustering accuracy is further improved using the constraint propagation. Experimental results show that the proposed method is more efficient than constrained spectral clustering algorithms, and their accuracy levels are similar. Moreover, its clustering accuracy exceeds those of the fast spectral clustering algorithms.

参考文献/References:

[1] VON Luxburg U. A tutorial on spectral clustering[J]. Statistics and computing, 2007, 17(4):395-416.
[2] SHI Jianbo, MALIK J. Normalized cuts and image segmentation[J]. IEEE transactions on pattern analysis and machine intelligence, 2000, 22(8):888-905.
[3] HU Han, FENG Jianjiang, YU Chuan, et al. Multi-class constrained normalized cut with hard, soft, unary and pairwise priors and its applications to object segmentation[J]. IEEE transactions on image processing, 2013, 22(11):4328-4340.
[4] 刘建伟, 刘媛, 罗雄麟. 半监督学习方法[J]. 计算机学报, 2015, 38(8):1592-1617 LIU Jianwei, LIU Yuan, LUO Xionglin. semi-supervised learning methods[J]. Chinese journal of computers, 2015, 38(8):1592-1617
[5] ALUSH A, FRIEDMAN A, Goldberger J. Pairwise clustering based on the mutual-information criterion[J]. Neurocomputing, 2016, 182:284-293.
[6] FORESTIER G, WEMMERT C. Semi-supervised learning using multiple clusterings with limited labeled data[J]. Information sciences, 2016, 361-362:48-65.
[7] KAMVAR S D, KLEIN D, MANNING C D. Spectral learning[C]//Proceedings of the 18th International Joint Conference on Artificial Intelligence. Acapulco, Mexico, 2003:561-566.
[8] 蒋伟进, 许宇晖, 王欣. 基于谱图和成对约束的主动半监督聚类算法[J]. 控制与决策, 2013, 28(6):904-908 JIANG Weijin, XU Yuhui, WANG Xin. Active semi-supervised clustering algorithm based-on pair-wise constraints[J]. Control and decision, 2013, 28(6):904-908
[9] DING Shifei, JIA Hongjie, ZHANG Liwen, et al. Research of semi-supervised spectral clustering algorithm based on pairwise constraints[J]. Neural computing and applications, 2014, 24(1):211-219.
[10] CUCURINGU M, KOUTIS I, CHAWLA S, et al. Simple and scalable constrained clustering:a generalized spectral method[C]//Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. Cadiz, Spain, 2016:445-454.
[11] WANG Xiang, QIAN Buyue, DAVIDSON I. On constrained spectral clustering and its applications[J]. Data mining and knowledge discovery, 2014, 28(1):1-30.
[12] YU Zhiwen, LUO Peinan, YOU J, et al. Incremental semi-supervised clustering ensemble for high dimensional data clustering[J]. IEEE transactions on knowledge and data engineering, 2016, 28(3):701-714.
[13] LU Zhiwu, PENG Yuxin. Exhaustive and efficient constraint propagation:a graph-based learning approach and its applications[J]. International journal of computer vision, 2013, 103(3):306-325.
[14] CAI Deng, CHEN Xinlei. Large scale spectral clustering via landmark-based sparse representation[J]. IEEE transactions on cybernetics, 2015, 45(8):1669-1680.
[15] SEMERTZIDIS T, RAFAILIDIS D, STRINTZIS M G, et al. Large-scale spectral clustering based on pairwise constraints[J]. Information processing and management, 2015, 51(5):616-624.

相似文献/References:

[1]张继福,张素兰,胡立华.约束概念格及其构造方法[J].智能系统学报,2006,1(02):31.
 ZHANG Ji-fu,ZHANG Su-lan,HU Li-hua.Constrained concept lattice and its construction method[J].CAAI Transactions on Intelligent Systems,2006,1(05):31.
[2]王国胤,张清华,胡 军.粒计算研究综述[J].智能系统学报,2007,2(06):8.
 WANG Guo-yin,ZHANG Qing-hua,HU Jun.An overview of granular computing[J].CAAI Transactions on Intelligent Systems,2007,2(05):8.
[3]何清.物联网与数据挖掘云服务[J].智能系统学报,2012,7(03):189.
 HE Qing.The Internet of things and the data mining cloud service[J].CAAI Transactions on Intelligent Systems,2012,7(05):189.
[4]刘秀梅,赵克勤.基于联系数的不确定空情意图识别[J].智能系统学报,2012,7(05):450.
 LIU Xiumei,ZHAO Keqin.Inference method on intention with uncertain aerial information based on the connection number[J].CAAI Transactions on Intelligent Systems,2012,7(05):450.
[5]尹宁,刘富,张玉.采用最小误差阈值分割算法的基因芯片图像分析[J].智能系统学报,2013,8(01):28.[doi:10.3969/j.issn.1673-4785.201207015]
 YIN Ning,LIU Fu,ZHANG Yu.Image analysis of gene chip using minimum error threshold segmentation algorithm[J].CAAI Transactions on Intelligent Systems,2013,8(05):28.[doi:10.3969/j.issn.1673-4785.201207015]
[6]卿铭,孙晓梅.一种新的聚类有效性函数:模糊划分的模糊熵[J].智能系统学报,2015,10(01):75.[doi:10.3969/j.issn.1673-4785.201410004]
 QING Ming,SUN Xiaomei.A new clustering effectiveness function: fuzzy entropy of fuzzy partition[J].CAAI Transactions on Intelligent Systems,2015,10(05):75.[doi:10.3969/j.issn.1673-4785.201410004]
[7]李海林,郭韧,万校基.基于特征矩阵的多元时间序列最小距离度量方法[J].智能系统学报,2015,10(03):442.[doi:10.3969/j.issn.1673-4785.201405047]
 LI Hailin,GUO Ren,WAN Xiaoji.A minimum distance measurement method for amultivariate time series based on the feature matrix[J].CAAI Transactions on Intelligent Systems,2015,10(05):442.[doi:10.3969/j.issn.1673-4785.201405047]
[8]王德文,孙志伟.一种基于内存计算的电力用户聚类分析方法[J].智能系统学报,2015,10(04):569.[doi:10.3969/j.issn.1673-4785.201411011]
 WANG Dewen,SUN Zhiwei.A method for cluster analysis of electric power consumers based on in-memory computing[J].CAAI Transactions on Intelligent Systems,2015,10(05):569.[doi:10.3969/j.issn.1673-4785.201411011]
[9]申彦,朱玉全.CMP上基于数据集划分的K-means多核优化算法[J].智能系统学报,2015,10(04):607.[doi:10.3969/j.issn.1673-4785.201411036]
 SHEN Yan,ZHU Yuquan.An optimized algorithm of K-means based on data set partition on CMP systems[J].CAAI Transactions on Intelligent Systems,2015,10(05):607.[doi:10.3969/j.issn.1673-4785.201411036]
[10]张永库,尹灵雪,孙劲光.基于改进的遗传算法的模糊聚类算法[J].智能系统学报,2015,10(04):627.[doi:10.3969/j.issn.1673-4785.201503033]
 ZHANG Yongku,YIN Lingxue,SUN Jinguang.Fuzzy clustering algorithm based on the improved genetic algorithm[J].CAAI Transactions on Intelligent Systems,2015,10(05):627.[doi:10.3969/j.issn.1673-4785.201503033]

备注/Memo

备注/Memo:
收稿日期:2017-03-10。
基金项目:国家自然科学基金项目(61373126).
作者简介:赵晓晓,女,1993年生,硕士研究生,主要研究方向为数据挖掘;周治平,男,1962年生,教授,博士,主要研究方向为智能检测、自动化装置、网络安全。发表学术论文80余篇。
通讯作者:赵晓晓.E-mail:6151905019@vip.jiangnan.edu.cn.
更新日期/Last Update: 2018-10-25