[1] ZHAO Xiaoxiao (赵晓晓), ZHOU Zhiping (周治平). A semi-supervised spectral clustering algorithm combined with sparse representation and constraint propagation[J]. CAAI Transactions on Intelligent Systems, 2018, 13(5): 855-863. [doi:10.11992/tis.201703013]
CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785/CN 23-1538/TP] Volume:
13
Issue:
No. 5, 2018
Pages:
855-863
Section:
Academic Papers: Machine Learning
Publication date:
2018-09-05
- Title:
-
A semi-supervised spectral clustering algorithm combined with sparse representation and constraint propagation
- Author(s):
-
ZHAO Xiaoxiao (赵晓晓), ZHOU Zhiping (周治平)
-
Engineering Research Center of Internet of Things Technology Applications, Ministry of Education, Jiangnan University, Wuxi 214122, Jiangsu, China
- Keywords:
-
data mining; cluster analysis; spectral clustering; semi-supervised learning; sparse representation; constraint propagation
- Classification number:
-
TP18
- DOI:
-
10.11992/tis.201703013
- Abstract:
-
Semi-supervised spectral clustering algorithms do not handle large-scale datasets effectively, and because they do not consider constraint propagation, they cannot make full use of limited constraint information. To address these problems, this paper proposes a semi-supervised spectral clustering algorithm that combines sparse representation and constraint propagation. First, a constraint matrix is generated from the constraint information and introduced into spectral clustering. Then, the data points in the constraint sets are taken as landmarks to construct a sparse representation matrix that approximates the graph similarity matrix, thereby improving the constrained spectral clustering model. Meanwhile, connected regions are generated from the similarity matrix of the landmark points, and the neighboring points are dynamically adjusted within each connected region; constraint propagation is then used to further improve clustering accuracy. Experimental results show that, compared with constrained spectral clustering, the proposed algorithm is clearly more efficient with no significant loss of accuracy, and that, compared with fast spectral clustering methods, it achieves higher clustering accuracy.
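The first two steps named in the abstract (building a constraint matrix and approximating the graph similarity matrix through landmarks drawn from the constrained points) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the +1/-1 encoding of must-link and cannot-link pairs, the Gaussian kernel, the `r`-nearest-landmark sparsification, and the low-rank form `Z @ Z.T` are all assumptions borrowed from common constrained and landmark-based spectral clustering practice. The function names `constraint_matrix` and `landmark_representation` are hypothetical.

```python
import numpy as np

def constraint_matrix(n, must_link, cannot_link):
    """Assumed encoding: +1 for must-link pairs, -1 for cannot-link pairs, 0 elsewhere."""
    Q = np.zeros((n, n))
    for i, j in must_link:
        Q[i, j] = Q[j, i] = 1.0
    for i, j in cannot_link:
        Q[i, j] = Q[j, i] = -1.0
    return Q

def landmark_representation(X, landmarks, r=2, sigma=1.0):
    """Assumed sparse coding: each row of Z keeps Gaussian weights to its r
    nearest landmarks and is normalised to sum to 1."""
    # n x p matrix of squared Euclidean distances between points and landmarks
    d2 = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    Z = np.zeros_like(K)
    nearest = np.argsort(d2, axis=1)[:, :r]      # indices of the r closest landmarks
    rows = np.arange(X.shape[0])[:, None]
    Z[rows, nearest] = K[rows, nearest]          # keep only those r weights per row
    Z /= Z.sum(axis=1, keepdims=True)
    return Z

# Toy data: two well-separated pairs of points (illustrative values only).
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
constrained_idx = [0, 2]                         # points that appear in constraints
Q = constraint_matrix(len(X), must_link=[], cannot_link=[(0, 2)])
Z = landmark_representation(X, X[constrained_idx], r=1)
W_approx = Z @ Z.T                               # low-rank stand-in for the n x n similarity matrix
```

In this sketch only the n x p matrix `Z` has to be built from the data, where p is the number of landmarks. That is the usual reason landmark methods scale better than computing every pairwise similarity, and it matches the efficiency gain the abstract reports, although the paper's exact construction may differ.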
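Memo
Received: 2017-03-10.
Funding: National Natural Science Foundation of China (61373126).
About the authors: ZHAO Xiaoxiao, female, born in 1993, master's student; her main research interest is data mining. ZHOU Zhiping, male, born in 1962, professor, Ph.D.; his main research interests are intelligent detection, automation devices, and network security. He has published more than 80 academic papers.
Corresponding author: ZHAO Xiaoxiao. E-mail: 6151905019@vip.jiangnan.edu.cn.
Last Update: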
2018-10-25