[1] WANG Yibin, LI Tianli, CHENG Yusheng. Label distribution learning based on spectral clustering[J]. CAAI Transactions on Intelligent Systems, 2019, 14(5): 966-973. [doi:10.11992/tis.201809019]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
14
Issue:
No. 5, 2019
Pages:
966-973
Section:
Research Papers: Machine Learning
Publication date:
2019-09-05
- Title:
-
Label distribution learning based on spectral clustering
- Author(s):
-
WANG Yibin (1,2), LI Tianli (1), CHENG Yusheng (1,2)
-
1. School of Computer and Information, Anqing Normal University, Anqing 246011, China;
2. Key Laboratory of Intelligent Perception and Computing of Anhui Province, Anqing 246011, China
-
- Keywords:
-
spectral clustering; label distribution learning; similarity matrix; Laplace transform; K-means; parametric model; label distribution; machine learning
- CLC number:
-
TP181
- DOI:
-
10.11992/tis.201809019
- Abstract:
-
Label distribution learning is a new learning paradigm. Most existing algorithms build parametric models directly from conditional probabilities without fully considering the correlations between samples, which increases computational complexity. To address this, a spectral clustering algorithm is introduced that uses the similarity relations between samples to turn the clustering problem into a globally optimal graph partitioning problem, and a label distribution learning algorithm with spectral clustering (SC-LDL) is proposed. First, the similarity matrix of the samples is calculated. Then, a Laplacian transformation is applied to the matrix to construct the eigenvector space. Finally, the data are clustered with the K-means algorithm to establish the parametric model, which is used to predict the label distributions of unknown samples. Experiments on multiple data sets show that SC-LDL outperforms several existing comparison algorithms, and statistical hypothesis tests further support its effectiveness and superiority.
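The three steps named in the abstract (similarity matrix, Laplacian eigenvector space, K-means) can be illustrated with a minimal NumPy sketch. This is not the authors' SC-LDL implementation: the Gaussian kernel, the symmetric normalized Laplacian, the farthest-point K-means initialization, and every function name below are assumptions chosen for illustration. The final step, which averages the label distributions within each cluster, is only a hypothetical stand-in for the parametric model that the paper fits.

```python
import numpy as np


def similarity_matrix(X, sigma=1.0):
    """Gaussian (RBF) similarity between all pairs of rows of X (assumed kernel)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops in the similarity graph
    return W


def spectral_embedding(W, k):
    """Row-normalized eigenvectors for the k smallest eigenvalues of
    the symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues come back in ascending order
    U = vecs[:, :k]
    return U / np.linalg.norm(U, axis=1, keepdims=True)


def kmeans(Z, k, n_iter=50):
    """Lloyd's K-means with a deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def cluster_label_distributions(D, labels, k):
    """Hypothetical stand-in for the parametric model: the mean label
    distribution of the training samples in each cluster."""
    return np.array([D[labels == j].mean(axis=0) for j in range(k)])


# Toy data: two well-separated groups of samples, each row of D sums to 1.
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 5.1], [5.1, 4.8]])
D = np.array([[0.7, 0.3], [0.8, 0.2], [0.6, 0.4],
              [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]])

k = 2
W = similarity_matrix(X)
Z = spectral_embedding(W, k)
labels = kmeans(Z, k)
P = cluster_label_distributions(D, labels, k)
print(labels)
print(P)
```

Because each per-cluster estimate is an average of distributions, every row of `P` is itself a valid label distribution. A real pipeline would also need a rule for assigning unseen samples to clusters, which the abstract does not specify.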
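Memo
Received: 2018-09-13.
Funding: Key Scientific Research Project of Universities in Anhui Province (KJ2017A352).
About the authors: WANG Yibin, male, born in 1970, professor. His main research interests are multi-label learning, machine learning, and software security. He has published more than 40 academic papers. LI Tianli, male, born in 1996, master's student. His main research interest is label distribution learning. CHENG Yusheng, male, born in 1969, professor, Ph.D. His main research interests are data mining and rough sets. He has published more than 90 academic papers.
Corresponding author: CHENG Yusheng. E-mail: chengyshaq@163.com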