[1] DENG Tingquan, WANG Qiang. Semi-supervised class preserving locally linear embedding[J]. CAAI Transactions on Intelligent Systems, 2021, 16(1): 98-107. [doi:10.11992/tis.202003007]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 16
Issue: No. 1, 2021
Pages: 98-107
Column: Academic Papers: Knowledge Engineering
Publication date: 2021-01-05
- Title: Semi-supervised class preserving locally linear embedding
- Author(s): DENG Tingquan (邓廷权), WANG Qiang (王强)
- Affiliation: College of Mathematical Sciences, Harbin Engineering University, Harbin 150001, Heilongjiang, China
- Keywords: nonlinear feature extraction; manifold learning; semi-supervised; labeled information; clustering; visualization
- Classification number: TP181
- DOI: 10.11992/tis.202003007
- Abstract: Locally linear embedding (LLE) is an unsupervised nonlinear feature extraction method for high-dimensional data. To make the features it extracts better suited to classification and clustering, we propose semi-supervised class preserving locally linear embedding (SSCLLE), a nonlinear feature extraction method that integrates semi-supervised information into LLE. First, pseudo-labels are assigned to the near neighbors of the labeled samples, which increases the number of labeled samples. Second, the distances between labeled samples are adjusted locally, shrinking the distances between same-class samples and enlarging the distances between different-class samples. In addition, constraint terms on the global same-class and different-class sample distances are added to the LLE objective function, so that in the extracted low-dimensional features same-class points lie close together and different-class points are separated. In a series of experiments, the clustering accuracy and visualization quality of the proposed method are clearly higher than those of unsupervised LLE and existing semi-supervised manifold feature extraction methods, indicating that the extracted features preserve class structure well.
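The abstract names the first two preprocessing steps but does not give their formulas. The following is a minimal sketch of what they could look like, not the authors' implementation. It assumes that each labeled sample passes its label to its `k` nearest unlabeled neighbors, and that distances between labeled pairs are rescaled by fixed multiplicative factors. The function names and the parameters `k`, `shrink` and `stretch` are hypothetical choices made for illustration. The third step, the constraint terms added to the LLE objective, is not sketched here.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def propagate_pseudo_labels(dist, labels, k):
    """Give the unlabeled samples among each labeled sample's k nearest
    neighbors that sample's label.

    labels[i] is a class id, or None for an unlabeled sample.
    Assumption: if two labeled samples claim the same neighbor,
    the closer one wins.
    """
    n = len(dist)
    best = {}  # neighbor index -> (distance, label)
    for i in range(n):
        if labels[i] is None:
            continue
        neighbors = sorted(
            (j for j in range(n) if j != i), key=lambda j: dist[i][j]
        )[:k]
        for j in neighbors:
            if labels[j] is None and (j not in best or dist[i][j] < best[j][0]):
                best[j] = (dist[i][j], labels[i])
    out = list(labels)
    for j, (_, label) in best.items():
        out[j] = label
    return out


def adjust_distances(dist, labels, shrink=0.5, stretch=2.0):
    """Rescale distances between labeled pairs.

    Same-class pairs are multiplied by `shrink` (< 1), different-class
    pairs by `stretch` (> 1). Pairs with an unlabeled sample are unchanged.
    The multiplicative form is an assumption, not the paper's formula.
    """
    n = len(dist)
    adjusted = [row[:] for row in dist]
    for i in range(n):
        for j in range(n):
            if i == j or labels[i] is None or labels[j] is None:
                continue
            factor = shrink if labels[i] == labels[j] else stretch
            adjusted[i][j] = dist[i][j] * factor
    return adjusted


# Toy data: two well-separated groups, one labeled sample in each.
points = [(0, 0), (0.1, 0), (0.2, 0.1), (5, 5), (5.1, 5), (5.2, 5.1)]
labels = [0, None, None, 1, None, None]

dist = pairwise_distances(points)
pseudo = propagate_pseudo_labels(dist, labels, k=1)
adjusted = adjust_distances(dist, pseudo)
print(pseudo)
```

The adjusted matrix would then replace the Euclidean distances in the neighbor search of an LLE implementation, so that same-class samples are more likely to be chosen as neighbors of each other.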
Last Update: 2021-02-25