[1] ZHAO Guanghua, YANG Tao, FU Dongmei. Incremental dimensionality reduction algorithm based on data manifold boundaries and distribution state[J]. CAAI Transactions on Intelligent Systems, 2023, 18(5): 975-983. [doi:10.11992/tis.202205007]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
18
Issue:
2023, No. 5
Pages:
975-983
Section:
Academic Papers: Machine Learning
Publication date:
2023-09-05
- Title:
-
Incremental dimensionality reduction algorithm based on data manifold boundaries and distribution state
- Author(s):
-
ZHAO Guanghua (赵光华)¹, YANG Tao (杨焘)¹,², FU Dongmei (付冬梅)¹,²
-
1. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China;
2. Shunde Innovation School, University of Science and Technology Beijing, Foshan, Guangdong 528300, China
-
- Keywords:
-
incremental learning; manifold dimension reduction; noise; manifold boundary; probability distribution; projection; outlier detection; classification
- CLC number:
-
TP391
- DOI:
-
10.11992/tis.202205007
- Abstract:
-
To reduce the effect of noise on incremental manifold learning and to perform manifold dimensionality reduction on new data under different distribution states, this paper proposes an incremental dimensionality reduction algorithm based on data manifold boundaries and distribution state (IDR-DMBDS). The algorithm first analyzes the probability distribution of the noise while denoising the data. It takes the manifold shape of the denoised data as the main manifold and characterizes the noise distribution on that main manifold to obtain an approximate boundary of the original data manifold. It then uses this boundary to determine the distribution state of new data. Finally, new data lying on the original manifold and new data lying outside it are mapped separately to the low-dimensional space. Experiments show that the algorithm can effectively extract low-dimensional features from incremental, high-dimensional, noisy data on the basis of manifold learning.
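The pipeline outlined in the abstract (denoise, take the denoised shape as the main manifold, derive a boundary from the noise distribution, then classify new data against that boundary) can be illustrated with a deliberately simplified sketch. This is not the IDR-DMBDS algorithm: it assumes a linear manifold fitted by PCA, treats the reconstruction residual as the noise, and uses an assumed mean-plus-k-standard-deviations rule as the boundary. The function names `fit_linear_manifold` and `embed_new` and the parameter `k` are hypothetical.

```python
import numpy as np


def fit_linear_manifold(X, n_components, k=3.0):
    """Fit a PCA subspace as a stand-in 'main manifold' plus a residual boundary.

    Assumption: the manifold is linear and noise is measured by the distance
    of each sample from the fitted subspace. The paper's method is not linear.
    """
    mean = X.mean(axis=0)
    # Principal directions come from the SVD of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    basis = vt[:n_components]                # shape (n_components, n_features)
    low = (X - mean) @ basis.T               # low-dimensional coordinates
    denoised = low @ basis + mean            # projection back onto the subspace
    residual = np.linalg.norm(X - denoised, axis=1)
    # Assumed boundary rule: residuals above mean + k * std count as "outside".
    threshold = residual.mean() + k * residual.std()
    return {"mean": mean, "basis": basis, "threshold": threshold}


def embed_new(model, X_new):
    """Project new samples and flag whether each lies within the boundary."""
    centered = X_new - model["mean"]
    low = centered @ model["basis"].T
    residual = np.linalg.norm(centered - low @ model["basis"], axis=1)
    on_manifold = residual <= model["threshold"]
    return low, on_manifold
```

In this sketch, samples flagged as outside the boundary still receive the plain linear projection. The abstract states that the published algorithm maps such samples separately, which this example does not attempt to reproduce.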