[1] BI Zhizhen, YANG Degang, FENG Ji. Self-adaptive spectral clustering algorithm for ultra-large-scale data[J]. CAAI Transactions on Intelligent Systems, 2023, 18(2): 251-259. [doi:10.11992/tis.202110038]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 18
Issue: 2023, No. 2
Page number: 251-259
Column: Academic Papers: Machine Learning
Publication date: 2023-05-05
- Title: Self-adaptive spectral clustering algorithm for ultra-large-scale data
- Author(s): BI Zhizhen¹; YANG Degang¹,²; FENG Ji¹,²
  1. College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China
  2. Chongqing Engineering Research Center of Educational Big Data Intelligent Perception and Application, Chongqing Normal University, Chongqing 401331, China
- Keywords: data clustering; ultra-scalable; approximate natural neighbor; spectral clustering; natural neighbor; bipartite graph; adaptive; no parameter
- CLC: TP311
- DOI: 10.11992/tis.202110038
- Abstract:
An approximate natural neighbor-based self-adaptive ultra-scalable spectral clustering algorithm (AN3-SUSC) is proposed to address two problems in clustering ultra-large-scale data: neighborhood parameters that must be set manually, and a very high computational cost. First, the algorithm reduces the data size through mixed random selection. Then, approximate natural neighbors are used to determine the local neighborhood parameters adaptively, and a similarity matrix is constructed. Finally, a bipartite graph is used for transfer and partitioning, mapping the reduced data space back to the original ultra-large-scale data space and thereby completing the spectral clustering. Experimental results on ultra-large-scale data sets show that the algorithm improves clustering quality and reduces the computational scale, while remaining highly robust and adaptable.
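
The three stages described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' AN3-SUSC implementation: the abstract gives no formulas, so every function name, parameter, and default below is an assumption made for illustration. In particular, `natural_k` is a simplified natural-neighbor stopping rule standing in for the paper's approximate natural neighbor search, the Gaussian kernel width is an arbitrary choice, and the bipartite step uses a plain SVD of the normalized affinity matrix as a stand-in for the paper's bipartite-graph partitioning.

```python
import numpy as np

def sq_dists(a, b):
    # Pairwise squared Euclidean distances between rows of a and rows of b.
    d = (a * a).sum(1)[:, None] - 2.0 * a @ b.T + (b * b).sum(1)[None, :]
    return np.maximum(d, 0.0)

def kmeans(x, k, rng, iters=20):
    # Lloyd iterations with farthest-point initialization.
    # Empty clusters simply keep their previous center.
    centers = [x[rng.integers(len(x))]]
    for _ in range(k - 1):
        d = sq_dists(x, np.array(centers)).min(1)
        centers.append(x[d.argmax()])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        labels = sq_dists(x, centers).argmin(1)
        for j in range(k):
            members = x[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    return centers, labels

def natural_k(reps):
    # Assumed, simplified natural-neighbor search: grow r until the number of
    # points that appear in nobody's r-nearest-neighbor list stops changing.
    order = sq_dists(reps, reps).argsort(1)[:, 1:]  # drop the self column
    reverse = np.zeros(len(reps), dtype=int)
    prev = -1
    for r in range(1, order.shape[1] + 1):
        np.add.at(reverse, order[:, r - 1], 1)
        orphans = int((reverse == 0).sum())
        if orphans == prev or orphans == 0:
            return r
        prev = orphans
    return order.shape[1]

def bipartite_affinity(x, reps, k):
    # Gaussian affinity from each point to its k nearest representatives.
    # Dense here for brevity; a large-scale version would be sparse and chunked.
    d = sq_dists(x, reps)
    idx = np.argpartition(d, k - 1, axis=1)[:, :k]
    dk = np.take_along_axis(d, idx, axis=1)
    sigma2 = dk.mean() + 1e-12  # kernel width: an arbitrary illustrative choice
    b = np.zeros_like(d)
    np.put_along_axis(b, idx, np.exp(-dk / (2.0 * sigma2)), axis=1)
    return b

def bipartite_cut(b, n_clusters, rng):
    # Spectral embedding from the SVD of the degree-normalized bipartite matrix.
    dx = b.sum(1) + 1e-12
    dr = b.sum(0) + 1e-12
    bn = b / np.sqrt(dx)[:, None] / np.sqrt(dr)[None, :]
    u, _, _ = np.linalg.svd(bn, full_matrices=False)
    emb = u[:, :n_clusters]
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    return kmeans(emb, n_clusters, rng)[1]

def sketch_cluster(x, n_clusters, n_candidates=100, n_reps=20, seed=0):
    rng = np.random.default_rng(seed)
    # Stage 1: random candidates, then k-means to pick representatives.
    cand = x[rng.choice(len(x), size=n_candidates, replace=False)]
    reps, _ = kmeans(cand, n_reps, rng)
    # Stage 2: neighborhood size chosen from the data, not set by hand.
    k = natural_k(reps)
    b = bipartite_affinity(x, reps, k)
    # Stage 3: partition the bipartite graph and label every original point.
    return bipartite_cut(b, n_clusters, rng)
```

Calling `sketch_cluster(x, n_clusters=2)` on an `(n, d)` array returns one integer label per row. The point of the design is that only the `n × n_reps` affinity matrix is ever formed, never a full `n × n` similarity matrix, which is where the reduction in computational scale comes from.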