[1] LYU Li, CHEN Wei, XIAO Renbin, et al. Density peak clustering algorithm based on weighted reverse nearest neighbor for uneven density datasets[J]. CAAI Transactions on Intelligent Systems, 2024, 19(1): 165-175. [doi:10.11992/tis.202212015]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 19
Issue: 2024, No. 1
Pages: 165-175
Column: Academic Papers: Fundamentals of Artificial Intelligence
Publication date: 2024-01-05
- Title:
Density peak clustering algorithm based on weighted reverse nearest neighbor for uneven density datasets
- Author(s):
LYU Li 1,2; CHEN Wei 1,2; XIAO Renbin 3; HAN Longzhe 1,2; TAN Dekun 1,2
1. School of Information Engineering, Nanchang Institute of Technology, Nanchang 330099, China;
2. Nanchang Key Laboratory of IoT Perception and Collaborative Computing for Smart City, Nanchang Institute of Technology, Nanchang 330099, China;
3. School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
- Keywords:
density peak clustering; uneven density distribution; reverse nearest neighbor; shared reverse nearest neighbor; sample similarity; local density; distribution strategy; data mining
- CLC:
TP301
- DOI:
10.11992/tis.202212015
- Abstract:
For data with uneven density distribution, the density peak clustering (DPC) algorithm ignores differences in sparsity between the samples of different clusters, which leads to inaccurate selection of cluster centers. Moreover, its allocation strategy tends to assign samples in sparse areas to dense areas by mistake, which degrades the clustering result. To address these problems, this paper proposes a density peak clustering algorithm based on the weighted reverse nearest neighbor (DPC-WR) for datasets with uneven density distribution. First, a weight coefficient based on the sigmoid function is introduced into the local density formula to increase the weight of samples in sparse areas. Combined with the concept of the reverse nearest neighbor, the local density of samples is then redefined to improve the recognition rate of cluster centers. Second, an improved sample similarity strategy is introduced that uses reverse nearest neighbors and shared reverse nearest neighbor information to increase the similarity between samples in the same cluster, which addresses the problem of sample allocation errors in sparse areas. Experiments on datasets with uneven density distribution, datasets with complex morphology, and UCI datasets show that the DPC-WR algorithm achieves better clustering results than the IDPC-FA, FNDPC, FKNN-DPC, DPC, and DPCSA algorithms.
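
The two neighbor concepts named in the abstract can be illustrated with a minimal sketch. It assumes the usual definition of a reverse nearest neighbor (sample j is a reverse nearest neighbor of sample i when i is among the k nearest neighbors of j) and counts the reverse nearest neighbors that two samples share. The function names, the parameter `k`, and the use of Euclidean distance are assumptions made for illustration. The abstract does not give the sigmoid weight coefficient, the redefined local density, or the similarity formula of DPC-WR, so none of them is reproduced here.

```python
import numpy as np


def knn_indices(X, k):
    """Return, for each sample, the indices of its k nearest neighbors (self excluded)."""
    # Pairwise Euclidean distances (an assumption; the abstract names no metric).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbor
    return np.argsort(d, axis=1)[:, :k]


def reverse_nearest_neighbors(X, k):
    """rnn[i] is the set of samples j that list i among their k nearest neighbors."""
    knn = knn_indices(X, k)
    rnn = [set() for _ in range(len(X))]
    for j, neighbors in enumerate(knn):
        for i in neighbors:
            rnn[int(i)].add(j)
    return rnn


def shared_rnn_count(rnn, i, j):
    """Number of reverse nearest neighbors that samples i and j have in common."""
    return len(rnn[i] & rnn[j])


# Two well-separated groups of three points each (toy data, not from the paper).
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
rnn = reverse_nearest_neighbors(X, k=2)
print(shared_rnn_count(rnn, 0, 1))  # samples in the same group
print(shared_rnn_count(rnn, 0, 3))  # samples in different groups
```

In this toy example, samples from the same group share a reverse nearest neighbor while samples from different groups share none. This is the kind of signal a similarity measure based on shared reverse nearest neighbors can exploit.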