[1] LI Na, XU Sen, XU Xiufang, et al. A three-level weighted approach for text clustering ensemble[J]. CAAI Transactions on Intelligent Systems, 2024, 19(4): 807-816. [doi:10.11992/tis.202303029]
CAAI Transactions on Intelligent Systems[ISSN 1673-4785/CN 23-1538/TP] Volume:
19
Issue:
2024, No. 4
Page number:
807-816
Column:
Academic Papers: Machine Learning
Publication date:
2024-07-05
- Title:
-
A three-level weighted approach for text clustering ensemble
- Author(s):
-
LI Na1,2; XU Sen1; XU Xiufang1; XU Heyang1; GUO Naixuan1,2; LIU Xuanqi1; ZHOU Tian3
-
1. School of Information Engineering, Yancheng Institute of Technology, Yancheng 224051, China;
2. Key Laboratory of Computer Network and Information Integration, Southeast University, Nanjing 211189, China;
3. School of Underwater Acoustic Engineering, Harbin Engineering University, Harbin 150001, China
-
- Keywords:
-
text clustering; clustering ensemble; weighted clustering ensemble; three-level weighting; weighted clustering; multi-level weighting; cluster analysis; unsupervised learning
- CLC:
-
TP181;TP301
- DOI:
-
10.11992/tis.202303029
- Abstract:
-
To improve clustering ensemble performance, this paper designs a unified framework for weighting points, clusters, and partitions, and proposes a three-level weighted approach for text clustering ensemble. First, a hypergraph adjacency matrix is generated from the base clusterings. A weighted adjacency matrix is then obtained by successively weighting the points, clusters, and partitions. Finally, the consensus result is obtained with an agglomerative hierarchical clustering algorithm. Experiments were carried out on multiple real text datasets. The results show that the approach clusters better than both the unweighted baseline and the variants that apply weighting at fewer levels. On average, three-level weighting improves on the unweighted result by 12.02%. Compared with eight other weighted methods proposed in recent years, the algorithm achieves the best average ranking across all datasets, which verifies the effectiveness of the proposed method.
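The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the weighting formulas, so everything below is an assumption: the function names (`hypergraph_matrix`, `weighted_similarity`, `average_linkage`) are hypothetical, the three weight vectors are uniform placeholders, and average linkage stands in for whichever agglomerative criterion the paper actually uses.

```python
import numpy as np

def hypergraph_matrix(base_labels):
    """Stack one binary indicator column per cluster of every base clustering."""
    columns, owner = [], []
    for p, labels in enumerate(base_labels):
        labels = np.asarray(labels)
        for c in np.unique(labels):
            columns.append((labels == c).astype(float))
            owner.append(p)  # remember which partition each cluster came from
    return np.column_stack(columns), np.array(owner)

def weighted_similarity(H, owner, w_point, w_cluster, w_partition):
    """Assumed form: S_ij = sum_c w_i * w_j * w_cluster[c] * w_partition[owner[c]] * H_ic * H_jc."""
    col_w = w_cluster * w_partition[owner]  # cluster-level and partition-level weights
    Hw = H * w_point[:, None]               # point-level weights
    return (Hw * col_w) @ Hw.T

def average_linkage(S, k):
    """Merge the two most similar groups (mean pairwise similarity) until k remain."""
    clusters = [[i] for i in range(S.shape[0])]
    while len(clusters) > k:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = S[np.ix_(clusters[a], clusters[b])].mean()
                if sim > best:
                    best, pair = sim, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)
    labels = np.empty(S.shape[0], dtype=int)
    for idx, members in enumerate(clusters):
        labels[members] = idx
    return labels

# Toy ensemble: three base clusterings of six documents (made-up labels).
base_labels = [[0, 0, 0, 1, 1, 1],
               [0, 0, 1, 1, 2, 2],
               [1, 1, 1, 0, 0, 0]]
H, owner = hypergraph_matrix(base_labels)

# Uniform placeholder weights; the paper's actual weights would replace these.
w_point = np.ones(H.shape[0])
w_cluster = np.ones(H.shape[1])
w_partition = np.ones(len(base_labels))

S = weighted_similarity(H, owner, w_point, w_cluster, w_partition)
consensus = average_linkage(S, k=2)
```

With uniform weights, `S` reduces to the plain co-association count, which corresponds to the unweighted baseline mentioned in the abstract. Non-uniform weights at any of the three levels change `S` before the agglomerative step without altering the rest of the pipeline.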