[1] GAO Xiaofang, JIA Zonghan, LIANG Jiye. Contrastive clustering algorithm based on trend consistency learning[J]. CAAI Transactions on Intelligent Systems, 2026, 21(2): 389-398. [doi:10.11992/tis.202506027]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
21
Issue:
2026, No. 2
Pages:
389-398
Section:
Academic Papers: Machine Learning
Publication date:
2026-03-05
- Title:
-
Contrastive clustering algorithm based on trend consistency learning
- Author(s):
-
GAO Xiaofang (高小方)1, JIA Zonghan (贾宗翰)1, LIANG Jiye (梁吉业)1,2
-
1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China;
2. Laboratory of Computational Intelligence and Chinese Processing, Ministry of Education, Taiyuan 030006, China
-
- Keywords:
-
contrastive clustering; contrastive learning; false negatives; trend consistency; pseudo labels; semantic information; inter-class distinguishability; mask matrix
- Classification number:
-
TP18
- DOI:
-
10.11992/tis.202506027
- Abstract:
-
In recent years, contrastive clustering has become a research hotspot in data mining and machine learning. It aims to improve clustering performance by exploiting the strong feature representation ability of contrastive learning. However, contrastive learning often introduces false negatives: samples from the same category that are treated as negative pairs, which creates category conflicts and degrades clustering performance. To address this problem, this paper proposes a contrastive clustering algorithm based on a trend consistency constraint strategy (CCTC). The algorithm marks high-confidence sample pairs with consistent category information in a trend consistency array and uses this semantic information to compute a trend constraint matrix that assists in selecting positive samples. It also combines instance-level and cluster-level consistency loss functions so that cluster-level and instance-level sample information interact dynamically, which enhances sample consistency and inter-class distinguishability. Compared with other contrastive clustering algorithms, the method exploits the trend of pseudo-label changes over multiple training rounds to obtain high-confidence sample pairs with consistent category trends, thereby improving the clustering performance of the model. Experiments demonstrate the effectiveness of the algorithm.
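As a rough illustration of the idea of using pseudo-label trends to pick positive pairs, the following minimal Python sketch builds a boolean pair mask from the pseudo labels recorded over several training rounds. The abstract does not say how the trend consistency array or the trend constraint matrix is actually computed, so everything here is an assumption: the names `trend_constraint_mask`, `split_pairs`, and `min_rounds` are hypothetical, and the rule "two samples are consistent if they share a pseudo label in each of the last `min_rounds` rounds" is only one plausible reading. It is not the authors' implementation.

```python
from typing import List, Tuple


def trend_constraint_mask(label_history: List[List[int]], min_rounds: int) -> List[List[bool]]:
    """Hypothetical helper, not taken from the paper.

    label_history[t][i] is the pseudo label of sample i in training round t.
    mask[i][j] is True when samples i and j received the same pseudo label
    in each of the last `min_rounds` rounds (an assumed consistency rule).
    """
    if not label_history:
        raise ValueError("label_history must contain at least one round")
    if min_rounds < 1:
        raise ValueError("min_rounds must be at least 1")
    n = len(label_history[0])
    # With too little history, trust only the trivial pair (a sample with itself).
    if len(label_history) < min_rounds:
        return [[i == j for j in range(n)] for i in range(n)]
    recent = label_history[-min_rounds:]
    # Labels are compared within a round, so a relabeling of clusters
    # between rounds does not affect the result.
    return [[all(r[i] == r[j] for r in recent) for j in range(n)] for i in range(n)]


def split_pairs(mask: List[List[bool]], anchor: int) -> Tuple[List[int], List[int]]:
    """Split the other samples into assumed positives and remaining negatives."""
    positives = [j for j, same in enumerate(mask[anchor]) if same and j != anchor]
    negatives = [j for j, same in enumerate(mask[anchor]) if not same]
    return positives, negatives


# Toy pseudo labels for four samples over three rounds (made-up values).
history = [
    [0, 0, 1, 1],
    [1, 1, 0, 1],
    [2, 2, 0, 0],
]
mask = trend_constraint_mask(history, min_rounds=3)
positives, negatives = split_pairs(mask, anchor=0)
```

In this toy example, samples 0 and 1 stay together in every round and are treated as a positive pair, while samples 2 and 3 are separated in the second round and are therefore not. In a contrastive loss, such a mask could be used to keep likely same-class samples out of the negative set; how the paper weights or combines these terms is not stated in the abstract.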
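Memo
Received: 2025-06-24.
Funding: Fundamental Research Program of Shanxi Province (202203021221001).
About the authors: GAO Xiaofang, associate professor, Ph.D., member of the China Computer Federation. Main research interests: data mining and machine learning. Has led and completed one National Natural Science Foundation of China project, one National Social Science Fund of China project, two Shanxi Provincial Natural Science Foundation projects, and one Shanxi Provincial overseas study fund project; has participated in seven National Natural Science Foundation of China and provincial or ministerial research projects; has published more than 10 academic papers. E-mail: gxfhtp@sxu.edu.cn. JIA Zonghan, master's student. Main research interest: deep clustering. E-mail: 582069778@qq.com. LIANG Jiye, professor, doctoral supervisor, Ph.D., Fellow of the Institute of Electrical and Electronics Engineers, Fellow of the China Computer Federation, Fellow of the Chinese Association for Artificial Intelligence. Main research interests: data mining and machine learning, big data analysis technology, and artificial intelligence. Has led one major national project and more than 10 national projects; has published more than 400 academic papers. E-mail: ljy@sxu.edu.cn.
Corresponding author: GAO Xiaofang. E-mail: gxfhtp@sxu.edu.cn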
Last Update:
1900-01-01