[1] WANG Yingbo, GUO Kaixue. Incomplete multiview clustering based on view mapping and cyclic consistency generation[J]. CAAI Transactions on Intelligent Systems, 2025, 20(2): 316-328. [doi:10.11992/tis.202311044]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
20
Issue:
2025, No. 2
Pages:
316-328
Section:
Academic Papers: Machine Learning
Publication date:
2025-03-05
- Title:
-
Incomplete multiview clustering based on view mapping and cyclic consistency generation
- Author(s):
-
WANG Yingbo (王英博), GUO Kaixue (郭凯雪)
-
Software College, Liaoning Technical University, Huludao 125105, Liaoning, China
- Keywords:
-
data mining; clustering; multiview learning; incomplete multiview clustering; deep learning; autoencoder; generative adversarial networks; KL divergence
- Classification number (CLC):
-
TP391
- DOI:
-
10.11992/tis.202311044
- Abstract:
-
Traditional clustering assumes that every view is complete and does not account for incomplete views caused by data corruption or device failure. Most existing methods for this problem are based on kernels or nonnegative matrix factorization; they do not explicitly compensate for the data missing from each view, and the latent representations they learn do not take the clustering task into account. To address these limitations, an incomplete multiview clustering method based on view mapping and cyclic consistency generation (MG_IMC) is designed. The method uses the available data to obtain a style code for each view and a shared latent representation, and generates the missing data of each view through a single generative adversarial network. Weighted adaptive fusion is then applied to the completed dataset to capture a better common structure, and clustering is performed in a deep embedded clustering layer. The model is trained jointly with a Kullback-Leibler (KL) divergence loss: the learned common representation helps generate the missing data, and the completed data in turn yield a clustering-friendly common representation. Experimental results show that the algorithm achieves better clustering performance than existing methods.
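As an illustration of the KL-divergence clustering objective mentioned in the abstract, the following is a minimal sketch of the standard deep embedded clustering (DEC) formulation: soft assignments from a Student's t kernel, a sharpened target distribution, and the KL divergence between the two. It is an assumption that MG_IMC follows this general form; the function names, the `alpha` parameter, and the random toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Soft cluster assignments q_ij from a Student's t kernel (standard DEC form)."""
    # Squared Euclidean distance between every embedding and every cluster center.
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    # Normalize over clusters so each row is a probability distribution.
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened target p_ij = (q_ij^2 / f_j) / sum_j' (q_ij'^2 / f_j'), with f_j = sum_i q_ij."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_clustering_loss(p, q, eps=1e-12):
    """KL(P || Q) summed over samples and clusters; eps guards against log(0)."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy data (hypothetical): 6 fused embeddings in 2-D and 3 cluster centers.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 2))
centers = rng.normal(size=(3, 2))

q = soft_assign(z, centers)
p = target_distribution(q)
loss = kl_clustering_loss(p, q)
```

In a full model, `z` would be the fused common representation and the loss would be minimized jointly with the generation objectives, so that the embeddings and cluster centers are updated together.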
Last Update:
2025-03-05