[1] CHEN Tao, XIE Zaipeng, QU Zhihao. Federated semi-supervised learning model based on dynamic threshold enhanced prototype network[J]. CAAI Transactions on Intelligent Systems, 2024, 19(3): 534-545. [doi:10.11992/tis.202311015]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
19
Issue:
No. 3, 2024
Pages:
534-545
Column:
Academic Papers: Machine Learning
Publication date:
2024-05-05
- Title:
-
Federated semi-supervised learning model based on dynamic threshold enhanced prototype network
- Author(s):
-
CHEN Tao (陈涛), XIE Zaipeng (谢在鹏), QU Zhihao (屈志昊)
-
College of Computer and Information, Hohai University, Nanjing, Jiangsu 211100, China
- Keywords:
-
federated learning; semi-supervised learning; knowledge sharing; prototypical network; pseudo label; dynamic threshold; unlabeled data; heterogeneous data
- Classification code:
-
TP181
- DOI:
-
10.11992/tis.202311015
- Document code:
-
2024-04-30
- Abstract:
-
Federated semi-supervised learning (FSSL) faces the challenge of making effective use of the large amount of unlabeled data available during training. Sharing knowledge between clients through a lightweight prototype network can alleviate the problem of low-quality pseudo-labels, but bottlenecks remain. This paper proposes a federated semi-supervised learning algorithm based on a dynamic-threshold-enhanced prototype network. It introduces curriculum pseudo-labeling, whose core idea is to adjust the threshold dynamically according to the learning status of each class of samples, so that the model learns from high-quality samples and its prediction performance improves significantly. Experimental results show that the proposed algorithm achieves excellent test performance on multiple datasets. On CIFAR-10, it improves test accuracy by at least 3% over comparable algorithms, and on SVHN and STL-10 it leads by 1% to 7%. The algorithm performs well on both heterogeneous and homogeneous data and adapts well to different proportions of labeled and unlabeled data. It improves test accuracy without adding communication overhead or computational cost. These results suggest that the algorithm has great potential in federated semi-supervised learning and offers a high-performance, efficient solution for practical applications.
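The abstract does not give the threshold formula, so the following is only a minimal sketch of per-class dynamic thresholding in the style of curriculum pseudo-labeling. It assumes the commonly used formulation in which the learning status of a class is the number of unlabeled samples confidently predicted as that class, normalized by the best-learned class and passed through the mapping M(b) = b / (2 - b). The function names, the base threshold of 0.95, and the fallback used when no class has confident predictions are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def class_thresholds(probs: Sequence[Sequence[float]],
                     num_classes: int,
                     base_tau: float = 0.95) -> List[float]:
    """Per-class thresholds derived from an estimated learning status.

    Assumed formulation (not taken from the paper):
      status[c] = number of samples predicted as c with confidence > base_tau
      beta[c]   = status[c] / max(status)
      tau[c]    = beta[c] / (2 - beta[c]) * base_tau
    """
    status = [0] * num_classes
    for p in probs:
        conf = max(p)
        label = list(p).index(conf)
        if conf > base_tau:
            status[label] += 1

    top = max(status)
    if top == 0:
        # Assumption: with no confident prediction at all, keep the fixed threshold.
        return [base_tau] * num_classes

    thresholds = []
    for count in status:
        beta = count / top
        thresholds.append(beta / (2.0 - beta) * base_tau)
    return thresholds


def select_pseudo_labels(probs: Sequence[Sequence[float]],
                         thresholds: Sequence[float]) -> List[Optional[int]]:
    """Return the predicted class for accepted samples and None for rejected ones."""
    selected: List[Optional[int]] = []
    for p in probs:
        conf = max(p)
        label = list(p).index(conf)
        selected.append(label if conf > thresholds[label] else None)
    return selected
```

Under these assumptions, a well-learned class keeps a threshold close to the base value, while a class with few confident predictions gets a lower threshold, so more of its samples enter training as pseudo-labels.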
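Memo
Received: 2023-11-13.
Funding: "Belt and Road" Water Resources and Sustainable Development Science and Technology Fund of the National Key Laboratory of Water Disaster Prevention (2021490811); National Natural Science Foundation of China, Young Scientists Fund (62102131); Natural Science Foundation of Jiangsu Province, Youth Program (BK20210361).
About the authors: CHEN Tao, master's student; main research interests: distributed machine learning and federated learning. E-mail: 1033296297@qq.com; XIE Zaipeng, associate professor, Ph.D.; main research interests: distributed machine learning, and the theory and applications of sustainable computing. Holds 15 granted invention patents and has published more than 30 academic papers. E-mail: zaipengxie@hhu.edu.cn; QU Zhihao, associate professor, Ph.D.; main research interests: edge computing, edge intelligence, and federated learning. Has led 5 projects, including a National Natural Science Foundation of China Young Scientists Fund project and a Jiangsu Province youth fund project, and has published more than 20 academic papers. E-mail: quzhihao@hhu.edu.cn
Corresponding author: XIE Zaipeng. E-mail: zaipengxie@hhu.edu.cn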