[1] ZHAI Yongjie, ZHANG Zhibai, WANG Yaru. An image recognition method of zero-shot learning based on an improved TransGAN[J]. CAAI Transactions on Intelligent Systems, 2023, 18(2): 352-359. [doi:10.11992/tis.202111002]
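CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 18
Issue: No. 2, 2023
Pages: 352-359
Section: Academic Papers: Machine Perception and Pattern Recognition
Publication date: 2023-05-05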
- Title:
-
An image recognition method of zero-shot learning based on an improved TransGAN
- Author(s):
-
ZHAI Yongjie, ZHANG Zhibai, WANG Yaru
-
Department of Automation, North China Electric Power University, Baoding 071003, Hebei, China
- Keywords:
-
zero-shot learning; generative adversarial network; TransGAN; deep learning; image recognition; image feature; convolutional layer; nonlinear activation function
- CLC number:
-
TP18
- DOI:
-
10.11992/tis.202111002
- Abstract:
-
Zero-shot learning algorithms aim to solve image recognition when samples of a class are extremely scarce or missing altogether. Generative models address this by generating images of the missing classes, which turns the task into conventional supervised image recognition. However, the quality of the generated images is unstable and prone to mode collapse, which degrades recognition accuracy. To address this issue, we propose a zero-shot image recognition method based on an improved TransGAN. A convolutional layer is connected to the TransGAN generator to reduce dimensionality and further extract image features, so that the generated image features are closer to the real image features and the features are more stable. In addition, a nonlinear activation function is added to the discriminator and its structure is simplified, so that the discriminator guides the generator better while the computational cost is reduced. Experimental results on public datasets show that the proposed method improves image recognition accuracy by 29.02% over the baseline model and generalizes well.
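The idea stated in the abstract, that generating samples for the missing classes turns zero-shot recognition into ordinary supervised recognition, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the "generator" is a stand-in that adds Gaussian noise to a class attribute vector (it is not TransGAN and not the authors' improved model), the classifier is a simple nearest-centroid rule, and the class names, attribute vectors, and function names are hypothetical.

```python
import random


def generate_features(attributes, n_per_class, noise=0.1, seed=0):
    """Stand-in generator: class attribute vector plus Gaussian noise.

    A trained GAN generator would take this role; the toy version only keeps
    the interface (class description in, synthetic feature vectors out).
    """
    rng = random.Random(seed)
    samples = []
    for label, attr in attributes.items():
        for _ in range(n_per_class):
            feature = [a + rng.gauss(0.0, noise) for a in attr]
            samples.append((feature, label))
    return samples


def fit_centroids(samples):
    """Supervised step: average the synthetic feature vectors of each class."""
    sums, counts = {}, {}
    for feature, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(feature)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], feature)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, feature):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((x - y) ** 2 for x, y in zip(feature, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical attribute vectors for two classes with no real training images.
unseen_attributes = {"zebra": [1.0, 0.0, 1.0], "whale": [0.0, 1.0, 0.0]}

# 1) Generate synthetic features for the unseen classes.
synthetic = generate_features(unseen_attributes, n_per_class=50)
# 2) Train an ordinary supervised classifier on the synthetic features.
centroids = fit_centroids(synthetic)
# 3) Classify a made-up test feature that lies near the "zebra" description.
prediction = predict(centroids, [0.9, 0.1, 0.8])
print(prediction)
```

In this sketch the quality of the synthetic features directly determines the classifier, which is why unstable or collapsed generator output would hurt recognition accuracy in a pipeline of this kind.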
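Memo
Received: 2021-11-01.
Funding: National Natural Science Foundation of China, General Program (U21A20486, 61871182); Natural Science Foundation of Hebei Province, Youth Science Fund (F2021502008); Fundamental Research Funds for the Central Universities, General Program (2021MS081).
About the authors: ZHAI Yongjie, professor, Ph.D.; main research interest: electric power vision. Has led 1 General Program project of the National Natural Science Foundation of China, 1 project of the Natural Science Foundation of Hebei Province, and 12 industry-commissioned research projects; participated in 1 National Key R&D Program project; holds 10 granted invention patents; received 1 first prize of the Shandong Province Science and Technology Progress Award. Has edited 1 book, contributed to 1 textbook and 3 monographs, and published more than 30 academic papers. ZHANG Zhibai, master's student; main research interests: zero-shot learning and artificial intelligence. WANG Yaru, lecturer, Ph.D.; main research interests: pattern recognition and computer vision, data mining, and electric power vision. Has published more than 10 academic papers.
Corresponding author: WANG Yaru. E-mail: wangyaru@ncepu.edu.cn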