[1] GUAN Fengxu, ZHANG Hanyu, LU Siqi, et al. Research status of diffusion models in computer vision[J]. CAAI Transactions on Intelligent Systems, 2025, 20(2): 265-282. [doi:10.11992/tis.202312041]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 20
Issue: No. 2, 2025
Pages: 265-282
Column: Review
Publication date: 2025-03-05
- Title: Research status of diffusion models in computer vision
- Author(s): GUAN Fengxu, ZHANG Hanyu, LU Siqi, LAI Haitao, DU Xue, ZHENG Yan
- Affiliation: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
- Keywords: diffusion model; denoising diffusion probabilistic model; score-based generative model; deep learning; computer vision; image generation; generative model; generative adversarial network
- Classification number: TP18
- DOI: 10.11992/tis.202312041
- Abstract: Diffusion models are a new class of generative models inspired by molecular thermodynamics, offering stable training and a weak dependence on model settings. In recent years, they have been widely applied to a range of tasks and have produced more diverse and higher-quality results than earlier generative models, and they are now a popular baseline method in computer vision. To further promote the development of diffusion models in this field, this paper surveys them. First, it compares the strengths and weaknesses of diffusion models with those of other generative models and introduces their mathematical principles. Then, starting from the common problems of diffusion models, it reviews the improvements researchers have made in recent years, along with application examples across a variety of vision tasks. Finally, it discusses the remaining problems of diffusion models and suggests some possible future development trends.
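Memo
Received: 2023-12-27.
Funding: National Natural Science Foundation of China (62101156).
About the authors: GUAN Fengxu, associate professor, Ph.D.; main research interests: autonomous control of unmanned systems, machine-vision object detection and tracking, and computer control and applications; holds nearly 20 granted invention patents and has published more than 40 academic papers and 5 textbooks. E-mail: guanfengxu@hrbeu.edu.cn; ZHANG Hanyu, master's student; main research interests: image dehazing and computer vision. E-mail: zhy875329435@163.com; LU Siqi, master's student; main research interests: underwater image processing and computer vision. E-mail: lusiqi9803@163.com.
Corresponding author: GUAN Fengxu. E-mail: guanfengxu@hrbeu.edu.cn
Last update: 2025-03-05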