DING Guiguang,CHEN Hui,WANG Ao,et al.Review of model compression and acceleration for visual deep learning[J].CAAI Transactions on Intelligent Systems,2024,19(5):1072-1081.[doi:10.11992/tis.202311011]
《智能系统学报》(CAAI Transactions on Intelligent Systems) [ISSN 1673-4785 / CN 23-1538/TP]
- Volume: 19
- Issue: 2024, No. 5
- Pages: 1072-1081
- Section: Review
- Publication date: 2024-09-05
- Title: Review of model compression and acceleration for visual deep learning
- Authors (Chinese): 丁贵广1,2, 陈辉2, 王澳1,2,3, 杨帆1,2,3, 熊翊哲1,2,3, 梁伊雯1,2,3
- Author(s): DING Guiguang1,2, CHEN Hui2, WANG Ao1,2,3, YANG Fan1,2,3, XIONG Yizhe1,2,3, LIANG Yiwen1,2,3
  1. School of Software, Tsinghua University, Beijing 100084, China;
  2. Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China;
  3. Zhuoxi Institute of Brain and Intelligence, Hangzhou 311121, China
- Keywords: visual deep learning; model compression; lightweight structure; model pruning; model quantization; model distillation; Transformer; token pruning
- CLC number: TP18
- DOI: 10.11992/tis.202311011
- Online date: 2024-08-28
- Abstract: Deep learning models have grown rapidly in scale in recent years, and large-scale visual deep learning models are difficult to deploy for efficient inference in resource-constrained environments such as embedded devices. Model compression and acceleration can effectively address this challenge. Although reviews of related work are available, they generally focus on the compression and acceleration of convolutional neural networks and lack a systematic organization and comparative analysis of compression and acceleration methods for visual Transformer models. This study therefore centers on visual deep learning model compression, organizing the relevant techniques for the two most important families of visual deep models, convolutional neural networks and visual Transformers, and summarizing and analyzing the technical hotspots and challenges. It aims to give researchers a comprehensive view of the model compression and acceleration field and to promote the development of compression and acceleration techniques for deep learning models.
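As an illustration of two of the compression techniques the review surveys (model pruning and model quantization), the toy sketch below applies magnitude-based weight pruning and symmetric per-tensor 8-bit quantization to a random weight matrix. This is a minimal numpy sketch for intuition only; the 50% sparsity target and int8 setting are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)).astype(np.float32)  # toy weight matrix

# Magnitude pruning: zero out the 50% of weights with smallest |w|.
sparsity = 0.5
threshold = np.quantile(np.abs(W), sparsity)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0).astype(np.float32)

# Symmetric per-tensor 8-bit quantization: store int8 codes plus one
# float scale; dequantization reconstructs an approximation of W.
scale = float(np.abs(W).max()) / 127.0
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dq = W_q.astype(np.float32) * scale  # per-element error is at most scale/2
```

In practice these steps are usually followed by fine-tuning (for pruning) or calibration (for quantization) to recover accuracy, which is where the methods compared in the review differ.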
Memo
Received: 2023-11-10.
Funding: National Natural Science Foundation of China (61925107, 62271281); Natural Science Foundation of Zhejiang Province (LDT23F01013F01).
About the authors: DING Guiguang, professor, Ph.D. His main research interests are multimedia information processing and computer vision perception. He has led or participated in dozens of national-level projects, including general programs of the National Natural Science Foundation of China, and has received awards including the Second Prize of the National Science and Technology Progress Award, the First Prize of the Wu Wenjun Artificial Intelligence Science and Technology Progress Award, and the First Prize of the Technology Invention Award of the Chinese Institute of Electronics. He has published nearly 100 academic papers, cited more than 17,000 times. E-mail: dinggg@tsinghua.edu.cn. CHEN Hui, assistant research fellow. His main research interests are computer vision and multimedia information processing. He leads one general program of the National Natural Science Foundation of China and one subtask under the Ministry of Science and Technology "New Generation Artificial Intelligence 2030" initiative. E-mail: jichenhui2012@gmail.com. WANG Ao, Ph.D. candidate. His main research interests are the design and optimization of deep learning models. E-mail: wa22@mails.tsinghua.edu.cn.
Corresponding author: CHEN Hui. E-mail: jichenhui2012@gmail.com
Last update: 2024-09-05