[1] DING Guiguang, CHEN Hui, WANG Ao, et al. Review of model compression and acceleration for visual deep learning[J]. CAAI Transactions on Intelligent Systems, 2024, 19(5): 1072-1081. [doi:10.11992/tis.202311011]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 19
Issue: 5 (2024)
Pages: 1072-1081
Column: Review
Publication date: 2024-09-05
- Title: Review of model compression and acceleration for visual deep learning
- Author(s): DING Guiguang (1, 2); CHEN Hui (2); WANG Ao (1, 2, 3); YANG Fan (1, 2, 3); XIONG Yizhe (1, 2, 3); LIANG Yiwen (1, 2, 3)
  1. School of Software, Tsinghua University, Beijing 100084, China
  2. Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
  3. Zhuoxi Institute of Brain and Intelligence, Hangzhou 311121, China
- Keywords: visual deep learning; model compression; lightweight structure; model pruning; model quantization; model distillation; Transformer; token pruning
- CLC: TP18
- DOI: 10.11992/tis.202311011
- Abstract:
Deep learning models have grown steadily in scale in recent years. Large-scale visual deep learning models are difficult to deploy and run efficiently in resource-constrained environments such as embedded devices. Model compression and acceleration can effectively address this challenge. Although existing reviews cover related work, they generally focus on the compression and acceleration of convolutional neural networks and lack an organized, comparative analysis of compression and acceleration methods for visual Transformer models. This study focuses on compression techniques for visual deep learning models, summarizing and analyzing the relevant methods for both convolutional neural networks and visual Transformer models. It also summarizes and discusses current technical hotspots and challenges. The study aims to give researchers a comprehensive understanding of the model compression and acceleration field and thereby promote the development of compression and acceleration techniques for deep learning models.