NING Xin, ZHAO Wenyao, ZONG Yixin, et al. An overview of the joint optimization method for neural network compression[J]. CAAI Transactions on Intelligent Systems, 2024, 19(1): 36-57. [doi:10.11992/tis.202306042]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 19
Issue: 2024, No. 1
Pages: 36-57
Column: Review
Publication date: 2024-01-05
- Title: An overview of the joint optimization method for neural network compression
- Author(s): NING Xin1, ZHAO Wenyao2, ZONG Yixin3, ZHANG Yugui1, CHEN Hao4, ZHOU Qi1, MA Junxiao1
1. Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China;
2. School of Microelectronics, Hefei University of Technology, Hefei 230009, China;
3. Bureau of Frontier Sciences and Education, Chinese Academy of Sciences, Beijing 100864, China;
4. College of Artificial Intelligence, Nankai University, Tianjin 300071, China
- Keywords: neural network; compression; pruning; quantization; knowledge distillation; model compression; deep learning
- CLC number: TP181
- DOI: 10.11992/tis.202306042
- Online publication date: 2024-01-03
- Abstract:
With the increasing demand for real-time performance, privacy, and security in AI applications, deploying high-performance neural networks on edge computing platforms has become a research hotspot. Because common edge computing platforms are limited in storage, computing power, and power consumption, on-device deployment of deep neural networks remains a major challenge. One way to overcome this challenge is to compress an existing neural network to fit the deployment conditions of the device. Commonly used model compression algorithms include pruning, quantization, and knowledge distillation. Because these methods have complementary strengths, joint compression can achieve better compression and acceleration, and it is becoming a research hotspot. This paper first gives a brief overview of commonly used model compression algorithms, then summarizes three common joint compression schemes: "knowledge distillation + pruning", "knowledge distillation + quantization", and "pruning + quantization", focusing on the basic ideas and methods of joint compression. Finally, key future directions for joint optimization methods of neural network compression are proposed.
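The "pruning + quantization" scheme named in the abstract can be illustrated with a toy sketch (the function names, sparsity, bit width, and weight values below are hypothetical examples for illustration, not from the paper): magnitude pruning zeroes the smallest weights, and symmetric uniform quantization then maps the survivors to low-bit integers.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    k = int(len(weights) * sparsity)
    # Indices of the k smallest-magnitude weights.
    drop = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_uniform(weights, bits=8):
    """Symmetric uniform quantization to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]             # integer codes
    deq = [v * scale for v in q]                        # dequantized floats
    return q, scale, deq

# Joint compression: prune first, then quantize the sparse weights.
w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
pruned = magnitude_prune(w, sparsity=0.5)   # → [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
q, scale, deq = quantize_uniform(pruned, bits=8)
```

The order matters in practice: quantizing after pruning lets the quantizer spend its limited integer range only on the surviving weights, which is one reason the paper treats the combination as a joint optimization rather than two independent steps.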
Memo
Received: 2023-06-21.
Foundation items: National Natural Science Foundation of China (62373343); Beijing Natural Science Foundation (L233036).
About the authors: NING Xin, young research professor, senior member of IEEE/CCF/CAAI. Research interests: computer vision, neural network theory, and optimized computation. Principal investigator of five projects, including National Natural Science Foundation of China grants; author of more than 100 academic papers. E-mail: ningxin@semi.ac.cn. ZHAO Wenyao, undergraduate student. Research interests: lightweight neural network algorithms and hardware acceleration. E-mail: 2020214817@mail.hfut.edu.cn. ZHANG Yugui, assistant research professor, member of IEEE and CCF. Research interests: computer vision, model optimization and acceleration, and digitalization of traditional Chinese medicine. Participant in two National Key R&D Program projects, four National Natural Science Foundation of China projects, and one Ministry of Industry and Information Technology open-competition project; author of more than 20 academic papers. E-mail: zhangyugui@semi.ac.cn
Corresponding author: ZHANG Yugui. E-mail: zhangyugui@semi.ac.cn