YU Chengyu, LI Zhiyuan, MAO Wenyu, et al. Design and implementation of an efficient accelerator for sparse convolutional neural network[J]. CAAI Transactions on Intelligent Systems, 2020, 15(2): 323-333. [doi: 10.11992/tis.201902007]

Design and implementation of an efficient accelerator for sparse convolutional neural network

References:
[1] RUSSAKOVSKY O, DENG Jia, SU Hao, et al. ImageNet large scale visual recognition challenge[J]. International journal of computer vision, 2015, 115(3): 211-252.
[2] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[C]//The 3rd International Conference on Learning Representations (ICLR 2015). San Diego, CA, USA, 2015.
[3] ZHANG Chen, LI Peng, SUN Guangyu, et al. Optimizing FPGA-based accelerator design for deep convolutional neural networks[C]//Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. Monterey, California, USA, 2015.
[4] NATALE G, BACIS M, SANTAMBROGIO M D. On how to design dataflow FPGA-based accelerators for convolutional neural networks[C]//2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). Bochum, Germany, 2017.
[5] SHEN Yongming, FERDMAN M, MILDER P. Maximizing CNN accelerator efficiency through resource partitioning[C]//2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA). Toronto, ON, Canada, 2017.
[6] MA Yufei, CAO Yu, VRUDHULA S, et al. Optimizing loop operation and dataflow in FPGA acceleration of deep convolutional neural networks[C]//Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. Monterey, California, USA, 2017.
[7] SHI Shaohuai, CHU Xiaowen. Speeding up convolutional neural networks by exploiting the sparsity of rectifier units[J]. arXiv preprint arXiv:1704.07724, 2017.
[8] CHEN Y H, KRISHNA T, EMER J S, et al. Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks[J]. IEEE journal of solid-state circuits, 2017, 52(1): 127-138.
[9] ALBERICIO J, JUDD P, HETHERINGTON T, et al. Cnvlutin: ineffectual-neuron-free deep neural network computing[C]//2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA). Seoul, South Korea, 2016.
[10] ZHANG Shijin, DU Zidong, ZHANG Lei, et al. Cambricon-X: an accelerator for sparse neural networks[C]//2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Taipei, China, 2016: 1-12.
[11] HAN Song, LIU Xingyu, MAO Huizi, et al. EIE: efficient inference engine on compressed deep neural network[J]. ACM SIGARCH computer architecture news, 2016, 44(3): 243-254.
[12] PARASHAR A, RHU M, MUKKARA A, et al. SCNN: an accelerator for compressed-sparse convolutional neural networks[C]//Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA). Toronto, ON, Canada, 2017.
[13] OLSHAUSEN B A, FIELD D J. Sparse coding with an overcomplete basis set: a strategy employed by V1?[J]. Vision research, 1997, 37(23): 3311-3325.
[14] DAYAN P, ABBOTT L F. Theoretical neuroscience: computational and mathematical modeling of neural systems[M]. Cambridge, USA: The MIT Press, 2001.
[15] NAIR V, HINTON G E. Rectified linear units improve restricted Boltzmann machines[C]//Proceedings of the 27th International Conference on International Conference on Machine Learning. Haifa, Israel, 2010.
[16] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, Nevada, 2012.
[17] HAN Song, POOL J, TRAN J, et al. Learning both weights and connections for efficient neural networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada, 2015: 1135-1143.
[18] HAN Song, MAO Huizi, DALLY W J. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding[C]//International Conference on Learning Representations (ICLR). San Juan, Puerto Rico, 2016.
[19] MA Yufei, SUDA N, CAO Yu, et al. Scalable and modularized RTL compilation of convolutional neural networks onto FPGA[C]//2016 26th International Conference on Field Programmable Logic and Applications (FPL). Lausanne, Switzerland, 2016: 1-8.
[20] KUNG H T. Why systolic architectures?[J]. IEEE computer, 1982, 15(1): 37-46.
[21] GYSEL P, MOTAMEDI M, GHIASI S. Hardware-oriented approximation of convolutional neural networks[J]. arXiv preprint arXiv:1604.03168, 2016.
[22] DORRANCE R, REN Fengbo, MARKOVIĆ D. A scalable sparse matrix-vector multiplication kernel for energy-efficient sparse-BLAS on FPGAs[C]//ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. Monterey, California, USA, 2014: 161-170.
[23] QIU Jiantao, WANG Jie, YAO Song, et al. Going deeper with embedded FPGA platform for convolutional neural network[C]//Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. Monterey, California, USA, 2016: 26-35.
[24] SUDA N, CHANDRA V, DASIKA G, et al. Throughput-optimized OpenCL-based FPGA accelerator for large-scale convolutional neural networks[C]//Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. Monterey, California, USA, 2016.
[25] ZHANG Chen, FANG Zhenman, ZHOU Peipei, et al. Caffeine: towards uniformed representation and acceleration for deep convolutional neural networks[C]//Proceedings of the 35th International Conference on Computer-Aided Design. Austin, Texas, USA, 2016.
