<-上一篇/Previous Article 下一篇/Next Article->

[1]莫凌飞,蒋红亮,李煊鹏.基于深度学习的视频预测研究综述[J].智能系统学报,2018,13(1):85-96.[doi:10.11992/tis.201707032]
　MO Lingfei,JIANG Hongliang,LI Xuanpeng.Review of deep learning-based video prediction[J].CAAI Transactions on Intelligent Systems,2018,13(1):85-96.[doi:10.11992/tis.201707032]

点击复制

基于深度学习的视频预测研究综述

PDF下载 HTML

《智能系统学报》[ISSN 1673-4785/CN 23-1538/TP] 卷: 13 期数: 2018年第1期页码: 85-96 栏目: 综述出版日期: 2018-01-24

Title:: Review of deep learning-based video prediction

作者:: 莫凌飞, 蒋红亮, 李煊鹏; 东南大学仪器科学与工程学院, 江苏南京 210096

Author(s):: MO Lingfei, JIANG Hongliang, LI Xuanpeng; College of Instrument Science and Engineering, Southeast University, Nanjing 210096, China

关键词:: 视频预测; 深度学习; 无监督学习; 运动预测; 动作识别; 卷积神经网络; 递归神经网络; 自编码器

Keywords:: video prediction; deep learning; unsupervised learning; motion prediction; action recognition; convolution neural network; recurrent neural network; auto encoder

分类号:: TP391

DOI:: 10.11992/tis.201707032

摘要:: 近年来，深度学习算法在众多有监督学习问题上取得了卓越的成果，其在精度、效率和智能化等方面的性能远超传统机器学习算法，部分甚至超越了人类水平。当前，深度学习研究者的研究兴趣逐渐从监督学习转移到强化学习、半监督学习以及无监督学习领域。视频预测算法，因其可以利用海量无标注自然数据去学习视频的内在表征，且在机器人决策、无人驾驶和视频理解等领域具有广泛的应用价值，近两年来得到快速发展。本文论述了视频预测算法的发展背景和深度学习的发展历史，简要介绍了人体动作、物体运动和移动轨迹的预测，重点介绍了基于深度学习的视频预测的主流方法和模型，最后总结了当前该领域存在的问题和发展前景。

Abstract:: In recent years, deep learning algorithms have made significant achievements on various supervised learning problems, with their accuracy, efficiency, and intelligence outperforming traditional machine learning algorithms, in some instances even beyond human capability. Currently, deep learning researchers are gradually turning their interests from supervised learning to the areas of reinforcement learning, weakly supervised learning, and unsupervised learning. Video prediction algorithms have developed rapidly in the last two years due to its capability of using a large amount of unlabeled and naturalistic data to construct the forthcoming video as well as its widespread application value in decision making, autonomous driving, video comprehension, and other fields. In this paper, we review the development background of the video prediction algorithms and the history of deep learning. Then, we briefly introduce the human activity, object movement, and trajectory prediction algorithms, with a focus on mainstream video prediction methods that are based on deep learning. We summarize current problems related to this research and consider the future prospects of this field.

参考文献/References:: [1] LECUN Y. Predictive Learning[R]//Proceedings of the 30th Annual Conference on Neural Information Processing Systems. Barcelona, Spain, 2016
[2] LECUN Y, BENGIO Y, HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436-444.
[3] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of the 26th Annual Conference on Neural Information Processing Systems 2012. South Lake Tahoe, NV, USA, 2012: 1097-1105.
[4] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile, 2015: 1026-1034.
[5] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[Z]. arXiv preprint arXiv: 1409.1556, 2014.
[6] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA, 2016: 770-778.
[7] HINTON G, DENG Li, YU Dong, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups[J]. IEEE signal processing magazine, 2012, 29(6): 82-97.
[8] SUTSKEVER I, VINYALS O, LE Q V. Sequence to sequence learning with neural networks[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal, Quebec, Canada, 2014: 3104-3112.
[9] BENGIO Y, DUCHARME R, VINCENT P, et al. A neural probabilistic language model[J]. Journal of machine learning research, 2003, 3: 1137-1155.
[10] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing atari with deep reinforcement learning[Z]. arXiv preprint arXiv: 1312.5602, 2013.
[11] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.
[12] DENG Jia, DONG Wei, SOCHER R, et al. ImageNet: A large-scale hierarchical image database[C]//Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL, USA, 2009: 248-255.
[13] SRIVASTAVA N, MANSIMOV E, SALAKHUDINOV R. Unsupervised learning of video representations using LSTMs[C]//Proceedings of the 32nd International Conference on Machine Learning. Lille, France, 2015: 843-852.
[14] MCCULLOCH W S, PITTS W. A logical calculus of the ideas immanent in nervous activity[J]. The bulletin of mathematical biophysics, 1943, 5(4): 115-133.
[15] HEBB D O. The organization of behavior: A neuropsychological theory[M]. New York: Chapman & Hall, 1949.
[16] MINSKY M L, PAPERT S A. Perceptrons: an introduction to computational geometry[M]. 2nd ed. Cambridge, UK: MIT Press, 1988.
[17] RUMELHART D E, HINTON G E, WILLIAMS R J. Learning representations by back-propagating errors[J]. Nature, 1986, 323(6088): 533-536.
[18] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[19] HINTON G E, OSINDERO S, TEH Y W. A fast learning algorithm for deep belief nets[J]. Neural computation, 2006, 18(7): 1527-1554.
[20] JORDAN M I. Serial order: A parallel distributed processing approach[J]. Advances in psychology, 1997, 121: 471-495.
[21] BENGIO Y. Learning deep architectures for AI[J]. Foundations and trends in machine learning, 2009, 2(1): 1-127.
[22] GOODFELLOW I J, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal, Quebec, Canada, 2014: 2672-2680.
[23] BENGIO Y, COURVILLE A, VINCENT P. Representation learning: a review and new perspectives[J]. IEEE transactions on pattern analysis and machine intelligence, 2013, 35(8): 1798-1828.
[24] HUBEL D H, WIESEL T N. Receptive fields and functional architecture of monkey striate cortex[J]. The journal of physiology, 1968, 195(1): 215-243.
[25] FUKUSHIMA K, MIYAKE S. Neocognitron: a self-organizing neural network model for a mechanism of visual pattern recognition[M]//AMARI S I, ARBIB M A. Competition and Cooperation in Neural Nets. Berlin Heidelberg: Springer, 1982: 267-285.
[26] ZEILER M D, KRISHNAN D, TAYLOR G W, et al. Deconvolutional networks[C]//Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA, 2010: 2528-2535.
[27] NOH H, HONG S, HAN B. Learning deconvolution network for semantic segmentation[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile, 2015: 1520-1528.
[28] RADFORD A, METZ L, CHINTALA S. Unsupervised representation learning with deep convolutional generative adversarial networks[Z]. arXiv preprint arXiv: 1511.06434, 2015.
[29] JI Shuiwang, XU Wei, YANG Ming, et al. 3D convolutional neural networks for human action recognition[J]. IEEE transactions on pattern analysis and machine intelligence, 2013, 35(1): 221-231.
[30] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural computation, 1997, 9(8): 1735-1780.
[31] GERS F A, SCHMIDHUBER J. Recurrent nets that time and count[C]//Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. Como, Italy, 2000, 3: 189-194.
[32] CHO K, VAN MERRIENBOER B, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[Z]. arXiv preprint arXiv: 1406.1078, 2014.
[33] SHI Xingjian, CHEN Zhourong, WANG Hao, et al. Convolutional LSTM network: a machine learning approach for precipitation nowcasting[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Quebec, Canada, 2015: 802-810.
[34] VINCENT P, LAROCHELLE H, LAJOIE I, et al. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion[J]. Journal of machine learning research, 2010, 11: 3371-3408.
[35] NG A. Sparse autoencoder[R]. CS294A Lecture Notes, 2011: 72.
[36] KINGMA D P, WELLING M. Auto-encoding variational bayes[Z]. arXiv preprint arXiv: 1312.6114, 2013.
[37] REZENDE D J, MOHAMED S, WIERSTRA D. Stochastic backpropagation and approximate inference in deep generative models[Z]. arXiv preprint arXiv: 1401.4082, 2014.
[38] MIRZA M, OSINDERO S. Conditional generative adversarial nets[Z]. arXiv preprint arXiv: 1411.1784, 2014.
[39] CHEN Xi, DUAN Yan, HOUTHOOFT R, et al. InfoGAN: interpretable representation learning by information maximizing generative adversarial nets[C]//Proceedings of the 30th Annual Conference on Neural Information Processing Systems. Barcelona, Spain, 2016: 2172-2180.
[40] LEDIG C, THEIS L, HUSZáR F, et al. Photo-realistic single image super-resolution using a generative adversarial network[Z]. arXiv preprint arXiv: 1609.04802, 2016.
[41] WU Jiajun, ZHANG Chengkai, XUE Tianfan, et al. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling[C]//Proceedings of the 30th Annual Conference on Neural Information Processing Systems. Barcelona, Spain, 2016: 82-90.
[42] ISOLA P, ZHU Junyan, ZHOU Tinghui, et al. Image-to-image translation with conditional adversarial networks[Z]. arXiv preprint arXiv: 1611.07004, 2016.
[43] VONDRICK C, PIRSIAVASH H, TORRALBA A. Generating videos with scene dynamics[C]//Proceedings of the 30th Annual Conference on Neural Information Processing Systems. Barcelona, Spain, 2016: 613-621.
[44] VONDRICK C, PIRSIAVASH H, TORRALBA A. Anticipating visual representations from unlabeled video[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, Nevada, USA, 2016: 98-106.
[45] LAN Tian, CHEN T C, SAVARESE S. A hierarchical representation for future action prediction[C]//Proceedings of the 13th European Conference on Computer Vision. Zürich, Switzerland, 2014: 689-704.
[46] HOAI M, DE LA TORRE F. Max-margin early event detectors[J]. International journal of computer vision, 2014, 107(2): 191-202.
[47] RYOO M S. Human activity prediction: Early recognition of ongoing activities from streaming videos[C]//Proceedings of the 2011 IEEE International Conference on Computer Vision. Barcelona, Spain, 2011: 1036-1043.
[48] VU T H, OLSSON C, LAPTEV I, et al. Predicting actions from static scenes[C]//Proceedings of the 13th European Conference on Computer Vision. Zürich, Switzerland, 2014: 421-436.
[49] PEI Mingtao, JIA Yunde, ZHU Songchun. Parsing video events with goal inference and intent prediction[C]//Proceedings of the 2011 IEEE International Conference on Computer vision. Barcelona, Spain, 2011: 487-494.
[50] FOUHEY D F, ZITNICK C L. Predicting object dynamics in scenes[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA, 2014: 2027-2034.
[51] KOPPULA H S, SAXENA A. Anticipating human activities using object affordances for reactive robotic response[J]. IEEE transactions on pattern analysis and machine intelligence, 2016, 38(1): 14-29.
[52] HUANG Dean, KITANI K M. Action-reaction: Forecasting the dynamics of human interaction[C]//Proceedings of the 13th European Conference on Computer Vision. Zürich, Switzerland, 2014: 489-504.
[53] PICKUP L C, PAN Zheng, WEI Donglai, et al. Seeing the arrow of time[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA, 2014: 2043-2050.
[54] LAMPERT C H. Predicting the future behavior of a time-varying probability distribution[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA, USA, 2015: 942-950.
[55] PINTEA S L, VAN GEMERT J C, SMEULDERS A W M. Déja vu: Motion prediction in static images[C]//Proceedings of the 13th European Conference on Computer Vision. Zürich, Switzerland, 2014: 172-187.
[56] KITANI K M, ZIEBART B D, BAGNELL J A, et al. Activity forecasting[C]//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy, 2012: 201-214.
[57] GONG Haifeng, SIM J, LIKHACHEV M, et al. Multi-hypothesis motion planning for visual object tracking[C]//Proceedings of the 2011 IEEE International Conference on Computer Vision. Barcelona, Spain, 2011: 619-626.
[58] KOOIJ J F P, SCHNEIDER N, FLOHR F, et al. Context-based pedestrian path prediction[C]//Proceedings of the 13th European Conference on Computer Vision. Zürich, Switzerland, 2014: 618-633.
[59] WALKER J, DOERSCH C, GUPTA A, et al. An uncertain future: Forecasting from static images using variational autoencoders[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, Netherlands, 2016: 835-851.
[60] WALKER J, GUPTA A, HEBERT M. Dense optical flow prediction from a static image[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile, 2015: 2443-2451.
[61] WALKER J, GUPTA A, HEBERT M. Patch to the future: Unsupervised visual prediction[C]//Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA, 2014: 3302-3309.
[62] YUEN J, TORRALBA A. A data-driven approach for event prediction[C]//Proceedings of the 11th European Conference on Computer Vision. Heraklion, Crete, Greece, 2010: 707-720.
[63] MOTTAGHI R, RASTEGARI M, GUPTA A, et al. “What happens if...” learning to predict the effect of forces in images[C]//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, Netherlands, 2016: 269-285.
[64] SCHUKDT C, LAPTEV I, CAPUTO B. Recognizing human actions: a local SVM approach[C]//Proceedings of the 17th International Conference on Pattern Recognition. Cambridge, UK, 2004, 3: 32-36.
[65] VUKOTI V, PINTEA S L, RAYMOND C, et al. One-step time-dependent future video frame prediction with a convolutional encoder-decoder neural network[C]//Proceedings of the 19th International Conference on Image Analysis and Processing. Catania, Italy, 2017: 140-151.
[66] IONESCU C, PAPAVA D, OLARU V, et al. Human3.6M: Large scale datasets and predictive methods for 3D human sensing in natural environments[J]. IEEE transactions on pattern analysis and machine intelligence, 2014, 36(7): 1325-1339.
[67] YAN Yichao, XU Jingwei, NI Bingbing, et al. Skeleton-aided articulated motion generation[Z]. arXiv preprint arXiv: 1707.01058, 2017.
[68] VILLEGAS R, YANG Jimei, ZOU Yuliang, et al. Learning to generate long-term future via hierarchical prediction[Z]. arXiv preprint arXiv: 1704.05831, 2017.
[69] SOOMRO K, ZAMIR A R, SHAH M. UCF101: A dataset of 101 human actions classes from videos in the wild[Z]. arXiv preprint axXiv:1202.0402, 2012
[70] MATHIEU M, COUPRIE C, LECUN Y. Deep multi-scale video prediction beyond mean square error[Z]. arXiv preprint arXiv: 1511.05440, 2015.
[71] HINTZ J J. Generative adversarial reservoirs for natural video prediction[D]. Austin, USA: The University of Texas.
[72] VILLEGAS R, YANG Jimei, HONG S, et al. Decomposing motion and content for natural video sequence prediction[C]//Proceedings of the 2017 International Conference on Learning Representations. Toulon, France, 2017.
[73] LIU Ziwei, et al. Video frame synthesis using deep voxel flow[C]//Proceeding of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, Hawaii, USA, 2017:4463-4471
[74] GORBAN A, IDREES H, JIANG Yugang, et al. THUMOS challenge: Action recognition with a large number of classes[EB/OL]. (2015–05). http://www.thumos.info.
[75] GEIGER A, LENZ P, STILLER C, et al. Vision meets robotics: the KITTI dataset[J]. The international journal of robotics research, 2013, 32(11): 1231-1237.
[76] LOTTER W, KREIMAN G, COX D. Deep predictive coding networks for video prediction and unsupervised learning[Z]. arXiv preprint arXiv: 1605.08104, 2016.
[77] Kuehne H, Jhuang H, Garrote E, et al. HMDB: A large video database for human motion recognition[C]//Proceeding of the 2011 IEEE International Conference on Computer Vision, ICCV. Barcelona, Spain, 2011:2556-2563.
[78] CORDTS M, OMRAN M, RAMOS S, et al. The cityscapes dataset for semantic urban scene understanding[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA, 2016: 3213-3223.
[79] JIN Xiaojie, LI Xin, XIAO Huaxin, et al. Video scene parsing with predictive feature learning[Z]. arXiv preprint arXiv: 1612.00119, 2016.
[80] LOTTER W, KREIMAN G, COX D. Unsupervised learning of visual structure using predictive generative networks[Z]. arXiv preprint arXiv: 1511.06380, 2015.
[81] YAN Xing, CHANG Hong, SHAN Shiguang, et al. Modeling video dynamics with deep dynencoder[C]//Proceedings of the 13th European Conference on Computer Vision. Zürich, Switzerland, 2014: 215-230.
[82] RANZATO M, SZLAM A, BRUNA J, et al. Video (language) modeling: a baseline for generative models of natural videos[Z]. arXiv preprint arXiv: 1412.6604, 2014.
[83] OH J, GUO Xiaoxiao, LEE H, et al. Action-conditional video prediction using deep networks in atari games[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Quebec, Canada, 2015: 2863-2871.
[84] FINN C, GOODFELLOW I, LEVINE S. Unsupervised learning for physical interaction through video prediction[C]//Proceedings of the 30th Conference on Neural Information Processing Systems. Barcelona, Spain, 2016: 64-72.
[85] LUC P, NEVEROVA N, COUPRIE C, et al. Predicting deeper into the future of semantic segmentation[Z]. arXiv preprint arXiv: 1703.07684, 2017.
[86] CHEN Xiongtao, WANG Wenmin, WANG Jinzhou, et al. Long-term video interpolation with bidirectional predictive network[Z]. arXiv preprint arXiv: 1706.03947, 2017.
[87] XUE Tianfan, WU Jiajun, BOUMAN K, et al. Visual dynamics: Probabilistic future frame synthesis via cross convolutional networks[C]//Proceedings of the 30th Annual Conference on Neural Information Processing Systems. Barcelona, Spain, 2016: 91-99.
[88] DENTON E, BIRODKAR V. Unsupervised learning of disentangled representations from video[Z]. arXiv preprint arXiv: 1705.10915, 2017.

相似文献/References:: [1]张媛媛,霍静,杨婉琪,等.深度信念网络的二代身份证异构人脸核实算法[J].智能系统学报,2015,10(2):193.[doi:10.3969/j.issn.1673-4785.201405060]
　ZHANG Yuanyuan,HUO Jing,YANG Wanqi,et al.A deep belief network-based heterogeneous face verification method for the second-generation identity card[J].CAAI Transactions on Intelligent Systems,2015,10():193.[doi:10.3969/j.issn.1673-4785.201405060]
[2]丁科,谭营.GPU通用计算及其在计算智能领域的应用[J].智能系统学报,2015,10(1):1.[doi:10.3969/j.issn.1673-4785.201403072]
　DING Ke,TAN Ying.A review on general purpose computing on GPUs and its applications in computational intelligence[J].CAAI Transactions on Intelligent Systems,2015,10():1.[doi:10.3969/j.issn.1673-4785.201403072]
[3]马晓,张番栋,封举富.基于深度学习特征的稀疏表示的人脸识别方法[J].智能系统学报,2016,11(3):279.[doi:10.11992/tis.201603026]
　MA Xiao,ZHANG Fandong,FENG Jufu.Sparse representation via deep learning features based face recognition method[J].CAAI Transactions on Intelligent Systems,2016,11():279.[doi:10.11992/tis.201603026]
[4]刘帅师,程曦,郭文燕,等.深度学习方法研究新进展[J].智能系统学报,2016,11(5):567.[doi:10.11992/tis.201511028]
　LIU Shuaishi,CHENG Xi,GUO Wenyan,et al.Progress report on new research in deep learning[J].CAAI Transactions on Intelligent Systems,2016,11():567.[doi:10.11992/tis.201511028]
[5]马世龙,乌尼日其其格,李小平.大数据与深度学习综述[J].智能系统学报,2016,11(6):728.[doi:10.11992/tis.201611021]
　MA Shilong,WUNIRI Qiqige,LI Xiaoping.Deep learning with big data: state of the art and development[J].CAAI Transactions on Intelligent Systems,2016,11():728.[doi:10.11992/tis.201611021]
[6]王亚杰,邱虹坤,吴燕燕,等.计算机博弈的研究与发展[J].智能系统学报,2016,11(6):788.[doi:10.11992/tis.201609006]
　WANG Yajie,QIU Hongkun,WU Yanyan,et al.Research and development of computer games[J].CAAI Transactions on Intelligent Systems,2016,11():788.[doi:10.11992/tis.201609006]
[7]黄心汉.A3I:21世纪科技之光[J].智能系统学报,2016,11(6):835.[doi:10.11992/tis.201605022]
　HUANG Xinhan.A3I: the star of science and technology for the 21st century[J].CAAI Transactions on Intelligent Systems,2016,11():835.[doi:10.11992/tis.201605022]
[8]宋婉茹,赵晴晴,陈昌红,等.行人重识别研究综述[J].智能系统学报,2017,12(6):770.[doi:10.11992/tis.201706084]
　SONG Wanru,ZHAO Qingqing,CHEN Changhong,et al.Survey on pedestrian re-identification research[J].CAAI Transactions on Intelligent Systems,2017,12():770.[doi:10.11992/tis.201706084]
[9]杨梦铎,栾咏红,刘文军,等.基于自编码器的特征迁移算法[J].智能系统学报,2017,12(6):894.[doi:10.11992/tis.201706037]
　YANG Mengduo,LUAN Yonghong,LIU Wenjun,et al.Feature transfer algorithm based on an auto-encoder[J].CAAI Transactions on Intelligent Systems,2017,12():894.[doi:10.11992/tis.201706037]
[10]王科俊,赵彦东,邢向磊.深度学习在无人驾驶汽车领域应用的研究进展[J].智能系统学报,2018,13(1):55.[doi:10.11992/tis.201609029]
　WANG Kejun,ZHAO Yandong,XING Xianglei.Deep learning in driverless vehicles[J].CAAI Transactions on Intelligent Systems,2018,13():55.[doi:10.11992/tis.201609029]

备注/Memo

收稿日期:2017-07-19。
基金项目:国家十二五科技支撑计划重点项目（2015BAG09B01）.
作者简介:莫凌飞,男,1981年生,副教授,博士,主要研究方向为机器学习与人工智能、物联网与边缘计算、智能机器人。发表学术论文多篇,其中被SCI、EI检索40余篇;蒋红亮,男,1993年生,硕士研究生,主要研究方向为深度无监督学习和计算机视觉;李煊鹏,男,1985年生,讲师,博士,主要研究方向为机器视觉、驾驶辅助系统、环境感知与信息融合。
通讯作者:莫凌飞.E-mail:lfmo@seu.edu.cn.

更新日期/Last Update: 2018-02-01

基于深度学习的视频预测研究综述 PDF下载HTML

备注/Memo

基于深度学习的视频预测研究综述

PDF下载 HTML