[1] MA Shilong, WUNIRI Qiqige, LI Xiaoping. Deep learning with big data: state of the art and development[J]. CAAI Transactions on Intelligent Systems, 2016, (6): 728-742. [doi:10.11992/tis.201611021]

Deep learning with big data: state of the art and development

CAAI Transactions on Intelligent Systems [ISSN:1673-4785/CN:23-1538/TP]

Volume:
Issue: 2016, No. 6
Pages: 728-742
Column:
Publication date: 2017-01-20

Article Info

Title:
Deep learning with big data: state of the art and development
Author(s):
MA Shilong, WUNIRI Qiqige, LI Xiaoping
State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China
Keywords:
big data; machine learning; deep network; deep learning; neural network; artificial intelligence; learning algorithm; derivation tree
CLC number:
TP311
DOI:
10.11992/tis.201611021
Abstract:
The era of big data has transformed traditional, statistics-based data science and driven innovation in data-analysis methods. Deep learning, which evolved from machine learning and multilayer neural networks, is currently at the research frontier of big-data processing and analysis. The path from machine learning to deep learning spans decades of research and practice: early symbolic-inductive machine learning, statistical machine learning, artificial neural networks, and the data mining that emerged in the 1990s. This foundation makes deep learning a notable tool for discovering the potential value behind big data. This survey summarizes big data and deep learning; in particular, it presents a derivation tree that relates the major deep architectures and their learning algorithms, and it reviews well-known applications of deep learning in several domains. Finally, it discusses the challenges facing deep learning with big data and identifies future trends.

References:

[1] TOLLE K M, TANSLEY D, HEY A J G. The fourth paradigm: data-intensive scientific discovery[J]. Proceedings of the IEEE, 2011, 99(8): 1334-1337.
[2] MASHEY J R. Big Data and the next wave of infra stress[D]. Berkeley: University of California, 1997.
[3] MAYER-SCHÖNBERGER V, CUKIER K. Big data: a revolution that will transform how we live, work, and think[M]. Boston: Eamon Dolan, 2013.
[4] HILBERT M, LÓPEZ P. The world’s technological capacity to store, communicate, and compute information[J]. Science, 2011, 332(6025):60-65.
[5] LANEY D. 3D data management:controlling data volume, velocity, and variety[R]. META Group Research Note, 2001.
[6] IDC. IIIS:the "four vs" of big data[EB/OL].[2016-11-11]. http://www.computerworld.com.au/article/396198/iiis_four_vs_big_data/.
[7] SCHROECK M J, SHOCKLEY R, SMART J, et al. Analytics:the real-world use of big data[R]. Oxford:IBM, 2012.
[8] IBM. The four v’s of big data[EB/OL]. 2014[2016-11-11]. http://www.ibmbigdatahub.com/infographic/four-vs-big-data.
[9] 郭平, 王可, 罗阿理, 等. 大数据分析中的计算智能研究现状与展望[J]. 软件学报, 2015, 26(11):3010-3025. GUO Ping, WANG Ke, LUO Ali, et al. Computational intelligence for big data analysis:current status and future prospect[J]. Journal of software, 2015, 26(11):3010-3025.
[10] Gartner. Big data[EB/OL].[2016-11-11]. http://www.gartner.com/it-glossary/big-data/.
[11] MANYIKA J, CHUI M, BROWN B, et al. Big data:the next frontier for innovation, competition, and productivity[R]. Analytics:McKinsey & Company, 2011.
[12] Wikipedia. Big data[EB/OL]. 2009.[2016-11-11]. https://en.wikipedia.org/wiki/Big_data.
[13] JAMES J. How much data is created every minute?[EB/OL].[2016-11-11]. https://www.domo.com/blog/how-much-data-is-created-every-minute/.
[14] 维克托·迈尔·舍恩伯格, 周涛. 大数据时代生活、工作与思维的大变革[M]. 周涛, 译. 杭州:浙江人民出版社, 2013:136-136.
[15] 孟小峰,慈祥. 大数据管理:概念、技术与挑战[J]. 计算机研究与发展, 2013, 50(1):146-169. MENG Xiaofeng, CI Xiang. Big data management:concepts, techniques and challenges[J]. Journal of computer research and development, 2013, 50(1):146-169.
[16] 王意洁, 孙伟东, 周松, 等. 云计算环境下的分布存储关键技术[J]. 软件学报, 2012, 23(4):962-986. WANG Yijie, SUN Weidong, ZHOU Song, et al. Key technologies of distributed storage for cloud computing[J]. Journal of software, 2012, 23(4):962-986.
[17] GANTZ J, REINSEL D. Extracting value from chaos[R]. Idcemc2 Report, 2011.
[18] 程学旗, 靳小龙, 王元卓, 等. 大数据系统和分析技术综述[J]. 软件学报, 2014, 25(9):1889-1908. CHENG Xueqi, JIN Xiaolong, WANG Yuanzhuo, et al. Survey on big data system and analytic technology[J]. Journal of software, 2014, 25(9):1889-1908.
[19] STUART D. The data revolution:big data, open data, data infrastructures and their consequences[J]. Online information review, 2015, 39(2):272.
[20] LABRINIDIS A, JAGADISH H V. Challenges and opportunities with big data[J]. Proceedings of the vldb endowment, 2012, 5(12):2032-2033.
[21] ABU-MOSTAFA Y S, MAGDON-ISMAIL M, LIN H T. Learning from data:a short course[M]. Chicago:Amlbook, 2012.
[22] 洪家荣. 机器学习--回顾与展望[J]. 计算机科学, 1991, 18(2):1-8. HONG Jiarong. Machine learning-review and vision[J]. Computer science, 1991,18(2):1-8.
[23] SAMUEL A L. Some studies in machine learning using the game of checkers. II-recent progress[J]. Annual review in automatic programming, 1969, 6:1-36.
[24] ROSENBLATT F. The perceptron-a perceiving and recognizing automaton[R]. Ithaca, NY:Cornell Aeronautical Laboratory, 1957.
[25] WIDROW B, LEHR M A. 30 years of adaptive neural networks:perceptron, Madaline, and backpropagation[J]. Proceedings of the IEEE, 1990, 78(9):1415-1442.
[26] MINSKY M, PAPERT S A. Perceptrons:an introduction to computational geometry, expanded edition[M]. Cambridge, Mass:MIT Press, 1988:449-452.
[27] 王珏, 石纯一. 机器学习研究[J]. 广西师范大学学报:自然科学版, 2003, 21(2):1-15. WANG Jue, SHI Chunyi. Investigations on machine learning[J]. Journal of Guangxi normal university:natural science edition, 2003, 21(2):1-15.
[28] CORTES C, VAPNIK V. Support-vector networks[J]. Machine learning, 1995, 20(3):273-297.
[29] REYNOLDS D A, ROSE R C, SMITH M J T. A mixture modeling approach to text-independent speaker identification[J]. Journal of the acoustical society of america, 1990, 87(S1):109.
[30] RUMELHART D E, MCCLELLAND J L. Parallel distributed processing:explorations in the microstructure of cognition:foundations[M]. Cambridge, Mass:MIT Press, 1987.
[31] WERBOS P J. Backpropagation through time:what it does and how to do it[J]. Proceedings of the IEEE, 1990, 78(10):1550-1560.
[32] WU Xindong, KUMAR V, QUINLAN J R, et al. Top 10 algorithms in data mining[J]. Knowledge and information systems, 2008, 14(1):1-37.
[33] GORI M, TESI A. On the problem of local minima in backpropagation[J]. IEEE transactions on pattern analysis and machine intelligence, 1992, 14(1): 76-86.
[34] FLETCHER L, KATKOVNIK V, STEFFENS F E, et al. Optimizing the number of hidden nodes of a feedforward artificial neural network[C]//Proceedings of the 1998 IEEE International Joint Conference on Neural Networks. Anchorage, AK: IEEE, 1998, 2: 1608-1612.
[35] LECUN Y, BENGIO Y, HINTON G. Deep learning[J]. Nature, 2015, 521(7553):436-444.
[36] BENGIO Y, COURVILLE A, VINCENT P. Representation learning: a review and new perspectives[J]. IEEE transactions on pattern analysis and machine intelligence, 2013, 35(8): 1798-1828.
[37] ACKLEY D H, HINTON G E, SEJNOWSKI T J. A learning algorithm for Boltzmann machines[J]. Cognitive science, 1985, 9(1): 147-169.
[38] 刘建伟, 刘媛, 罗雄麟. 玻尔兹曼机研究进展[J]. 计算机研究与发展, 2014, 51(1): 1-16. LIU Jianwei, LIU Yuan, LUO Xionglin. Research and development on Boltzmann machines[J]. Journal of computer research and development, 2014, 51(1): 1-16.
[39] SALAKHUTDINOV R, HINTON G. Deep Boltzmann machines[J]. Journal of machine learning research, 2009, 5(2): 1997-2006.
[40] SMOLENSKY P. Information processing in dynamical systems:foundations of harmony theory[M]. Cambridge, Mass:MIT Press, 1986:194-281.
[41] HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks[J]. Science, 2006, 313(5786):504-507.
[42] HINTON G E. Training products of experts by minimizing contrastive divergence[J]. Neural computation, 2002, 14(8): 1771-1800.
[43] 张建明, 詹智财, 成科扬, 等. 深度学习的研究与发展[J]. 江苏大学学报:自然科学版, 2015, 36(2):191-200. ZHANG Jianming, ZHAN Zhicai, CHENG Keyang, et al. Review on development of deep learning[J]. Journal of Jiangsu university:natural science edition, 2015, 36(2):191-200.
[44] 孙志远, 鲁成祥, 史忠植, 等. 深度学习研究与进展[J]. 计算机科学, 2016, 43(2):1-8. SUN Zhiyuan, LU Chengxiang, SHI Zhongzhi, et al. Research and advances on deep learning[J]. Computer science, 2016, 43(2):1-8.
[45] SCHMIDHUBER J. Deep learning in neural networks: an overview[J]. Neural networks, 2015, 61: 85-117.
[46] CHEN H, MURRAY A. A continuous restricted Boltzmann machine with a hardware-amenable learning algorithm[J]. Lecture notes in computer science, 2002, 2415: 358-363.
[47] LUO Heng, SHEN Ruimin, NIU Changyong. Sparse group restricted Boltzmann machines[C]//Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence. San Francisco, California, USA: AAAI Press, 2011.
[48] LEE H, LARGMAN Y, PHAM P, et al. Unsupervised feature learning for audio classification using convolutional deep belief networks[C]//Advances in Neural Information Processing Systems 22:Conference on Neural Information Processing Systems 2009. Vancouver, British Columbia, Canada, 2009.
[49] HALKIAS X, PARIS S, GLOTIN H. Sparse penalty in deep belief networks:using the mixed norm constraint[J]. Computer science, 2013.
[50] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[J]. Advances in neural information processing systems, 2014: 2672-2680.
[51] BENGIO Y, LAMBLIN P, POPOVICI D, et al. Greedy layer-wise training of deep networks[C]//Advances in Neural Information Processing Systems 19. Canada, 2006: 153-160.
[52] RANZATO M A, BOUREAU Y L, LECUN Y. Sparse feature learning for deep belief networks[J]. Advances in neural information processing systems, 2007, 20:1185-1192.
[53] VINCENT P, LAROCHELLE H, BENGIO Y, et al. Extracting and composing robust features with denoising autoencoders[C]//Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland, 2008: 1096-1103.
[54] VINCENT P, LAROCHELLE H, LAJOIE I, et al. Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion[J]. Journal of machine learning research, 2010, 11(12): 3371-3408.
[55] JIANG Xiaojuan, ZHANG Yinghua, ZHANG Wensheng, et al. A novel sparse auto-encoder for deep unsupervised learning[C]//Proceedings of Sixth International Conference on Advanced Computational Intelligence. Hangzhou:IEEE, 2013:256-261.
[56] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[57] WANG Wei, OOI B C, YANG Xiaoyan, et al. Effective multi-modal retrieval based on stacked auto-encoders[J]. Proceedings of the VLDB endowment, 2014, 7(8):649-660.
[58] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Advances in neural information processing systems, 2012, 25: 1097-1105.
[59] ELMAN J L. Finding structure in time[J]. Cognitive science, 1990, 14(2):179-211.
[60] HIHI S E, BENGIO Y. Hierarchical recurrent neural networks for long-term dependencies[J]. Advances in neural information processing systems, 1995, 8: 493-499.
[61] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural computation, 1997, 9(8):1735-1780.
[62] CHO K, MERRIENBOER B V, BAHDANAU D, et al. On the properties of neural machine translation: encoder-decoder approaches[J]. Computer science, 2014.
[63] MNIH V, HEESS N, GRAVES A, et al. Recurrent models of visual attention[J]. Computer science, 2014: 2204-2212.
[64] GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[J]. Advances in neural information processing systems, 2014: 2672-2680.
[65] RADFORD A, METZ L, CHINTALA S. Unsupervised representation learning with deep convolutional generative adversarial networks[J]. Computer science, 2015.
[66] XUE J H, TITTERINGTON D M. Comment on "on discriminative vs. generative classifiers: a comparison of logistic regression and naive Bayes"[J]. Neural processing letters, 2008, 28(3): 169-187.
[67] HINTON G E, OSINDERO S, TEH Y W. A fast learning algorithm for deep belief nets[J]. Neural computation, 2006, 18(7):1527-1554.
[68] LECUN Y, JACKEL L D, BOTTOU L, et al. Learning algorithms for classification:a comparison on handwritten digit recognition[M]//OH J H, CHO S. Neural Networks:The Statistical Mechanics Perspective. Singapore:World Scientific, 1995.
[69] LE Q V. Building high-level features using large scale unsupervised learning[C]//Proceedings of 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Vancouver, BC:IEEE, 2013:8595-8598.
[70] NYTIMES. In a big network of computers evidence of machine learning[EB/OL].[2016-11-11].http://www.nytimes.com/2012/06/26/technology/in-a-big-network-of-computers-evidence-of-machine-learning.html?pagewanted=all.
[71] SUN Yi, WANG Xiaogang, TANG Xiaoou. Deep learning face representation from predicting 10,000 classes[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH:IEEE, 2014.
[72] BBC. Artificial intelligence: Google’s AlphaGo beats Go master Lee Se-dol[EB/OL]. 2016.[2016-11-11]. http://www.bbc.com/news/technology-35785875.
[73] SILVER D, HUANG J, MADDISON C J, et al. Mastering the game of go with deep neural networks and tree search[J]. Nature, 2016, 529(7587):484-489.
[74] MOHAMED A R, DAHL G E, HINTON G. Acoustic modeling using deep belief networks[J]. IEEE transactions on audio, speech, and language processing, 2012, 20(1):14-22.
[75] PAN Jia, LIU Cong, WANG Zhiguo, et al. Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition:why DNN surpasses GMMS in acoustic modeling[C]//Proceedings of the 8th International Symposium on Chinese Spoken Language Processing (ISCSLP). Kowloon:IEEE, 2012:301-305.
[76] Microsoft. Microsoft audio video indexing service[EB/OL].[2016-11-11]. https://www.microsoft.com/en-us/research/project/mavis/.
[77] SEIDE F, LI Gang, YU Dong. Conversational speech transcription using context-dependent deep neural networks[C]//INTERSPEECH 2011, Conference of the International Speech Communication Association. Florence, Italy, 2011.
[78] MORIN F, BENGIO Y. Hierarchical probabilistic neural network language model[C]//Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics. Society for Artificial Intelligence and Statistics, 2005.
[79] COLLOBERT R, WESTON J. A unified architecture for natural language processing:deep neural networks with multitask learning[C]//Proceedings of the 25th International Conference on Machine Learning (ICML). NEC Laboratories America, Inc, 2008.
[80] MNIH A, HINTON G. A scalable hierarchical distributed language model[C]//Proceedings of the Conference on Neural Information Processing Systems. Vancouver, British Columbia, Canada, 2008.
[81] MIKOLOV T, KOMBRINK S, ČERNOCKÝ J, et al. Extensions of recurrent neural network language model[C]//Proceedings of 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Prague: IEEE, 2011.
[82] MIKOLOV T, DEORAS A, POVEY D, et al. Strategies for training large scale neural network language models[C]//Proceedings of 2011 IEEE Workshop on Automatic Speech Recognition and Understanding. Waikoloa, HI:IEEE, 2011.
[83] MIKOLOV T, ZWEIG G. Context dependent recurrent neural network language model[C]//Proceedings of 2012 IEEE Spoken Language Technology Workshop (SLT). Miami, FL:IEEE, 2012.
[84] MIKOLOV T, KARAFIÁT M, BURGET L, et al. Recurrent neural network based language model[C]//Proceedings of the INTERSPEECH 2010, 11th Conference of the International Speech Communication Association. Makuhari, Chiba, Japan, 2010.
[85] HUANG E H, SOCHER R, MANNING C D, et al. Improving word representations via global context and multiple word prototypes[C]//Proceedings of the Meeting of the Association for Computational Linguistics: Long Papers. 2012.
[86] MIKOLOV T, CHEN K, CORRADO G, et al. Efficient estimation of word representations in vector space[J]. Computer science, 2013.
[87] BAEZA-YATES R A, RIBEIRO-NETO B. Modern information retrieval:the concepts and technology behind search[M]. 2nd ed. New York:Addison Wesley,2011:26-28.
[89] HARRINGTON P. Machine learning in action[M]. Shelter Island, N.Y.:Manning Publications Co, 2012.
[90] 郑胤, 陈权崎, 章毓晋. 深度学习及其在目标和行为识别中的新进展[J]. 中国图象图形学报, 2014, 19(2):175-184. ZHENG Yin, CHEN Quanqi, ZHANG Yujin. Deep learning and its new progress in object and behavior recognition[J]. Journal of image and graphics, 2014, 19(2):175-184.
[91] CHEN Xuewen, LIN Xiaotong. Big data deep learning:challenges and perspectives[J]. IEEE access, 2014, 2:514-525.

Similar Literature/References:

[1] YE Zhi-fei, WEN Yi-min, LU Bao-liang. A survey of imbalanced pattern classification problems[J]. CAAI Transactions on Intelligent Systems, 2009, (2): 148.
[2] LIU Yi-qun, ZHANG Min, MA Shao-ping. Web key resource page selection based on non-content information[J]. CAAI Transactions on Intelligent Systems, 2007, (1): 45.
[3] MA Shi-long, SUI Yue-fei, XU Ke. Limit behavior of prioritized inductive logic programs[J]. CAAI Transactions on Intelligent Systems, 2007, (4): 9.
[4] YAO Futian, QIAN Yuntao. Gaussian process and its applications in hyperspectral image classification[J]. CAAI Transactions on Intelligent Systems, 2011, (5): 396.
[5] WEN Yimin, QIANG Baohua, FAN Zhigang. A survey of the classification of data streams with concept drift[J]. CAAI Transactions on Intelligent Systems, 2013, (2): 95. [doi:10.3969/j.issn.1673-4785.201208012]
[6] YANG Chengdong, DENG Tingquan. An approach to attribute reduction combining attribute selection and deletion[J]. CAAI Transactions on Intelligent Systems, 2013, (2): 183. [doi:10.3969/j.issn.1673-4785.201209056]
[7] HU Xiaosheng, ZHONG Yong. Support vector machine imbalanced data classification based on weighted clustering centroid[J]. CAAI Transactions on Intelligent Systems, 2013, (3): 261.
[8] XIN Yuxuan, YAN Zifei. Research progress of image retrieval based on hand-drawn sketches[J]. CAAI Transactions on Intelligent Systems, 2015, (2): 167. [doi:10.3969/j.issn.1673-4785.201401045]
[9] DING Ke, TAN Ying. A review on general purpose computing on GPUs and its applications in computational intelligence[J]. CAAI Transactions on Intelligent Systems, 2015, (1): 1. [doi:10.3969/j.issn.1673-4785.201403072]
[10] KONG Qingchao, MAO Wenji, ZHANG Yuhao. User comment behavior prediction in social networking sites[J]. CAAI Transactions on Intelligent Systems, 2015, (3): 349. [doi:10.3969/j.issn.1673-4785.201403019]

Memo:
Received: 2016-11-15.
Foundation items: National Natural Science Foundation of China (61003016, 61300007, 61305054); Key Science and Technology Innovation Project of the Fundamental Research Funds from the Ministry of Science and Technology (YWF-14-JSJXY-007); Exploratory Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2012ZX-28, SKLSDE-2014ZX-06).
About the authors: MA Shilong, male, born in 1953, professor and doctoral supervisor, executive director of the Chinese Association for Artificial Intelligence (CAAI) and chair of its Committee on Foundations of Artificial Intelligence. His research interests include computational models for massive information processing, automated reasoning, and software engineering. In recent years he received awards including a second prize of the National Defense Science and Technology Progress Award (2012), and he has published more than 160 papers in academic journals and conferences at home and abroad. WUNIRI Qiqige, female, born in 1979, Ph.D. candidate. Her research interests include cloud computing and big data, and formal methods for computer software. LI Xiaoping, male, born in 1979, Ph.D. candidate. His research interests include cloud computing and big data, and formal methods for computer software.
Corresponding author: LI Xiaoping. E-mail: lee.rex@163.com.