[1] HUANG Heyan, CAO Zhao, FENG Chong. Opportunities and challenges of big data intelligence analysis[J]. CAAI Transactions on Intelligent Systems, 2016, 11(6):719-727. [doi:10.11992/tis.201610025]

Opportunities and challenges of big data intelligence analysis
CAAI Transactions on Intelligent Systems [ISSN:1673-4785/CN:23-1538/TP]

Volume:
11
Issue:
No. 6, 2016
Pages:
719-727
Section:
Publication date:
2017-01-20

Article Info

Title:
Opportunities and challenges of big data intelligence analysis
Author(s):
HUANG Heyan1,2, CAO Zhao1,2, FENG Chong1,2
1. School of Computer Science, Beijing Institute of Technology, Beijing 100081, China;
2. Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China
Keywords:
big data; intelligence analysis; information sciences; opportunities and challenges; cloud computing
CLC number:
TP18
DOI:
10.11992/tis.201610025
Abstract:
In the era of big data, the analysis and processing of intelligence information faces unprecedented opportunities and challenges. This paper first describes the current state of intelligence analysis from the perspective of the development paradigms of information science. Guided by an intelligence processing concept that integrates factual data, tools and methods, and expert wisdom, it then analyzes the application requirements and challenges of big data intelligence analysis in three areas: big data fusion, big data processing technologies and tools, and deep information mining. Finally, following the main stages of the big data intelligence analysis process (data collection, preprocessing, analysis, and application), it looks ahead to the application opportunities and technical trends of big data intelligence analysis.

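Memo

Received: 2016-10-24.
Funding: National Key Research and Development Program of China (2016YFB1000902).
About the authors: HUANG Heyan, female, born in 1963, professor. She serves as vice president of both the Chinese Association for Artificial Intelligence and the Chinese Information Processing Society of China. Her main research interests are machine translation, natural language processing, and social computing. Her honors include a First Prize of the National Science and Technology Progress Award, a First Prize of the Chinese Academy of Sciences Science and Technology Progress Award, and a First Prize of the Beijing Science and Technology Award. She has published many academic papers. CAO Zhao, male, born in 1982, associate research fellow, Ph.D., member of the Database Technical Committee of the China Computer Federation. His main research interests are database management systems, distributed systems, and intelligent information processing. He has published many academic papers. FENG Chong, male, born in 1977, associate research fellow, Ph.D., member of the Social Media Processing Technical Committee and the Language and Knowledge Computing Technical Committee of the Chinese Information Processing Society of China. His main research interests are web information extraction and multilingual machine translation. He has received 3 ministerial-level science and technology awards, published more than 30 academic papers, edited 1 book, and applied for more than 10 patents.
Corresponding author: HUANG Heyan. E-mail: hhy63@bit.edu.cn.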