HUANG Heyan, CAO Zhao, FENG Chong. Opportunities and challenges of big data intelligence analysis[J]. CAAI Transactions on Intelligent Systems, 2016, 11(6): 719-727. [doi:10.11992/tis.201610025]





Opportunities and challenges of big data intelligence analysis
HUANG Heyan (黄河燕)1,2, CAO Zhao (曹朝)1,2, FENG Chong (冯冲)1,2
1. School of Computer Science, Beijing Institute of Technology, Beijing 100081, China;
2. Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing 100081, China
big data; intelligence analysis; information sciences; opportunities and challenges; cloud computing
In the era of big data, information and intelligence analysis faces unprecedented opportunities and challenges. This paper describes the status of intelligence analysis from the perspective of the development paradigm of information science. Guided by an information-processing concept that integrates factual data, tools, methods, and expert wisdom, it analyzes the application requirements and challenges of big data intelligence analysis in terms of big data integration, big data processing technology and tools, and deep information mining. Finally, taking data collection, pre-processing, analysis, and application as the main components of the big data intelligence analysis process, it forecasts the application development opportunities and technical trends of big data intelligence analysis.





