[1] LI Xue, JIANG Shuqiang. Incremental learning and object recognition system based on intelligent HCI: a survey[J]. CAAI Transactions on Intelligent Systems, 2017, (02): 140-149. [doi:10.11992/tis.201701006]

Incremental learning and object recognition system based on intelligent HCI: a survey

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Issue:
2017, No. 02
Pages:
140-149
Column:
Publication date:
2017-04-25

Article Info

Title:
Incremental learning and object recognition system based on intelligent HCI: a survey
Author(s):
LI Xue (1, 2), JIANG Shuqiang (2)
1. College of Information Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China;
2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Keywords:
artificial intelligence; human-computer interaction; computer vision; object recognition; machine learning; multimodality; robotics; interactive learning
Classification number:
TP391
DOI:
10.11992/tis.201701006
Abstract:
Intelligent HCI systems are a field that studies communication between humans and computers, with the aim of enabling a computer to carry out a user's instructions as fully as possible. The goal of the field is human-computer interaction that is autonomous, safe, and friendly, and incremental learning is one way to reach that goal. This paper briefly introduces the tasks, background, and information sources of intelligent HCI systems, and then surveys existing work on incremental learning. Incremental learning means that a learning system keeps acquiring new knowledge from new samples, much as humans learn. It gives an intelligent HCI system the ability to learn by itself and to improve the interaction experience. The paper describes the basic principles and characteristics of the main incremental learning algorithms, analyzes their respective strengths and weaknesses, and outlines directions for further research.

Memo

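Received: 2017-01-09; Revised:
Funding: National Basic Research Program of China ("973" Program) (2012CB316400).
About the authors: LI Xue, female, born in 1992, is a master's student whose main research interests are intelligent information processing and machine learning. JIANG Shuqiang, male, born in 1977, is a doctoral supervisor whose main research interests are the analysis, understanding, and retrieval of multimedia information such as images and video. He is a senior member of IEEE and CCF. He received support from the 2008 Beijing Nova Program, the 2012 Lu Jiaxi Young Talent Award of the Chinese Academy of Sciences, the 2012 China Computer Federation Science and Technology Award, and the 2013 Chinese Academy of Sciences International Cooperation Award for Young Scientists. He was supported by the 2013 Excellent Young Scientists Fund of the National Natural Science Foundation of China and was selected in 2014 for the Young Top-Notch Talent Program of the Ten Thousand Talents Plan run by the Organization Department of the CPC Central Committee. He has published more than 100 academic papers and holds 10 granted patents.
Corresponding author: JIANG Shuqiang. E-mail: sqjiang@ict.ac.cn.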