
[1] ZHAO Qian, LI Min, ZHAO Xiaojie, et al. Learning receptive fields for compact bag-of-feature model[J]. CAAI Transactions on Intelligent Systems, 2016, 11(5): 663-669. [doi:10.11992/tis.201601001]


Learning receptive fields for compact bag-of-feature model

CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]; Volume: 11; Issue: 2016, No. 5; Pages: 663-669; Column: Academic Papers - Natural Language Processing and Understanding; Publication date: 2016-11-01

Author(s): ZHAO Qian, LI Min, ZHAO Xiaojie, CHEN Xueyong; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, Sichuan, China

Keywords: bag-of-features model; receptive field learning; object recognition; image classification; feature learning

CLC number: TP391.4

DOI: 10.11992/tis.201601001

Abstract: This paper studies how receptive field learning affects the bag-of-features model in image recognition tasks. In a bag-of-features model, the receptive field of a feature is determined mainly by the visual words in the visual dictionary and by the regions used during pooling. The visual words (codewords) determine the selectivity of a feature, that is, which image patches it responds to, while the pooling regions determine its locality, that is, its spatial scope. A modified grafting feature selection algorithm is proposed to find the receptive fields that are most effective for a specific recognition task, while also accounting for the redundancy that arises as the number of visual words grows. Through learning, inefficient and redundant visual words and pooling regions are identified and removed from the bag-of-features model, which yields a more compact and more separable image representation for the given classification task. Experiments show that the modified learning algorithm is effective: the learned model is structurally simpler and also achieves somewhat higher recognition accuracy than the baseline method.
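To make the notion of a receptive field as a (visual word, pooling region) pair concrete, the following is a minimal sketch and not the authors' implementation. It assumes hard-assignment coding, max pooling over rectangular regions, and a binary logistic-regression classifier. The selection step follows the general grafting idea of adding the inactive feature with the largest loss-gradient magnitude and then refitting. The correlation threshold `max_corr` used to skip redundant candidates is an assumption made for illustration, and all function and parameter names are hypothetical.

```python
import numpy as np

def encode(patches, codebook):
    """One-hot code each local descriptor by its nearest visual word."""
    # patches: (H, W, D) grid of descriptors; codebook: (K, D) visual words.
    d2 = ((patches[:, :, None, :] - codebook) ** 2).sum(axis=-1)  # (H, W, K)
    return np.eye(len(codebook))[d2.argmin(axis=-1)]              # (H, W, K)

def pool(codes, regions):
    """Max-pool every visual word over every region (r0, r1, c0, c1)."""
    # Feature index region * K + word identifies one receptive field.
    return np.concatenate([codes[r0:r1, c0:c1].max(axis=(0, 1))
                           for r0, r1, c0, c1 in regions])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def select_receptive_fields(X, y, n_select, max_corr=0.95, lr=0.5, steps=200):
    """Greedy grafting-style selection for binary logistic regression.

    X: (N, F) pooled features, y: (N,) labels in {0, 1}.
    """
    n, f = X.shape
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    corr = np.abs(Xs.T @ Xs) / n  # |Pearson correlation| between features
    w, b, active = np.zeros(f), 0.0, []
    for _ in range(n_select):
        # Grafting criterion: loss-gradient magnitude of each candidate weight.
        score = np.abs(X.T @ (sigmoid(X @ w + b) - y)) / n
        if active:
            score[active] = -np.inf
            # Assumed redundancy rule: skip near-duplicates of chosen fields.
            score[corr[:, active].max(axis=1) > max_corr] = -np.inf
        j = int(score.argmax())
        if not np.isfinite(score[j]):
            break  # no admissible candidate is left
        active.append(j)
        for _ in range(steps):  # refit only the active weights
            err = sigmoid(X[:, active] @ w[active] + b) - y
            w[active] -= lr * X[:, active].T @ err / n
            b -= lr * err.mean()
    return active, w, b
```

In this sketch, each image would be turned into one row of `X` by `pool(encode(patches, codebook), regions)`. The indices returned in `active` name the receptive fields to keep, and the remaining visual words and pooling regions could be dropped from the pipeline.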


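Memo

Received: 2016-01-01.
Funding: National Natural Science Foundation of China (61371182).
About the authors: ZHAO Qian, male, born in 1986, Ph.D. candidate. His main research interests are computer vision and neural networks. He has participated in one "863" Program project and one National Natural Science Foundation of China project. LI Min, male, born in 1981, lecturer, Ph.D. His main research interests are bionic robots and exoskeleton robots. He has participated in two "863" Program projects, received one First Prize of the Ministry of Education Technology Invention Award, holds five granted national invention patents, and has published seven academic papers. ZHAO Xiaojie, male, born in 1972, Ph.D. candidate. His main research interests are flight path planning and sensor networks. He has participated in one "973" Program project.
Corresponding author: ZHAO Qian. E-mail: zhokyia@gmail.com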

Last Update: 1900-01-01
