YIN Rui, SU Songzhi, LI Shaozi. Convolutional neural network’s image moment regularizing strategy[J]. CAAI Transactions on Intelligent Systems, 2016, 11(1): 43-48. [doi:10.11992/tis.201509018]

Convolutional neural network’s image moment regularizing strategy

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Vol. 11
Issue:
2016, No. 1
Pages:
43-48
Publication date:
2016-02-25

Article Information

Title:
Convolutional neural network’s image moment regularizing strategy
Author(s):
YIN Rui1,2, SU Songzhi1,2, LI Shaozi1,2
1. School of Information Science and Technology, Xiamen University, Xiamen 361005, China;
2. Fujian Key Laboratory of the Brain-Like Intelligent System, Xiamen University, Xiamen 361005, China
Keywords:
central moment; random selection; pooling; convolutional neural network; overfitting
CLC number:
TP391.4
DOI:
10.11992/tis.201509018
Abstract:
There are two common pooling strategies for convolutional neural networks (CNNs): max pooling and average pooling. Max pooling simply selects the maximum element of the pooling region, which makes it prone to overfitting; average pooling assigns the same weight to every element, which suppresses high-frequency components. In this study, we propose moment pooling as a regularization strategy for CNNs. Moment pooling introduces geometric moments into the CNN pooling step: it first computes the central moment of the pooling region, then randomly selects a response value from the four integer neighbors of the central moment, with selection probabilities given by an interpolation-like scheme. Experiments on the MNIST, CIFAR10, and CIFAR100 datasets show that, as the number of training iterations grows, moment pooling achieves the lowest training and test errors. Its strong discrimination capability and robustness yield better generalization than max and average pooling.
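The pooling step described in the abstract can be illustrated with a minimal sketch for a single pooling window. This is not the authors' implementation: the exact moment definition and normalization are not given in the abstract, so here the window centroid is computed from first-order geometric moments (M10/M00, M01/M00), with activation magnitudes used as the moment weights (an assumption), and one of the centroid's four integer neighbors is drawn with bilinear-interpolation weights as probabilities.

```python
import numpy as np

def moment_pool(region, rng=None):
    """Sketch of moment pooling for one pooling window.

    Computes the intensity centroid of the window from first-order
    geometric moments, then randomly selects one of the centroid's
    four integer neighbours, using the bilinear-interpolation weights
    of the centroid as selection probabilities.
    """
    rng = rng or np.random.default_rng()
    region = np.asarray(region, dtype=float)
    h, w = region.shape
    weights = np.abs(region)          # assumption: weight positions by activation magnitude
    m00 = weights.sum()
    if m00 == 0:                      # degenerate all-zero window: fall back to the centre
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    else:
        ys, xs = np.mgrid[0:h, 0:w]
        cy = (weights * ys).sum() / m00   # centroid row  = M01/M00
        cx = (weights * xs).sum() / m00   # centroid col  = M10/M00
    y0, x0 = int(np.floor(cy)), int(np.floor(cx))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = cy - y0, cx - x0
    # bilinear weights of the four neighbours act as selection probabilities
    probs = np.array([(1 - fy) * (1 - fx), (1 - fy) * fx,
                      fy * (1 - fx),       fy * fx])
    neighbours = [(y0, x0), (y0, x1), (y1, x0), (y1, x1)]
    idx = rng.choice(4, p=probs / probs.sum())
    return region[neighbours[idx]]
```

The random neighbor selection is what gives the method its regularizing effect at training time: unlike max pooling, the chosen response varies across iterations, while the centroid keeps the selection concentrated where the activation mass lies.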

References:

[1] MONTAVON G, ORR G, MÜLLER K R. Neural networks:tricks of the trade[M]. 2nd ed. Berlin Heidelberg:Springer, 2012.
[2] HINTON G E, SRIVASTAVA N, KRIZHEVSKY A, et al. Improving neural networks by preventing co-adaptation of feature detectors[EB/OL]. [2012-07-03]. http://arxiv.org/pdf/1207.0580.pdf.
[3] NAIR V, HINTON G E. Rectified linear units improve restricted boltzmann machines[C]//Proceedings of the 27th International Conference on Machine Learning. Haifa, Israel, 2010.
[4] RANZATO M, BOUREAU Y L, LECUN Y. Sparse feature learning for deep belief networks[C]//Proceedings of Advances in Neural Information Processing Systems (NIPS). Cambridge, MA, 2007.
[5] LECUN Y, BOSER B E, DENKER J S, et al. Handwritten digit recognition with a back-propagation network[C]//Proceedings of Advances in Neural Information Processing Systems (NIPS). Cambridge, MA, 1989.
[6] HU M K. Visual pattern recognition by moment invariants[J]. IRE Transactions on Information Theory, 1962, 8(2):179-187.
[7] ROSIN P L. Measuring corner properties[J]. Computer vision and image understanding, 1999, 73(2):291-307.
[8] RUBLEE E, RABAUD V, KONOLIGE K, et al. ORB:an efficient alternative to SIFT or SURF[C]//Proceedings of IEEE International Conference on Computer Vision (ICCV). Barcelona, 2011:2564-2571.
[9] EVANS O D, KIM Y. Efficient implementation of image warping on a multimedia processor[J]. Real-Time Imaging, 1998, 4(6): 417-428.
[10] GONZALEZ R C, WOODS R E. Digital image processing[M]. 2nd ed. New Jersey: Prentice-Hall, 2002.
[11] JIA Y, SHELHAMER E, DONAHUE J, et al. Caffe: convolutional architecture for fast feature embedding[C]//Proceedings of the ACM International Conference on Multimedia. ACM, 2014: 675-678.
[12] BOTTOU L. Stochastic gradient descent tricks[M]//MONTAVON G, ORR G B, MÜLLER K R. Neural Networks:Tricks of the Trade. 2nd ed. Berlin Heidelberg:Springer, 2012:421-436.
[13] KRIZHEVSKY A. The CIFAR-10 and CIFAR-100 datasets[EB/OL]. http://www.cs.toronto.edu/~kriz/cifar.html.
[14] LECUN Y, CORTES C, BURGES C J C. The MNIST database of handwritten digits[EB/OL]. http://yann.lecun.com/exdb/mnist/.
[15] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.

Memo:
Received: 2015-09-16.
Foundation item: National Natural Science Foundation of China (61202143, 61572409); Natural Science Foundation of Fujian Province (2013J05100).
About the authors: YIN Rui, born in 1993, master's student; her research interests include image feature representation, computer vision, and deep learning. SU Songzhi, born in 1982, lecturer, Ph.D.; his research interests include pedestrian detection and human behavior analysis. LI Shaozi, born in 1963, professor and doctoral supervisor, vice chairman of the Fujian Association for Artificial Intelligence; his research interests include artificial intelligence and its applications, computer vision and machine learning, and moving-object detection and recognition. He has led a number of national, provincial, and municipal research projects, won two provincial science and technology awards (third prize), and published more than 200 papers, of which 27 are indexed by SCI and 171 by EI.
Corresponding author: LI Shaozi. E-mail: szlig@xmu.edu.cn.