[1] HUANG Jianzhi, DING Chengcheng, TAO Wei, et al. Optimal individual convergence rate of Adam-type algorithms in nonsmooth convex optimization[J]. CAAI Transactions on Intelligent Systems, 2020, 15(6): 1140-1146. [doi:10.11992/tis.202006046]

Optimal Individual Convergence Rate of Adam-Type Algorithms in Nonsmooth Convex Optimization
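The article analyzes the convergence of Adam-type methods in the nonsmooth convex setting. As a quick orientation, reference [1] below (Kingma & Ba, 2015) defines the Adam update on which these methods are based; a minimal sketch of a single update step is given here. The function name `adam_step` and the toy quadratic objective are illustrative only and do not come from the article:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2015).

    Keeps exponential moving averages of the gradient (m) and of its
    elementwise square (v), corrects their initialization bias, and
    scales the step coordinate-wise by the second-moment estimate.
    """
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = x^2 (gradient 2x) starting from x = 5.0.
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = adam_step(x, 2 * x, m, v, t, alpha=0.05)
# x should now be close to the minimizer 0.
```

The momentum term `m` is what makes this an "Adam-type" (momentum-plus-adaptive-step) method in the sense of the title; the article's contribution concerns the rate at which the individual iterates `theta`, rather than their averages, converge.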

References:
[1] KINGMA D P, BA J L. Adam: a method for stochastic optimization[C]//Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA, 2015.
[2] DUCHI J, HAZAN E, SINGER Y. Adaptive subgradient methods for online learning and stochastic optimization[J]. The journal of machine learning research, 2011, 12: 2121-2159.
[3] ZINKEVICH M. Online convex programming and generalized infinitesimal gradient ascent[C]//Proceedings of the 20th International Conference on Machine Learning. Washington, USA, 2003: 928-935.
[4] TIELEMAN T, HINTON G. Lecture 6.5-RMSProp: divide the gradient by a running average of its recent magnitude[R]. Toronto: University of Toronto, 2012.
[5] ZEILER M D. ADADELTA: an adaptive learning rate method[EB/OL]. (2012-12-22)[2020-04-20]. https://arxiv.org/abs/1212.5701
[6] POLYAK B T. Some methods of speeding up the convergence of iteration methods[J]. USSR computational mathematics and mathematical physics, 1964, 4(5): 1-17.
[7] NESTEROV Y E. A method of solving a convex programming problem with convergence rate O(1/k²)[J]. Soviet mathematics doklady, 1983, 27(2): 372-376.
[8] GHADIMI E, FEYZMAHDAVIAN H R, JOHANSSON M. Global convergence of the Heavy-ball method for convex optimization[C]//Proceedings of 2015 European Control Conference. Linz, Austria, 2015: 310-315.
[9] SHAMIR O, ZHANG Tong. Stochastic gradient descent for non-smooth optimization: convergence results and optimal averaging schemes[C]//Proceedings of the 30th International Conference on Machine Learning. Atlanta, USA, 2013: I-71-I-79.
[10] TAO Wei, PAN Zhisong, CHU Dejun, et al. The individual convergence of projected subgradient methods using the Nesterov's step-size strategy[J]. Chinese journal of computers, 2018, 41(1): 164-176.
[11] TAO Wei, PAN Zhisong, WU Gaowei, et al. The strength of Nesterov’s extrapolation in the individual convergence of nonsmooth optimization[J]. IEEE transactions on neural networks and learning systems, 2020, 31(7): 2557-2568.
[12] CHENG Yujia, TAO Wei, LIU Yuxiang, et al. Optimal individual convergence rate of the Heavy-ball-based momentum methods[J]. Journal of computer research and development, 2019, 56(8): 1686-1694.
[13] KIROS R, ZEMEL R S, SALAKHUTDINOV R, et al. A multiplicative model for learning distributed text-based attribute representations[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal, Canada, 2014: 2348-2356.
[14] BAHAR P, ALKHOULI T, PETER J T, et al. Empirical investigation of optimization algorithms in neural machine translation[J]. The Prague bulletin of mathematical linguistics, 2017, 108(1): 13-25.
[15] REDDI S J, KALE S, KUMAR S. On the convergence of Adam and beyond[C]//Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada, 2018.
[16] WANG Guanghui, LU Shiyin, TU Weiwei, et al. SAdam: a variant of Adam for strongly convex functions[C]//Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia, 2020.
[17] CHEN Xiangyi, LIU Sijia, SUN Ruoyu, et al. On the convergence of a class of Adam-type algorithms for non-convex optimization[C]//Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA, 2019.
[18] DUCHI J, SHALEV-SHWARTZ S, SINGER Y, et al. Efficient projections onto the l1-ball for learning in high dimensions[C]//Proceedings of the 25th International Conference on Machine Learning. Helsinki, Finland, 2008: 272-279.
[19] AGARWAL A, BARTLETT P L, RAVIKUMAR P, et al. Information-theoretic lower bounds on the oracle complexity of stochastic convex optimization[J]. IEEE transactions on information theory, 2012, 58(5): 3235-3249.
[20] SHALEV-SHWARTZ S, SINGER Y, SREBRO N, et al. Pegasos: primal estimated sub-gradient solver for SVM[J]. Mathematical programming, 2011, 127(1): 3-30.
[21] RAKHLIN A, SHAMIR O, SRIDHARAN K. Making gradient descent optimal for strongly convex stochastic optimization[C]//Proceedings of the 29th International Conference on Machine Learning. Edinburgh, Scotland, 2012: 1571-1578.
Similar References:
[1] YE Zhi-fei, WEN Yi-min, LU Bao-liang. A survey of imbalanced pattern classification problems[J]. CAAI Transactions on Intelligent Systems, 2009, 4(2): 148.
[2] LIU Yi-qun, ZHANG Min, MA Shao-ping. Web key resource page selection based on non-content information[J]. CAAI Transactions on Intelligent Systems, 2007, 2(1): 45.
[3] MA Shi-long, SUI Yue-fei, XU Ke. Limit behavior of prioritized inductive logic programs[J]. CAAI Transactions on Intelligent Systems, 2007, 2(4): 9.
[4] YAO Futian, QIAN Yuntao. Gaussian process and its applications in hyperspectral image classification[J]. CAAI Transactions on Intelligent Systems, 2011, 6(5): 396.
[5] WEN Yimin, QIANG Baohua, FAN Zhigang. A survey of the classification of data streams with concept drift[J]. CAAI Transactions on Intelligent Systems, 2013, 8(2): 95. [doi:10.3969/j.issn.1673-4785.201208012]
[6] YANG Chengdong, DENG Tingquan. An approach to attribute reduction combining attribute selection and deletion[J]. CAAI Transactions on Intelligent Systems, 2013, 8(2): 183. [doi:10.3969/j.issn.1673-4785.201209056]
[7] HU Xiaosheng, ZHONG Yong. Support vector machine imbalanced data classification based on weighted clustering centroid[J]. CAAI Transactions on Intelligent Systems, 2013, 8(3): 261.
[8] DING Ke, TAN Ying. A review on general purpose computing on GPUs and its applications in computational intelligence[J]. CAAI Transactions on Intelligent Systems, 2015, 10(1): 1. [doi:10.3969/j.issn.1673-4785.201403072]
[9] KONG Qingchao, MAO Wenji, ZHANG Yuhao. User comment behavior prediction in social networking sites[J]. CAAI Transactions on Intelligent Systems, 2015, 10(3): 349. [doi:10.3969/j.issn.1673-4785.201403019]
[10] YAO Lin, LIU Yi, LI Xinxin, et al. Chinese named entity recognition via word boundary-based character embedding[J]. CAAI Transactions on Intelligent Systems, 2016, 11(1): 37. [doi:10.11992/tis.201507065]

Memo

Received: 2020-06-28.
Foundation item: National Natural Science Foundation of China (61673394, 62076252).
About the authors: HUANG Jianzhi, master's student; his research focuses on convex optimization algorithms and their applications in machine learning. DING Chengcheng, master's student; his research focuses on convex optimization algorithms and their applications in machine learning. TAO Qing, professor, Ph.D.; his research covers pattern recognition, machine learning, and applied mathematics. He has led projects funded by the National Natural Science Foundation of China and the Natural Science Foundation of Anhui Province, and has published more than 60 academic papers.
Corresponding author: TAO Qing. E-mail: qing.tao@ia.ac.cn.

Last Update: 2020-12-25
Copyright © Editorial Office of CAAI Transactions on Intelligent Systems