[1] HUANG Jianzhi, DING Chengcheng, TAO Wei, et al. Optimal individual convergence rate of Adam-type algorithms in nonsmooth convex optimization[J]. CAAI Transactions on Intelligent Systems, 2020, 15(6): 1140-1146. [doi: 10.11992/tis.202006046]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 15
Issue: 2020(6)
Pages: 1140-1146
Column: Academic Papers: Machine Learning
Publication date: 2020-11-05
- Title:
Optimal individual convergence rate of Adam-type algorithms in nonsmooth convex optimization
- Author(s):
HUANG Jianzhi1; DING Chengcheng1; TAO Wei2; TAO Qing1
1. Department of Information Engineering, Army Academy of Artillery and Air Defense of PLA, Hefei 230031, China;
2. Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China
- Keywords:
machine learning; AdaGrad algorithm; RMSProp algorithm; momentum methods; Adam algorithm; AMSGrad algorithm; individual convergence rate; sparsity
- CLC:
TP181
- DOI:
10.11992/tis.202006046
- Abstract:
Adam is a popular optimization framework for training deep neural networks; it simultaneously employs adaptive step sizes and momentum to overcome some inherent disadvantages of SGD. However, even for convex optimization problems, Adam has only been shown to attain the same regret bound as gradient descent in the online setting, and the acceleration effect of its momentum term is not revealed. This paper focuses on nonsmooth convex problems. By selecting suitable time-varying step-size and momentum parameters, the improved Adam algorithm is shown to attain the optimal individual convergence rate, which indicates that Adam enjoys the advantages of both adaptation and acceleration. Experiments on hinge-loss problems constrained to the l1-norm ball verify the correctness of the theoretical analysis and the ability of the proposed algorithms to preserve sparsity.
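
The abstract describes an Adam-type method with time-varying step-size and momentum parameters whose last (individual) iterate is analyzed on nonsmooth convex problems such as the hinge loss over an l1-norm ball. The sketch below is only a generic illustration of that kind of projected Adam-style update, not the authors' algorithm or parameter schedule: the choices alpha_t = alpha/sqrt(t) and beta_{1,t} = t/(t+2), the function names, and the data setup are assumptions made for this example.

import numpy as np

def project_l1_ball(v, radius=1.0):
    # Euclidean projection onto the l1 ball of the given radius.
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def hinge_subgradient(w, X, y):
    # Subgradient of the average hinge loss max(0, 1 - y * <w, x>), y in {-1, +1}.
    margins = 1.0 - y * (X @ w)
    active = (margins > 0).astype(float)
    return -(X.T @ (active * y)) / len(y)

def adam_type_l1(X, y, radius=1.0, T=1000, alpha=0.1, beta2=0.999, eps=1e-8):
    # Adam-style update with (assumed) time-varying step size and momentum,
    # projected onto the l1 ball after every step.
    n, d = X.shape
    w = np.zeros(d)
    m = np.zeros(d)   # first-moment (momentum) estimate
    v = np.zeros(d)   # second-moment estimate
    for t in range(1, T + 1):
        g = hinge_subgradient(w, X, y)
        beta1_t = t / (t + 2.0)        # hypothetical time-varying momentum parameter
        alpha_t = alpha / np.sqrt(t)   # hypothetical 1/sqrt(t) step size
        m = beta1_t * m + (1.0 - beta1_t) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        w = project_l1_ball(w - alpha_t * m / (np.sqrt(v) + eps), radius)
    return w  # the individual (last) iterate, not an average

Returning the final iterate rather than an averaged one corresponds to the notion of individual convergence in the abstract, and the projection onto the l1 ball is what keeps the solution sparse under the constraint.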