[1]DOU Yonggan,YUAN Xiaotong.Federated learning with implicit stochastic gradient descent optimization[J].CAAI Transactions on Intelligent Systems,2022,17(3):488-495.[doi:10.11992/tis.202106029]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 17
Issue: 3, 2022
Pages: 488-495
Column: Academic Papers - Machine Learning
Publication date: 2022-05-05
Title: Federated learning with implicit stochastic gradient descent optimization
Author(s): DOU Yonggan1,2; YUAN Xiaotong1,2
1. School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China;
2. Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing 210044, China
Keywords: federated learning; distributed machine learning; central server; global model; implicit stochastic gradient descent; statistical heterogeneity; systems heterogeneity; optimization algorithm; faster convergence
CLC: TP8
DOI: 10.11992/tis.202106029
Abstract:
Federated learning is a distributed machine learning paradigm in which a central server trains an optimal global model by collaborating with numerous remote devices. Federated learning currently faces two key challenges: systems heterogeneity and statistical heterogeneity. Here, we focus mainly on the slow convergence, or even failure to converge, of the global model caused by these two kinds of heterogeneity. We propose a federated learning optimization algorithm based on implicit stochastic gradient descent, which differs from the traditional update scheme in federated learning. We use the locally uploaded model parameters to approximate the average global gradient, thereby avoiding explicitly solving the first-order equation, and update the global model parameters via gradient descent. In this way, the global model achieves faster and more stable convergence with fewer communication rounds. In the experiments, settings with different levels of heterogeneity were simulated. The proposed algorithm shows considerably faster and more stable convergence behavior than FedAvg and FedProx. Under the premise of reaching the same convergence result, the experimental results show that the proposed method reduces the number of communication rounds by approximately 50% compared with FedProx on highly heterogeneous synthetic datasets. This considerably improves the stability and robustness of federated learning.
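The server-side step described in the abstract, approximating the average global gradient from the locally uploaded model parameters and then updating the global model by gradient descent, can be sketched as follows. This is a minimal illustrative sketch and not the authors' exact algorithm: the least-squares local solver, the learning rates, and the displacement-based gradient approximation are assumptions introduced for illustration only.

```python
import numpy as np

def local_update(w_global, X, y, lr=0.1, epochs=5):
    """Stand-in local solver: plain gradient descent on a least-squares objective."""
    w = w_global.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)   # gradient of 0.5 * ||Xw - y||^2 / n
        w -= lr * grad
    return w

def server_round(w_global, client_data, local_lr=0.1, local_epochs=5, global_lr=1.0):
    """One communication round: approximate the average global gradient from the
    uploaded local models, then take a gradient-descent step on the global model."""
    local_models = [local_update(w_global, X, y, local_lr, local_epochs)
                    for X, y in client_data]
    # Approximate the average global gradient from the displacement of the local
    # models (no additional first-order computation on the server side).
    approx_grad = (w_global - np.mean(local_models, axis=0)) / (local_lr * local_epochs)
    return w_global - global_lr * approx_grad

# Toy usage: a few synthetic clients with heterogeneous linear-regression data.
rng = np.random.default_rng(0)
dim, n_clients = 5, 4
w_true = rng.normal(size=dim)
client_data = []
for _ in range(n_clients):
    X = rng.normal(size=(50, dim)) + rng.normal(scale=0.5, size=dim)  # client-specific feature shift
    y = X @ w_true + rng.normal(scale=0.1, size=50)
    client_data.append((X, y))

w = np.zeros(dim)
for _ in range(20):
    w = server_round(w, client_data)
print("distance to w_true:", np.linalg.norm(w - w_true))
```

In this sketch the server never computes gradients of the clients' objectives itself; it only reuses the uploaded parameters, which mirrors the idea of avoiding an explicit first-order solve at the aggregation step.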