基于梯度跟踪和分布式重球加速的分布式随机优化算法

孙碧皓; 胡锦辉; 夏大文; 李华青

doi:10.1631/FITEE.2000615

Your Location：

Home >

Browse articles >

基于梯度跟踪和分布式重球加速的分布式随机优化算法

常规文章 | Updated：2022-01-05

- 基于梯度跟踪和分布式重球加速的分布式随机优化算法
  Enhanced Publication
- A distributed stochastic optimization algorithm with gradient-tracking and distributed heavy-ball acceleration
- 信息与电子工程前沿（英文） 2021年22卷第11期页码：1463-1476
- Affiliations：
  
  Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
  College of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025, China
- Author bio：
  
  [ "", "Huaqing LI received his BS degree in information and computing science in 2009 from Chongqing University of Posts and Telecommunications, Chongqing, China and his PhD degree in computer science and technology in 2013 from Chongqing University. From Sept. 2014 to Sept. 2015, he was a postdoctoral researcher at the School of Electrical and Information Engineering, The University of Sydney, Australia. From Nov. 2015 to Nov. 2016, he was a postdoctoral researcher at the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore. He is currently a professor at the College of Electronic and Information Engineering, Southwest University, Chongqing, China. His main research interests include nonlinear dynamics and control, multi-agent system, and distributed optimization. Prof. LI currently serves as a regional editor for Neur Comput Appl, an editorial board member for IEEE Access, and a corresponding expert for Front Inform Technol Electron Eng" ]
- Funds：
  
  Project supported by the Open Research Fund Program of Data Recovery Key Laboratory of Sichuan Province, China (No. DRN2001), the National Natural Science Foundation of China (Nos. 61773321 and 61762020), the Science and Technology Top-Notch Talents Support Project of Colleges and Universities in Guizhou Province, China (No. QJHKY2016065), the Science and Technology Foundation of Guizhou Province, China (No. QKHJC20181083), and the Science and Technology Talents Fund for Excellent Young of Guizhou Province, China (No. QKHPTRC20195669)
- DOI：10.1631/FITEE.2000615
  中图分类号： TP14
- 纸质出版日期：2021-11，
  
  网络出版日期：2021-07-24，
  
  收稿日期：2020-11-08，
  
  修回日期：2021-04-01，
  
  录用日期：2021-02-15
- Accepted：
Scan QR Code
孙碧皓, 胡锦辉, 夏大文, 等. 基于梯度跟踪和分布式重球加速的分布式随机优化算法[J]. 信息与电子工程前沿（英文）, 2021,22(11):1463-1476.

BIHAO SUN, JINHUI HU, DAWEN XIA, et al. A distributed stochastic optimization algorithm with gradient-tracking and distributed heavy-ball acceleration. [J]. Frontiers of information technology & electronic engineering, 2021, 22(11): 1463-1476.
孙碧皓, 胡锦辉, 夏大文, 等. 基于梯度跟踪和分布式重球加速的分布式随机优化算法[J]. 信息与电子工程前沿（英文）, 2021,22(11):1463-1476. DOI： 10.1631/FITEE.2000615.

BIHAO SUN, JINHUI HU, DAWEN XIA, et al. A distributed stochastic optimization algorithm with gradient-tracking and distributed heavy-ball acceleration. [J]. Frontiers of information technology & electronic engineering, 2021, 22(11): 1463-1476. DOI： 10.1631/FITEE.2000615.

摘要

由于在机器学习和信号处理中的广泛应用，近年来分布式优化得到良好发展。本文致力于研究分布式优化以求解目标函数全局最小值。该目标是分布在个节点的无向网络上的平滑且强凸的局部成本函数总和。与已有工作不同的是，我们使用分布式重球项以提高算法的收敛性能。为使现有分布式随机一阶梯度算法的收敛加速，将动量项与梯度跟踪技术结合。仿真结果表明，在不增加复杂度的情况下，所提算法具有比GT-SAGA更高收敛速率。在真实数据集上的数值实验证明了该算法的有效性和正确性。

Abstract

Distributed optimization has been well developed in recent years due to its wide applications in machine learning and signal processing. In this paper

we focus on investigating distributed optimization to minimize a global objective. The objective is a sum of smooth and strongly convex local cost functions which are distributed over an undirected network of

$$n$$

nodes. In contrast to existing works

we apply a distributed heavy-ball term to improve the convergence performance of the proposed algorithm. To accelerate the convergence of existing distributed stochastic first-order gradient methods

a momentum term is combined with a gradient-tracking technique. It is shown that the proposed algorithm has better acceleration ability than GT-SAGA without increasing the complexity. Extensive experiments on real-world datasets verify the effectiveness and correctness of the proposed algorithm.

关键词

分布式优化高性能算法多智能体系统机器学习问题随机梯度

Keywords

Distributed optimizationHigh-performance algorithmMulti-agent systemMachine-learning problemStochastic gradient

references

D Bertsekas, , , E Gafni. . Projected Newton methods and optimization of multicommodity flows. . IEEE Trans Autom Contr, , 1983. . 28((12):):1090--1096. . DOI:10.1109/TAC.1983.1103183http://doi.org/10.1109/TAC.1983.1103183..

S Boyd, , , N Parikh, , , E Chu, , , 等. . Distributed optimization and statistical learning via the alternating direction method of multipliers. . Found Trends® Mach Learn, , 2011. . 3((1):):1--122. . DOI:10.1561/2200000016http://doi.org/10.1561/2200000016..

B Cheng, , , ZK Li. . Coordinated tracking control with asynchronous edge-based event-triggered communications. . IEEE Trans Autom Contr, , 2019. . 64((10):):4321--4328. . DOI:10.1109/TAC.2019.2895927http://doi.org/10.1109/TAC.2019.2895927..

S Cheng, , , MY Chen, , , RJ Wai, , , 等. . Optimal placement of distributed generation units in distribution systems via an enhanced multi-objective particle swarm optimization algorithm. . J Zhejiang Univ-Sci C (Comput & Electron), , 2014. . 15((4):):300--311. . DOI:10.1631/jzus.C1300250http://doi.org/10.1631/jzus.C1300250..

K Cohen, , , A Nedić, , , R Srikant. . Distributed learning algorithms for spectrum sharing in spatial random access wireless networks. . IEEE Trans Autom Contr, , 2017. . 62((6):):2854--2869. . DOI:10.1109/TAC.2016.2626578http://doi.org/10.1109/TAC.2016.2626578..

A Defazio, , , F Bach, , , S Lacoste-Julien. . SAGA: a fast incremental gradient method with support for non-strongly convex composite objectives. . Proc 27th Int Conf on Neural Information Processing Systems, , 2014. . p. 1646--1654. . ..

D Dua, , , C Graff. . UCI Machine Learning Repository, , 2017. . http://archive.ics.uci.edu/mlhttp://archive.ics.uci.edu/ml, , ..

JC Duchi, , , A Agarwal, , , MJ Wainwright. . Dual averaging for distributed optimization: convergence analysis and network scaling. . IEEE Trans Autom Contr, , 2012. . 57((3):):592--606. . DOI:10.1109/TAC.2011.2161027http://doi.org/10.1109/TAC.2011.2161027..

M Eisen, , , A Mokhtari, , , A Ribeiro. . Decentralized quasi-Newton methods. . IEEE Trans Signal Process, , 2017. . 65((10):):2613--2628. . DOI:10.1109/TSP.2017.2666776http://doi.org/10.1109/TSP.2017.2666776..

T Erseghe, , , D Zennaro, , , E Dall'Anese, , , 等. . Fast consensus by the alternating direction multipliers method. . IEEE Trans Signal Process, , 2011. . 59((11):):5523--5537. . DOI:10.1109/TSP.2011.2162831http://doi.org/10.1109/TSP.2011.2162831..

L Guan, , , T Sun, , , LB Qiao, , , 等. . An efficient parallel and distributed solution to nonconvex penalized linear SVMs. . Front Inform Technol Electron Eng, , 2020. . 21((4):):587--603. . DOI:10.1631/FITEE.1800566http://doi.org/10.1631/FITEE.1800566..

ZM Han, , , ZY Lin, , , MY Fu, , , 等. . Distributed coordination in multi-agent systems: a graph Laplacian perspective. . Front Inform Technol Electron Eng, , 2015. . 16((6):):429--448. . DOI:10.1631/FITEE.1500118http://doi.org/10.1631/FITEE.1500118..

JH Hu, , , Y Yan, , , HQ Li, , , 等. . Convergence of an accelerated distributed optimisation algorithm over time-varying directed networks. . IET Contr Theory Appl, , 2021. . 15((1):):24--39. . DOI:10.1049/cth2.12022http://doi.org/10.1049/cth2.12022..

R Johnson, , , T Zhang. . Accelerating stochastic gradient descent using predictive variance reduction. . Proc 26th Int Conf on Neural Information Processing Systems, , 2013. . p. 315--323. . ..

Q Lan, , , LB Qiao, , , YJ Wang. . Stochastic extra-gradient based alternating direction methods for graph-guided regularized minimization. . Front Inform Technol Electron Eng, , 2018. . 19((6):):755--762. . DOI:10.1631/FITEE.1601771http://doi.org/10.1631/FITEE.1601771..

HQ Li, , , HQ Cheng, , , Z Wang, , , 等. . Distributed Nesterov gradient and heavy-ball double accelerated asynchronous optimization. . IEEE Trans Neur Netw Learn Syst, in press, , 2020. . DOI:10.1109/TNNLS.2020.3027381http://doi.org/10.1109/TNNLS.2020.3027381..

Z Li, , , W Shi, , , M Yan. . A decentralized proximal-gradient method with network independent step-sizes and separated convergence rates. . IEEE Trans Signal Process, , 2019. . 67((17):):4494--4506. . DOI:10.1109/TSP.2019.2926022http://doi.org/10.1109/TSP.2019.2926022..

Q Ling, , , A Ribeiro. . Decentralized dynamic optimization through the alternating direction method of multipliers. . IEEE Trans Signal Process, , 2014. . 62((5):):1185--1197. . DOI:10.1109/TSP.2013.2295055http://doi.org/10.1109/TSP.2013.2295055..

Q Ling, , , Z Tian. . Decentralized sparse signal recovery for compressive sleeping wireless sensor networks. . IEEE Trans Signal Process, , 2010. . 58((7):):3816--3827. . DOI:10.1109/TSP.2010.2047721http://doi.org/10.1109/TSP.2010.2047721..

R Liu, , , WC Sun, , , T Hou, , , 等. . Block coordinate descentwith time perturbation for nonconvex nonsmooth problems in real-world studies. . Front Inform Technol Electron Eng, , 2019. . 20((10):):1390--1403. . DOI:10.1631/FITEE.1900341http://doi.org/10.1631/FITEE.1900341..

QG Lü, , , XF Liao, , , HQ Li, , , 等. . A Nesterov-like gradient tracking algorithm for distributed optimization over directed networks. . IEEE Trans Syst Man Cybern, in press, , 2020. . DOI:10.1109/TSMC.2019.2960770http://doi.org/10.1109/TSMC.2019.2960770..

G Mateos, , , JA Bazerque, , , GB Giannakis. . Distributed sparse linear regression. . IEEE Trans Signal Process, , 2010. . 58((10):):5262--5276. . DOI:10.1109/TSP.2010.2055862http://doi.org/10.1109/TSP.2010.2055862..

TP Matthews, , , K Wang, , , CP Li, , , 等. . Nonlinear waveform inversion by use of the regularized dual averaging method for ultrasound computed tomography. . Progress in Electromagnetic Research Symp, , 2016. . p. 3948DOI:10.1109/PIERS.2016.7735487http://doi.org/10.1109/PIERS.2016.7735487..

B McMahan, , , E Moore, , , D Ramage, , , 等. . Communication-efficient learning of deep networks from decentralized data. . Proc 20th Int Conf on Artificial Intelligence and Statistics, , 2017. . p. 1273--1282. . ..

A Nedić, , , A Ozdaglar. . Distributed subgradient methods for multi-agent optimization. . IEEE Trans Autom Contr, , 2009. . 54((1):):48--61. . DOI:10.1109/TAC.2008.2009515http://doi.org/10.1109/TAC.2008.2009515..

A Nedić, , , A Olshevsky, , , W Shi. . Achieving geometric convergence for distributed optimization over time-varying graphs. . SIAM J Optim, , 2017a. . 27((4):):2597--2633. . DOI:10.1137/16M1084316http://doi.org/10.1137/16M1084316..

A Nedić, , , A Olshevsky, , , W Shi, , , 等. . Geometrically convergent distributed optimization with uncoordinated step-sizes. . American Control Conf, , 2017b. . p. 3950--3955. . DOI:10.23919/ACC.2017.7963560http://doi.org/10.23919/ACC.2017.7963560..

M Schmidt, , , N Le Roux, , , F Bach. . Minimizing finite sums with the stochastic average gradient. . Math Program, , 2017. . 162((1-2):):83--112. . DOI:10.1007/s10107-016-1030-6http://doi.org/10.1007/s10107-016-1030-6..

CH Tan, , , SQ Ma, , , YH Dai, , , 等. . Barzilai-Borwein step size for stochastic gradient descent. . Proc 30th Int Conf on Neural Information Processing Systems, , 2016. . p. 685--693. . ..

J Tsitsiklis, , , D Bertsekas, , , M Athans. . Distributed asynchronous deterministic and stochastic gradient optimization algorithms. . IEEE Trans Autom Contr, , 1986. . 31((9):):803--812. . DOI:10.1109/TAC.1986.1104412http://doi.org/10.1109/TAC.1986.1104412..

B Wang, , , HY Jiang, , , J Fang, , , 等. . A proximal ADMM for decentralized composite optimization. . IEEE Signal Process Lett, , 2018. . 25((8):):1121--1125. . DOI:10.1109/LSP.2018.2841648http://doi.org/10.1109/LSP.2018.2841648..

Z Wang, , , HQ Li. . Edge-based stochastic gradient algorithm for distributed optimization. . IEEE Trans Netw Sci Eng, , 2020. . 7((3):):1421--1430. . DOI:10.1109/TNSE.2019.2933177http://doi.org/10.1109/TNSE.2019.2933177..

EM Wei, , , A Ozdaglar, , , A Jadbabaie. . A distributed Newton method for network utility maximization—I: algorithm. . IEEE Trans Autom Contr, , 2013. . 58((9):):2162--2175. . DOI:10.1109/TAC.2013.2253218http://doi.org/10.1109/TAC.2013.2253218..

CG Xi, , , UA Khan. . DEXTRA: a fast algorithm for optimization over directed graphs. . IEEE Trans Autom Contr, , 2017. . 62((10):):4980--4993. . DOI:10.1109/TAC.2017.2672698http://doi.org/10.1109/TAC.2017.2672698..

YS Xia, , , J Wang. . A one-layer recurrent neural network for support vector machine learning. . IEEE Trans Syst Man Cybern B, , 2004. . 34((2):):1261--1269. . DOI:10.1109/TSMCB.2003.822955http://doi.org/10.1109/TSMCB.2003.822955..

R Xin, , , UA Khan. . A linear algorithm for optimization over directed graphs with geometric convergence. . IEEE Contr Syst Lett, , 2018. . 2((3):):315--320. . DOI:10.1109/LCSYS.2018.2834316http://doi.org/10.1109/LCSYS.2018.2834316..

R Xin, , , UA Khan. . Distributed heavy-ball: a generalization and acceleration of first-order methods with gradient tracking. . IEEE Trans Autom Contr, , 2020. . 65((6):):2627--2633. . DOI:10.1109/TAC.2019.2942513http://doi.org/10.1109/TAC.2019.2942513..

R Xin, , , D Jakovetić, , , UA Khan. . Distributed Nesterov gradient methods over arbitrary graphs. . IEEE Signal Process Lett, , 2019a. . 26((8):):1247--1251. . DOI:10.1109/LSP.2019.2925537http://doi.org/10.1109/LSP.2019.2925537..

R Xin, , , AK Sahu, , , UA Khan, , , 等. . Distributed stochastic optimization with gradient tracking over strongly-connected networks. . Proc IEEE 58th Conf on Decision and Control, , 2019b. . p. 8353--8358. . DOI:10.1109/CDC40024.2019.9029217http://doi.org/10.1109/CDC40024.2019.9029217..

R Xin, , , CG Xi, , , UA Khan. . FROST—fast row-stochastic optimization with uncoordinated step-sizes. . EURASIP J Adv Signal Process, , 2019c. . 2019((1):):1DOI:10.1186/s13634-018-0596-yhttp://doi.org/10.1186/s13634-018-0596-y..

R Xin, , , UA Khan, , , S Kar. . Variance-reduced decentralized stochastic optimization with accelerated convergence. . IEEE Trans Signal Process, , 2020. . 686255--6271. . DOI:10.1109/TSP.2020.3031071http://doi.org/10.1109/TSP.2020.3031071..

JM Xu, , , SY Zhu, , , YC Soh, , , 等. . Augmented distributed gradient methods for multi-agent optimization under uncoordinated constant stepsizes. . Proc 54th IEEE Conf on Decision and Control, , 2015. . p. 2055--2060. . DOI:10.1109/CDC.2015.7402509http://doi.org/10.1109/CDC.2015.7402509..

R Yin, , , Y Zhang, , , GD Yu, , , 等. . Centralized and distributed resource allocation in OFDM based multi-relay system. . J Zhejiang Univ-Sci C (Comput & Electron), , 2010. . 11((6):):450--464. . DOI:10.1631/jzus.C0910405http://doi.org/10.1631/jzus.C0910405..

DM Yuan, , , Q Ma, , , Z Wang. . Distributed dual averaging method for solving saddle-point problems over multi-agent networks. . Proc 32nd Chinese Control Conf, , 2013. . p. 6868--6872. . ..

CL Zhang, , , M Ahmad, , , YQ Wang. . ADMM based privacy-preserving decentralized optimization. . IEEE Trans Inform Forens Secur, , 2019. . 14((3):):565--580. . DOI:10.1109/TIFS.2018.2855169http://doi.org/10.1109/TIFS.2018.2855169..

MA Zinkevich, , , M Weimer, , , A Smola, , , 等. . Parallelized stochastic gradient descent. . Proc 23rd Int Conf on Neural Information Processing Systems, , 2010. . p. 2595--2603. . ..

浏览量

Downloads

CSCD

文章被引用时，请邮件提醒。

Submit

工具集

关联资源

Event-triggered distributed optimization for model-free multi-agent systems

Matrix-valued distributed stochastic optimization with constraints

Distributed optimization based on improved push-sum framework for optimization problem with multiple local constraints and its application in smart grid

Cooperative planning of multi-agent systems based on task-oriented knowledge fusion with graph neural networks

Coach-assisted multi-agent reinforcement learning framework for unexpected crashed agents