We consider a class of distributed non-convex optimization problems in which a number of agents, connected by a communication network, collectively optimize a sum of (possibly non-convex and non-smooth) local objective functions. This type of problem has recently gained popularity, especially in the distributed training of deep neural networks.
We first address the following general question: what is the fastest convergence rate that any properly defined distributed algorithm can achieve, and how can that rate be achieved? In particular, we consider a class of unconstrained non-convex problems and allow the agents to access local first-order gradient information. We develop a lower bound analysis that identifies difficult problem instances for any first-order method. Further, we develop a rate-optimal method whose rate matches our derived lower bound (up to a polylog factor). The algorithm combines ideas from distributed consensus, nonlinear optimization, and classical signal processing techniques. Second, we present some recent extensions of the above work, including how to achieve optimal sample complexity and how to compute high-order stationary solutions efficiently. Finally, we provide some applications in distributed training of neural networks, as well as in distributed control of wind farms, and discuss a number of open questions in the area.
Mingyi Hong is an Assistant Professor in the Department of Electrical and Computer Engineering, University of Minnesota. He serves on the IEEE Signal Processing for Communications and Networking (SPCOM) and Machine Learning for Signal Processing (MLSP) Technical Committees. He has coauthored works that were selected as finalists for the Best Paper Prize for Young Researchers in Continuous Optimization in 2013 and 2016, and a work that won a best student paper award at the 2018 Asilomar Conference on Signals, Systems and Computers (as a senior author). His research interests are primarily in optimization theory and its applications in signal processing and machine learning.