Abstract
In this paper, we provide a unified iteration complexity analysis for a family of general block coordinate descent methods, covering popular methods such as the block coordinate gradient descent and the block coordinate proximal gradient, under various coordinate update rules. We unify these algorithms under the so-called block successive upper-bound minimization (BSUM) framework, and show that for a broad class of multi-block nonsmooth convex problems, all algorithms covered by the BSUM framework achieve a global sublinear iteration complexity of \(\mathcal{O}(1/r)\), where \(r\) is the iteration index. Moreover, for the case of block coordinate minimization where each block is minimized exactly, we establish the sublinear convergence rate of \(\mathcal{O}(1/r)\) without the per-block strong convexity assumption.
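To make the algorithm family concrete, the following is a minimal illustrative sketch (not the paper's exact BSUM algorithm or notation) of a cyclic block coordinate proximal gradient method applied to an \(\ell_1\)-regularized least-squares problem; the block partition, step sizes, and regularizer are assumptions chosen only for illustration.

```python
import numpy as np

# Sketch of cyclic block coordinate proximal gradient for
#   min_x 0.5*||Ax - b||^2 + lam*||x||_1,
# with x split into K blocks updated in a Gauss-Seidel (cyclic) order.
# Each block update minimizes a quadratic upper bound of the smooth part
# plus the nonsmooth term restricted to that block.

def soft_threshold(v, t):
    """Proximal operator of t*||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def cyclic_bcpg(A, b, lam, K, iters=200):
    n = A.shape[1]
    blocks = np.array_split(np.arange(n), K)      # fixed cyclic block partition
    x = np.zeros(n)
    for _ in range(iters):                        # r = iteration index
        for blk in blocks:
            A_k = A[:, blk]
            L_k = np.linalg.norm(A_k, 2) ** 2     # block Lipschitz constant of the partial gradient
            grad_k = A_k.T @ (A @ x - b)          # partial gradient w.r.t. block k
            x[blk] = soft_threshold(x[blk] - grad_k / L_k, lam / L_k)
    return x

# Example usage on random data
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 60))
b = rng.standard_normal(40)
x_hat = cyclic_bcpg(A, b, lam=0.1, K=6)
```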
Notes
We have used the following abbreviations: NS = Nonsmooth, C = Constrained, K = K-block, BSC = Block-wise Strongly Convex, G–So = Gauss–Southwell, G–S = Gauss–Seidel, E-C = Essentially Cyclic, MBI = Maximum Block Improvement. The notion of a valid upper bound, as well as the function \(u_k\), will be introduced in Sect. 2.
It appears that the proof in [1, Theorem 4.1] can be modified to allow nonsmooth h, although this is not explicitly mentioned in that paper. As it stands, the bound in [1, Theorem 4.1] depends explicitly on the Lipschitz constant of the gradient of h, whereas the bound derived here in (4.17) does not.
References
Beck, A.: On the convergence of alternating minimization with applications to iteratively reweighted least squares and decomposition schemes. SIAM J. Optim. 25(1), 185–209 (2015)
Beck, A., Tetruashvili, L.: On the convergence of block coordinate descent type methods. SIAM J. Optim. 23(4), 2037–2060 (2013)
Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1999)
Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific, Belmont (1996)
Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods, 2nd edn. Athena Scientific, Belmont (1997)
Chen, B., He, S., Li, Z., Zhang, S.: Maximum block improvement and polynomial optimization. SIAM J. Optim. 22(1), 87–107 (2012)
Combettes, P., Pesquet, J.-C.: Proximal splitting methods in signal processing. In: Bauschke, H.H., Burachik, R., Combettes, P.L., Elser, V., Luke, D.R., Wolkowicz, H. (eds.) Fixed-Point Algorithms for Inverse Problems in Science and Engineering, Springer Optimization and Its Applications, pp. 185–212. Springer, New York (2011)
Cover, T.M., Thomas, J.A.: Elements of Information Theory, 2nd edn. Wiley, Hoboken (2005)
Daubechies, I., DeVore, R., Fornasier, M., Gunturk, C.S.: Iteratively reweighted least squares minimization for sparse recovery. Commun. Pure Appl. Math. 63(1), 1–38 (2010)
Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)
Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26, 127–136 (2000)
He, B., Liao, L., Han, D., Yang, H.: A new inexact alternating directions method for monotone variational inequalities. Math. Program. 92(1), 103–118 (2002)
Hiriart-Urruty, J.-B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I: Fundamentals. Springer, Berlin (1996)
Hong, M., Razaviyayn, M., Luo, Z.-Q., Pang, J.-S.: A unified algorithmic framework for block-structured optimization involving big data. IEEE Signal Process. Mag. 33(1), 57–77 (2016)
Lu, Z., Xiao, L.: On the complexity analysis of randomized block-coordinate descent methods. Math. Program. 152(1), 615–642 (2015)
Luo, Z.-Q., Tseng, P.: On the convergence of the coordinate descent method for convex differentiable minimization. J. Optim. Theory Appl. 72(1), 7–35 (1992)
Luo, Z.-Q., Tseng, P.: On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. 30(2), 408–425 (1992)
Luo, Z.-Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46–47, 157–178 (1993)
Luo, Z.-Q., Tseng, P.: On the convergence rate of dual ascent methods for strictly convex minimization. Math. Oper. Res. 18(4), 846–867 (1993)
Mairal, J.: Optimization with first-order surrogate functions. In: The Proceedings of the International Conference on Machine Learning (ICML) (2013)
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Springer, Berlin (2004)
Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2), 341–362 (2012)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, Cambridge (1972)
Razaviyayn, M., Hong, M., Luo, Z.-Q.: A unified convergence analysis of block successive minimization methods for nonsmooth optimization. SIAM J. Optim. 23(2), 1126–1153 (2013)
Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program. 144, 1–38 (2014)
Saha, A., Tewari, A.: On the nonasymptotic convergence of cyclic coordinate descent method. SIAM J. Optim. 23(1), 576–601 (2013)
Scutari, G., Facchinei, F., Song, P., Palomar, D.P., Pang, J.-S.: Decomposition by partial linearization: parallel optimization of multi-agent systems. IEEE Trans. Signal Process. 63(3), 641–656 (2014)
Shalev-Shwartz, S., Tewari, A.: Stochastic methods for \(\ell _1\) regularized loss minimization. J. Mach. Learn. Res. 12, 1865–1892 (2011)
Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3), 475–494 (2001)
Tseng, P.: Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Program. 125(2), 263–295 (2010)
Tseng, P., Yun, S.: Block-coordinate gradient descent method for linearly constrained nonsmooth separable optimization. J. Optim. Theory Appl. 140, 513–535 (2009)
Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117, 387–423 (2009)
Wang, X., Yuan, X.: The linearized alternating direction method of multipliers for Dantzig selector. SIAM J. Sci. Comput. 34(5), 2792–2811 (2012)
Yang, J., Zhang, Y., Yin, W.: A fast alternating direction method for TVL1-L2 signal reconstruction from partial Fourier data. IEEE J. Sel. Top. Signal Process. 4(2), 288–297 (2010)
Yu, W., Rhee, W., Boyd, S., Cioffi, J.M.: Iterative water-filling for Gaussian vector multiple-access channels. IEEE Trans. Inf. Theory 50(1), 145–152 (2004)
Zhang, H., Jiang, J., Luo, Z.-Q.: On the linear convergence of a proximal gradient method for a class of nonsmooth convex minimization problems. J. Oper. Res. Soc. China 1(2), 163–186 (2013)
Zhang, X., Burger, M., Osher, S.: A unified primal-dual algorithm framework based on Bregman iteration. J. Sci. Comput. 46(1), 20–46 (2011)
Additional information
The authors would like to thank Enbin Song for his valuable comments on an early version of the manuscript.
Mingyi Hong: This author is supported by National Science Foundation (NSF) Grant CCF-1526078 and by Air Force Office of Scientific Research (AFOSR) Grant 15RT0767.
Xiangfeng Wang: This author is supported by NSFC, Grant No.11501210, and by Shanghai YangFan, Grant No. 15YF1403400.
Zhi-Quan Luo: This research is supported by NSFC, Grant No. 61571384, and by the Leading Talents of Guangdong Province program, Grant No. 00201510.
Cite this article
Hong, M., Wang, X., Razaviyayn, M. et al. Iteration complexity analysis of block coordinate descent methods. Math. Program. 163, 85–114 (2017). https://doi.org/10.1007/s10107-016-1057-8