Abstract
Several decades ago, Profs. Sean Meyn and Lei Guo were postdoctoral fellows at ANU, where they shared an interest in recursive algorithms. It seems fitting to celebrate Lei Guo's 60th birthday with a review of the ODE Method and its recent evolution, with a focus on the following themes (each illustrated with a minimal sketch following the list):
- The method has been regarded as a technique for algorithm analysis. It is argued that this viewpoint is backwards: The original stochastic approximation method was surely motivated by an ODE, and tools for analysis came much later (based on establishing robustness of Euler approximations). The paper presents a brief survey of recent research in machine learning that shows the power of algorithm design in continuous time, followed by careful approximation to obtain a practical recursive algorithm.
- While these methods are usually presented in a stochastic setting, this is not a prerequisite. In fact, recent theory shows that rates of convergence can be dramatically accelerated by applying techniques inspired by quasi Monte-Carlo.
- Subject to conditions, the optimal rate of convergence can be obtained by applying the averaging technique of Polyak and Ruppert. The conditions are not universal, but theory suggests alternatives to achieve acceleration.
- The theory is illustrated with applications to gradient-free optimization and policy gradient algorithms for reinforcement learning.
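To ground the first theme, here is a minimal sketch (ours, not reproduced from the paper) of design in continuous time followed by discretization: choose a mean-field ODE d/dt theta = fbar(theta) whose equilibrium is the desired solution, then apply an Euler step with decreasing step size, substituting noisy observations for fbar; the result is exactly the Robbins-Monro recursion [22]. The mean field fbar and all constants below are illustrative assumptions.

```python
import numpy as np

# Design ODE (illustrative): gradient flow for f(theta) = (theta - 1)^2 / 2,
# with mean field fbar(theta) = -(theta - 1), so the ODE d/dt theta = fbar(theta)
# has the desired solution theta* = 1 as its unique equilibrium.
def fbar(theta):
    return -(theta - 1.0)

rng = np.random.default_rng(0)
theta = 5.0                              # initial condition
for n in range(1, 10_000):
    a_n = 1.0 / n                        # decreasing step size (Euler "time step")
    sample = fbar(theta) + rng.normal()  # only a noisy sample of fbar is observed
    theta += a_n * sample                # Euler step = Robbins-Monro recursion
print(theta)                             # close to the equilibrium theta* = 1
```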
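The second theme, quasi-stochastic approximation, keeps the recursion and swaps the i.i.d. noise for a deterministic probing signal whose time average vanishes quickly; this is the quasi Monte-Carlo idea. Only one line changes relative to the sketch above; the frequency 0.618 is an arbitrary illustrative choice, and the rate claimed in the final comment holds only subject to conditions of the kind surveyed in the paper.

```python
import numpy as np

def fbar(theta):   # same illustrative mean field as in the previous sketch
    return -(theta - 1.0)

theta = 5.0
for n in range(1, 10_000):
    a_n = 1.0 / n
    # Deterministic "noise": zero time average, with quickly averaging partial sums.
    xi_n = np.sqrt(2.0) * np.sin(2.0 * np.pi * 0.618 * n)
    theta += a_n * (fbar(theta) + xi_n)  # identical recursion; only the noise model changed
print(theta)  # error can decay as O(1/n), versus the Monte-Carlo rate O(1/sqrt(n))
```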
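For the third theme, Polyak-Ruppert averaging [47-49] runs the recursion with a step size that decays more slowly than 1/n and reports the running average of the iterates; under conditions, the average attains the optimal asymptotic covariance. A sketch on the same toy model, with the exponent 0.8 an arbitrary choice in (1/2, 1):

```python
import numpy as np

def fbar(theta):   # same illustrative mean field
    return -(theta - 1.0)

rng = np.random.default_rng(0)
theta, theta_avg = 5.0, 0.0
for n in range(1, 100_000):
    a_n = n ** -0.8                        # slower-than-1/n step for the raw iterate
    theta += a_n * (fbar(theta) + rng.normal())
    theta_avg += (theta - theta_avg) / n   # running average of theta_1, ..., theta_n
print(theta_avg)  # the averaged estimate is the one with optimal asymptotic variance
```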
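Finally, the applications: in gradient-free optimization only values of the objective are observable, and a gradient estimate is formed by probing with sinusoids of distinct frequencies, in the spirit of extremum seeking [51-53] and one-measurement SPSA [57]; policy-gradient reinforcement learning fits the same value-only template, with the loss replaced by the negative reward of the policy with parameter theta. Everything below (objective, frequencies, gains, probing amplitude) is an illustrative assumption; the iterates approach a neighborhood of the minimizer whose size is governed by eps.

```python
import numpy as np

def loss(theta):   # illustrative objective: only its values can be measured
    return 0.5 * np.sum((theta - 1.0) ** 2)

theta = np.array([2.0, 0.0])
omega = np.array([1.0, np.sqrt(2.0)])   # one probing frequency per coordinate
eps = 0.1                               # probing amplitude
for n in range(1, 200_000):
    a_n = 1.0 / (n + 100)               # damped early gains for stability
    xi = np.sqrt(2.0) * np.sin(omega * n)      # deterministic probing vector
    g_hat = xi * loss(theta + eps * xi) / eps  # gradient estimate from one measurement
    theta = theta - a_n * g_hat
print(theta)   # near the minimizer [1, 1], up to a bias controlled by eps
```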
References
Chen H F and Guo L, Identification and Stochastic Adaptive Control, Birkhäuser, Boston, 1991.
Chen H F and Guo L, Convergence rate of least-squares identification and adaptive control for stochastic systems, Intl. Journal of Control, 1986, 44(5): 1459–1476.
Chen H F and Guo L, Identification and Stochastic Adaptive Control, Springer Science & Business Media, Berlin, 2012.
Guo L, Estimating time-varying parameters by the Kalman filter based algorithm: Stability and convergence, IEEE Trans. Automat. Control, 1990, 35: 141–147.
Meyn S P and Caines P E, A new approach to stochastic adaptive control, IEEE Trans. Automat. Control, 1987, AC-32: 220–226.
Guo L and Meyn S P, Adaptive control for time-varying systems: A combination of martingale and Markov chain techniques, Int. J. Adaptive Control and Signal Processing, 1989, 3: 1–14.
Xie L L and Guo L, How much uncertainty can be dealt with by feedback? IEEE Transactions on Automatic Control, 2000, 45: 2203–2217.
Guo L, Stability of recursive stochastic tracking algorithms, SIAM J. Control Optim., 1994, 32: 1195–1225.
Guo L and Ljung L, Performance analysis of general tracking algorithms, IEEE Transactions on Automatic Control, 1995, 40: 1388–1402.
Robbins H and Monro S, A stochastic approximation method, Annals of Mathematical Statistics, 1951, 22: 400–407.
Jaakkola T, Jordan M, and Singh S, On the convergence of stochastic iterative dynamic programming algorithms, Neural Computation, 1994, 6: 1185–1201.
Tsitsiklis J, Asynchronous stochastic approximation and Q-learning, Machine Learning, 1994, 16: 185–202.
Meyn S, Control Systems and Reinforcement Learning, Cambridge University Press, Cambridge, 2022 (to appear). Pre-publication draft online https://meyn.ece.ufl.edu/2021/08/01/control-systems-and-reinforcement-learning/.
Sutton R and Barto A, Reinforcement Learning: An Introduction, MIT Press, 2nd Edition, On-line edition at http://www.cs.ualberta.ca/sutton/book/the-book.html, Cambridge, MA, 2018.
Ljung L, Analysis of recursive stochastic algorithms, IEEE Trans. on Automatic Control, 1977, 22(4): 551–575.
Fradkov A and Polyak B T, Adaptive and robust control in the USSR, IFAC — PapersOnLine, 2020, 53(2): 1373–1378.
Tsypkin Y Z and Nikolic Z J, Adaptation and Learning in Automatic Systems, Academic Press, New York, 1971.
Borkar V S, Stochastic Approximation: A Dynamical Systems Viewpoint, 2nd Edition, Hindustan Book Agency, Delhi, India and Cambridge, UK, 2020.
Su W, Boyd S, and Candès E, A differential equation for modeling Nesterov's accelerated gradient method: Theory and insights, Proc. Advances in Neural Information Processing Systems, 2014, 2510–2518.
Attouch H, Goudou X, and Redont P, The heavy ball with friction method, I. The continuous dynamical system: Global exploration of the local minima of a real-valued function by asymptotic analysis of a dissipative dynamical system, Communications in Contemporary Mathematics, 2000, 2(1): 1–34.
Krichene W and Bartlett P L, Acceleration and averaging in stochastic descent dynamics, Proc. Advances in Neural Information Processing Systems, 2017, 30: 6796–6806.
Raginsky M and Bouvrie J, Continuous-time stochastic mirror descent on a network: Variance reduction, consensus, convergence, Proc. of the Conf. on Dec. and Control, 2012, 6793–6800.
Shi B, Du S S, Su W, et al., Acceleration via symplectic discretization of high-resolution differential equations, Eds. by Wallach H, Larochelle H, Beygelzimer A, et al., Proc. Advances in Neural Information Processing Systems, 2019, 5744–5752.
Wibisono A, Wilson A C, and Jordan M I, A variational perspective on accelerated methods in optimization, Proc. of the National Academy of Sciences, 2016, 113: E7351–E7358.
Zhang J, Mokhtari A, Sra S, et al., Direct Runge-Kutta discretization achieves acceleration, Proc. of the Intl. Conference on Neural Information Processing Systems, 2018, 3904–3913.
Devraj A M, Bušić A, and Meyn S, On matrix momentum stochastic approximation and applications to Q-learning, Allerton Conference on Communication, Control, and Computing, 2019, 749–756.
Lapeyre B, Pagès G, and Sab K, Sequences with low discrepancy: Generalisation and application to Robbins-Monro algorithm, Statistics, 1990, 21(2): 251–272.
Laruelle S and Pagès G, Stochastic approximation with averaging innovation applied to finance, Monte Carlo Methods and Applications, 2012, 18(1): 1–51.
Chen S, Devraj A, Bernstein A, et al., Accelerating optimization and reinforcement learning with quasi stochastic approximation, Proc. of the American Control Conf., 2021, 1965–1972.
Bernstein A, Chen Y, Colombino M, et al., Optimal rate of convergence for quasi-stochastic approximation, arXiv: 1903.07228, 2019.
Bernstein A, Chen Y, Colombino M, et al., Quasi-stochastic approximation and off-policy reinforcement learning, Proc. of the Conf. on Dec. and Control, 2019, 5244–5251.
Chen S, Devraj A M, Bušić A, et al., Explicit mean-square error bounds for Monte-Carlo and linear stochastic approximation, Eds. by Chiappa S and Calandra R, Proc. of AISTATS, 2020, 108: 4173–4183.
Borkar V S and Meyn S P, The ODE method for convergence of stochastic approximation and reinforcement learning, SIAM J. Control Optim., 2000, 38(2): 447–469.
Ramaswamy A and Bhatnagar S, Stability of stochastic approximations with ‘controlled Markov’ noise and temporal difference learning, IEEE Transactions on Automatic Control, 2019, 64: 2614–2620.
Polyak B T, A new method of stochastic approximation type, Avtomatika i Telemekhanika (in Russian), 1990, 98–107; translated in Automat. Remote Control, 1991, 51.
Polyak B T and Juditsky A B, Acceleration of stochastic approximation by averaging, SIAM J. Control Optim., 1992, 30(4): 838–855.
Ruppert D, Efficient estimators from a slowly convergent Robbins-Monro process, Technical Report No. 781, Cornell University, School of Operations Research and Industrial Engineering, Ithaca, NY, 1988.
Kaledin M, Moulines E, Naumov A, et al., Finite time analysis of linear two-timescale stochastic approximation with Markovian noise, arXiv: 2002.01268, 2020.
Ariyur K B and Krstić M, Real-Time Optimization by Extremum-Seeking Control, John Wiley & Sons, Inc., New York, NY, 2003.
Liu S and Krstić M, Introduction to extremum seeking, Stochastic Averaging and Stochastic Extremum Seeking, Communications and Control Engineering, Springer, London, 2012.
Tan Y, Moase W H, Manzie C, et al., Extremum seeking from 1922 to 2010, Proc. of the 29th Chinese Control Conference, IEEE, 2010, 14–26.
Kiefer J and Wolfowitz J, Stochastic estimation of the maximum of a regression function, Ann. Math. Statist., 1952, 23(3): 462–466.
Koronacki J, Random-seeking methods for the stochastic unconstrained optimization, International Journal of Control, 1975, 21(3): 517–527.
Spall J C, A stochastic approximation technique for generating maximum likelihood parameter estimates, Proc. of the American Control Conf., IEEE, 1987, 1161–1167.
Spall J C, A one-measurement form of simultaneous perturbation stochastic approximation, Automatica, 1997, 33(1): 109–112.
Spall J C, Multivariate stochastic approximation using a simultaneous perturbation gradient approximation, IEEE Transactions on Automatic Control, 1992, 37(3): 332–341.
Chen H, Duncan T, and Pasik-Duncan B, A Kiefer-Wolfowitz algorithm with randomized differences, IEEE Transactions on Automatic Control, 1999, 44(3): 442–453.
Nesterov Y and Spokoiny V, Random gradient-free minimization of convex functions, Foundations of Computational Mathematics, 2017, 17(2): 527–566.
Bhatnagar S, Fu M C, Marcus S I, et al., Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences, ACM Transactions on Modeling and Computer Simulation (TOMACS), 2003, 13(2): 180–209.
Mandl P, Estimation and control in Markov chains, Advances in Applied Probability, 1974, 6(1): 40–60.
Borkar V and Varaiya P, Adaptive control of Markov chains, I: Finite parameter set, IEEE Trans. Automat. Control, 1979, 24(6): 953–957.
Borkar V and Varaiya P, Identification and adaptive control of Markov chains, SIAM J. Control Optim., 1982, 20(4): 470–489.
Borkar V S, Identification and adaptive control of Markov chains, PhD Thesis, University of California, Berkeley, 1980.
Williams R J, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, 1992, 8(3–4): 229–256.
Mania H, Guy A, and Recht B, Simple random search provides a competitive approach to reinforcement learning, Proc. Advances in Neural Information Processing Systems, 2018, 1800–1809.
Asmussen S and Glynn P W, Stochastic Simulation: Algorithms and Analysis, volume 57 of Stochastic Modelling and Applied Probability, Springer-Verlag, New York, 2007.
Métivier M and Priouret P, Théorèmes de convergence presque sûre pour une classe d'algorithmes stochastiques à pas décroissants [Almost sure convergence theorems for a class of stochastic algorithms with decreasing step sizes], Prob. Theory Related Fields, 1987, 74: 403–428.
Spall J C, Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control, John Wiley & Sons, New York, 2003.
Additional information
This research was supported by ARO W911NF1810334, and NSF under EPCN 1935389. The work of A. Bernstein was supported in part by the National Renewable Energy Laboratory (NREL), operated by Alliance for Sustainable Energy, LLC, for the U.S. Department of Energy (DOE) under Contract DE-AC36-08GO28308 and in part by the Laboratory Directed Research and Development (LDRD) Program at NREL. The views expressed in the paper do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the paper for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.
Cite this article
Chen, S., Devraj, A., Bernstein, A. et al. Revisiting the ODE Method for Recursive Algorithms: Fast Convergence Using Quasi Stochastic Approximation. J Syst Sci Complex 34, 1681–1702 (2021). https://doi.org/10.1007/s11424-021-1251-5