Controller Optimization for Multirate Systems Based on Reinforcement Learning

Li, Zhan; Xue, Sheng-Ri; Yu, Xing-Hu; Gao, Hui-Jun

doi:10.1007/s11633-020-1229-0


Research Article
Published: 14 April 2020

Volume 17, pages 417–427, (2020)

International Journal of Automation and Computing


Abstract

The goal of this paper is to design a model-free optimal controller for multirate systems based on reinforcement learning. Sampled-data control systems are widely used in industrial production processes, and multirate sampling has attracted much attention in sampled-data control theory. In this paper, we assume that the sampling periods of the state variables differ from those of the system inputs. Under this condition, an equivalent discrete-time system can be obtained using the lifting technique. We then provide an algorithm that solves the linear quadratic regulator (LQR) control problem for multirate systems by means of matrix substitutions. Building on reinforcement learning, we use online policy iteration and off-policy algorithms to optimize the controller for multirate systems. Using the least squares method, we convert the off-policy algorithm into a model-free reinforcement learning algorithm that requires only the input and output data of the system. Finally, an example illustrates the applicability and efficiency of the proposed model-free algorithm.
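The lifting and policy-iteration steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it is model-based (it uses A and B directly, whereas the paper's final method is model-free and estimates the same quantities from data by least squares), it assumes the simplest multirate case in which the input is held constant over N fast state samples, and the plant matrices, the weights Q and R, the initial gain K0, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np


def lift_held_input(A, B, N):
    """Slow-rate model when the input is held constant for N fast steps.

    x[(j+1)N] = A^N x[jN] + (sum_{i=0}^{N-1} A^i B) u[j]
    """
    A_l = np.linalg.matrix_power(A, N)
    B_l = sum(np.linalg.matrix_power(A, i) @ B for i in range(N))
    return A_l, B_l


def solve_lyapunov(Ac, Qk):
    """Solve P = Ac^T P Ac + Qk by vectorization (adequate for small systems)."""
    n = Ac.shape[0]
    M = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    return np.linalg.solve(M, Qk.reshape(-1)).reshape(n, n)


def policy_iteration(A, B, Q, R, K0, iters=30):
    """Model-based policy iteration for discrete-time LQR with u = -K x.

    K0 must stabilize (A, B); otherwise the evaluation step is not meaningful.
    """
    K = K0
    for _ in range(iters):
        Ac = A - B @ K
        # Policy evaluation: cost matrix of the current gain.
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)
        # Policy improvement: greedy gain with respect to P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


# Hypothetical fast-rate plant (a sampled double integrator), not from the paper.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

A_l, B_l = lift_held_input(A, B, N=2)
K0 = np.array([[1.0, 1.5]])  # assumed stabilizing initial gain for the lifted model
P, K = policy_iteration(A_l, B_l, Q, R, K0)
print("lifted LQR gain K =", K)
```

At convergence P satisfies the discrete-time algebraic Riccati equation of the lifted system, which is one way to check the result. In a model-free variant, the evaluation and improvement steps would be replaced by a least-squares fit to measured input and state data.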



Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2018YFB1308404).

Author information

Authors and Affiliations

Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin, 150001, China
Zhan Li, Sheng-Ri Xue, Xing-Hu Yu & Hui-Jun Gao
Ningbo Institute of Intelligent Equipment Technology, Harbin Institute of Technology, Ningbo, 315200, China
Xing-Hu Yu


Corresponding authors

Correspondence to Sheng-Ri Xue or Hui-Jun Gao.

Additional information

Zhan Li received the Ph.D. degree in control science and engineering from Harbin Institute of Technology, Harbin, China in 2015. He is currently an associate professor with the Research Institute of Intelligent Control and Systems, School of Astronautics, Harbin Institute of Technology, China.

His research interests include motion control, industrial robot control, robust control of small unmanned aerial vehicles (UAVs), and cooperative control of multivehicle systems.

Sheng-Ri Xue received the B.Sc. degree in automation engineering from Harbin Institute of Technology, China in 2015, where he is currently pursuing the Ph.D. degree with the Research Institute of Intelligent Control and Systems.

His research interests include H-infinity control, controller optimization, reinforcement learning, and their applications to sampled-data control systems design.

Xing-Hu Yu received the M.M. degree in osteopathic medicine from Jinzhou Medical University, China, in 2016. He is currently a Ph.D. candidate in control science and engineering at Harbin Institute of Technology, China.

His research interests include intelligent control and biomedical image processing.

Hui-Jun Gao received the Ph.D. degree in control science and engineering from Harbin Institute of Technology, China in 2005. From 2005 to 2007, he carried out his postdoctoral research with the Department of Electrical and Computer Engineering, University of Alberta, Canada. Since 2004, he has been with Harbin Institute of Technology, where he is currently a full professor, the Director of the Inter-discipline Science Research Center, and the Director of the Research Institute of Intelligent Control and Systems. He is an IEEE Industrial Electronics Society Administration Committee Member and a council member of IFAC. He is the Co-Editor-in-Chief for IEEE Transactions on Industrial Electronics, and an Associate Editor for Automatica, IEEE Transactions on Control Systems Technology, IEEE Transactions on Cybernetics, and IEEE/ASME Transactions on Mechatronics.

His research interests include intelligent and robust control, robotics, mechatronics, and their engineering applications.


About this article

Cite this article

Li, Z., Xue, SR., Yu, XH. et al. Controller Optimization for Multirate Systems Based on Reinforcement Learning. Int. J. Autom. Comput. 17, 417–427 (2020). https://doi.org/10.1007/s11633-020-1229-0


Received: 21 December 2019
Accepted: 21 February 2020
Published: 14 April 2020
Issue Date: June 2020
DOI: https://doi.org/10.1007/s11633-020-1229-0
