Abstract
Reinforcement learning (RL) is a powerful technique for learning in domains where only evaluative feedback, rather than instructive feedback, is available, and its use is expanding rapidly in industry and research. One of the main limitations of RL is its slow convergence, and several methods have been proposed to speed it up, typically by incorporating prior knowledge or bias into RL. In this paper, we present a new method for incorporating bias into RL. The method extends the initial Q-value selection method proposed by Hailu and Sommer and introduces a learning mechanism into the agent. This allows much more specific information to guide the agent's choice of action and, at the same time, helps reduce the state search space. As a result, it improves learning performance and greatly speeds up the convergence of the learning process.
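The general idea of biasing an agent through its initial Q-values can be illustrated with a minimal sketch. This is not the method of the paper, whose details the abstract does not give. It is plain tabular Q-learning on a hypothetical one-dimensional corridor, where the environment, the `q_learning` function, the `bias` callback, and all parameter values are assumptions made for illustration only.

```python
import random


def q_learning(n_states=6, episodes=200, alpha=0.5, gamma=0.9,
               epsilon=0.1, bias=None, seed=0):
    """Tabular Q-learning on a hypothetical corridor (illustrative only).

    States are 0..n_states-1, actions are 0 (left) and 1 (right), and the
    rightmost state is the goal with reward 1. `bias`, if given, is an
    assumed callback (state, action) -> initial Q-value that encodes
    prior knowledge; without it the table starts at zero.
    Returns the Q-table and the number of steps taken in each episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # Prior knowledge enters only through the initial Q-values.
    Q = [[bias(s, a) if bias else 0.0 for a in (0, 1)]
         for s in range(n_states)]
    steps_per_episode = []
    for _ in range(episodes):
        s, steps = 0, 0
        while s != goal and steps < 1000:  # step cap guards against stalls
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore
            else:
                best = max(Q[s])
                # Exploit, breaking ties between equal values at random.
                a = rng.choice([i for i in (0, 1) if Q[s][i] == best])
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Standard Q-learning target; the goal state is terminal.
            target = r if s_next == goal else r + gamma * max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
            steps += 1
        steps_per_episode.append(steps)
    return Q, steps_per_episode


# Unbiased agent versus an agent told, via initial Q-values, that
# moving right is promising (an assumed form of prior knowledge).
Q_plain, steps_plain = q_learning()
Q_biased, steps_biased = q_learning(bias=lambda s, a: 0.1 if a == 1 else 0.0)
```

Comparing `steps_plain` and `steps_biased` over the first few episodes gives a rough sense of how much early exploration the initial bias saves in this toy setting. Both agents use the same update rule, so any difference comes only from the starting Q-table.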
References
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT Press, Cambridge (1998)
Gabriel, M., Moore, J.W. (eds.): Learning and Computational Neuroscience. MIT Press, Cambridge; Mam, Y.: The Technical Writer’s Handbook. University Science, Mill Valley (1989)
Barto, A.G., Sutton, R.S., Watkins, C.J.C.H.: Learning and sequential decision making. In: Gabriel, M., Moore, J.W. (eds.) Learning and Computational Neuroscience. The MIT Press, Cambridge (1990)
Hailu, G., Sommer, G.: Embedding knowledge in reinforcement learning. In: International Conference on Artificial Neural Network (ICANN), Sweden, pp. 1133–1138 (1998)
Malak, R.J., Khosla, P.K.: A framework for the adaptive transfer of robot skill knowledge among reinforcement learning agents. In: IEEE International Conference on Robotics and Automation (2001)
Wiewiora, E., Cottrell, G., Elkan, C.: Principled Methods for Advising Reinforcement Learning Agents. In: Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003), Washington DC (2003)
Perkins, T., Barto, A.: Lyapunov design for safe reinforcement learning control. In: Machine Learning, Proceedings of the Sixteenth International Conference. Morgan Kaufmann, San Francisco (2001)
Hailu, G., Sommer, G.: On Amount and Quality of Bias in Reinforcement Learning. In: IEEE International Conference on Systems, Man and Cybernetics (IEEE SMC 1999), Tokyo, Japan, pp. 1491–1495 (1999)
Watkins, C.: Learning from delayed rewards. Ph.D. dissertation. Cambridge University, Cambridge, England (1989)
Watkins, C., Dayan, P.: Technical note: Q-learning. Machine Learning 8, 279–292 (1992)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Z., Hu, K., Liu, Z., Yu, X. (2011). Principled Methods for Biasing Reinforcement Learning Agents. In: Deng, H., Miao, D., Lei, J., Wang, F.L. (eds) Artificial Intelligence and Computational Intelligence. AICI 2011. Lecture Notes in Computer Science, vol 7003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23887-1_89
DOI: https://doi.org/10.1007/978-3-642-23887-1_89
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23886-4
Online ISBN: 978-3-642-23887-1
eBook Packages: Computer Science (R0)