Abstract
Reinforcement learning (RL) learns what action to take next by mapping situations to actions so as to maximize cumulative reward. In recent years, RL has achieved great success in inducing effective pedagogical policies for various interactive e-learning environments. However, it is often prohibitively difficult to identify the critical pedagogical decisions that actually contribute to desirable learning outcomes. In this work, within the RL framework, we define critical states as those in which the agent must take the optimal action, and the Critical policy as one that carries out optimal actions in critical states while acting randomly in all others. We propose a general Critical-RL framework for identifying critical decisions and inducing a Critical policy. We empirically evaluate the effectiveness of our Critical-RL framework from two perspectives: whether optimal actions must be carried out in critical states (the necessary hypothesis) and whether carrying out optimal actions only in critical states is as effective as a fully executed RL policy (the sufficient hypothesis). Our results confirm both hypotheses.
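To make the definition concrete, the following is a minimal Python sketch of a Critical policy: optimal actions in critical states, random actions elsewhere. It assumes the agent's preferences are summarized by a learned Q-function, and it flags a state as critical when the value gap between the best and worst actions is large (i.e., acting suboptimally there is costly). The gap-based test, the threshold, and all names (CriticalPolicy, is_critical) are illustrative assumptions, not the paper's published method for identifying critical decisions.

```python
import numpy as np

class CriticalPolicy:
    """Sketch of a Critical policy: act optimally only in critical states.

    Assumes a learned Q-function given as q_table[state] -> array of
    action values. The gap-based criticality test below is one plausible
    criterion, not the criterion used in the paper.
    """

    def __init__(self, q_table, threshold, rng=None):
        self.q = q_table
        self.threshold = threshold
        self.rng = rng or np.random.default_rng()

    def is_critical(self, state):
        # A state is deemed critical when the gap between the best and
        # worst action values exceeds the threshold.
        values = self.q[state]
        return (values.max() - values.min()) > self.threshold

    def act(self, state):
        if self.is_critical(state):
            return int(np.argmax(self.q[state]))      # optimal action
        n_actions = len(self.q[state])
        return int(self.rng.integers(n_actions))      # random action

# Toy example with three states and two actions:
q = {0: np.array([0.9, 0.1]),    # large gap -> critical
     1: np.array([0.5, 0.45]),   # small gap -> non-critical
     2: np.array([0.2, 0.8])}    # large gap -> critical
policy = CriticalPolicy(q, threshold=0.3)
print([policy.is_critical(s) for s in q])  # [True, False, True]
print(policy.act(0))                       # 0 (the optimal action)
```

Under this sketch, the necessary and sufficient hypotheses correspond to comparing this policy against variants that act randomly in critical states or optimally everywhere, respectively.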
Acknowledgements
This research was supported by NSF Grants #1726550, #1651909, and #2013502.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Ju, S., Zhou, G., Abdelshiheed, M., Barnes, T., Chi, M. (2021). Evaluating Critical Reinforcement Learning Framework in the Field. In: Roll, I., McNamara, D., Sosnovsky, S., Luckin, R., Dimitrova, V. (eds.) Artificial Intelligence in Education. AIED 2021. Lecture Notes in Computer Science, vol. 12748. Springer, Cham. https://doi.org/10.1007/978-3-030-78292-4_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-78291-7
Online ISBN: 978-3-030-78292-4