{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,10,30]],"date-time":"2024-10-30T21:12:54Z","timestamp":1730322774046,"version":"3.28.0"},"publisher-location":"New York, NY, USA","reference-count":39,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,1,17]]},"DOI":"10.1145\/3437992.3439927","type":"proceedings-article","created":{"date-parts":[[2021,1,20]],"date-time":"2021-01-20T23:11:11Z","timestamp":1611184271000},"page":"18-31","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["CertRL: formalizing convergence proofs for value and policy iteration in Coq"],"prefix":"10.1145","author":[{"ORCID":"http:\/\/orcid.org\/0000-0003-3799-6326","authenticated-orcid":false,"given":"Koundinya","family":"Vajjha","sequence":"first","affiliation":[{"name":"University of Pittsburgh, USA"}]},{"given":"Avraham","family":"Shinnar","sequence":"additional","affiliation":[{"name":"IBM Research, USA"}]},{"given":"Barry","family":"Trager","sequence":"additional","affiliation":[{"name":"IBM Research, USA"}]},{"given":"Vasily","family":"Pestun","sequence":"additional","affiliation":[{"name":"IBM Research, USA \/ IHES, France"}]},{"given":"Nathan","family":"Fulton","sequence":"additional","affiliation":[{"name":"IBM Research, USA"}]}],"member":"320","published-online":{"date-parts":[[2021,1,20]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-32347-8_16"},{"volume-title":"Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018 ), Sheila A. McIlraith and Kilian Q. Weinberger (Eds.). AAAI Press.","year":"2018","author":"Alshiekh Mohammed","key":"e_1_3_2_2_2_1","unstructured":"Mohammed Alshiekh , Roderick Bloem , R\u00fcdiger Ehlers , Bettina K\u00f6nighofer , Scott Niekum , and Ufuk Topcu . 2018 . Safe Reinforcement Learning via Shielding . In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018 ), Sheila A. McIlraith and Kilian Q. Weinberger (Eds.). AAAI Press. Mohammed Alshiekh, Roderick Bloem, R\u00fcdiger Ehlers, Bettina K\u00f6nighofer, Scott Niekum, and Ufuk Topcu. 2018. Safe Reinforcement Learning via Shielding. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018 ), Sheila A. McIlraith and Kilian Q. Weinberger (Eds.). AAAI Press."},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"crossref","unstructured":"Philippe Audebaud and Christine Paulin-Mohring. 2009. Proofs of randomized algorithms in Coq. Science of Computer Programming 74 8 ( 2009 ) 568-589. Philippe Audebaud and Christine Paulin-Mohring. 2009. Proofs of randomized algorithms in Coq. Science of Computer Programming 74 8 ( 2009 ) 568-589.","DOI":"10.1016\/j.scico.2007.09.002"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3035918.3056447"},{"key":"e_1_3_2_2_5_1","first-page":"2662","volume-title":"Certifying the True Error: Machine Learning in Coq with Verified Generalization Guarantees. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI","author":"Bagnall Alexander","year":"2019","unstructured":"Alexander Bagnall and Gordon Stewart . 2019 . Certifying the True Error: Machine Learning in Coq with Verified Generalization Guarantees. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019. AAAI Press , 2662 - 2669 . htps:\/\/doi.org\/10.1609\/aaai.v33i01. 33012662 10.1609\/aaai.v33i01 Alexander Bagnall and Gordon Stewart. 2019. Certifying the True Error: Machine Learning in Coq with Verified Generalization Guarantees. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019. AAAI Press, 2662-2669. htps:\/\/doi.org\/10.1609\/aaai.v33i01. 33012662"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"crossref","unstructured":"Richard Bellman. 1954. The theory of dynamic programming. Bull. Amer. Math. Soc. 60 6 ( 11 1954 ) 503-515. htps:\/\/projecteuclid.org: 443\/euclid.bams\/1183519147 Richard Bellman. 1954. The theory of dynamic programming. Bull. Amer. Math. Soc. 60 6 ( 11 1954 ) 503-515. htps:\/\/projecteuclid.org: 443\/euclid.bams\/1183519147","DOI":"10.1090\/S0002-9904-1954-09848-8"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10817-018-9481-5"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3192366.3192406"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018610.3018625"},{"volume-title":"Coquelicot: A User-Friendly Library of Real Analysis for Coq. Mathematics in Computer Science 9 ( 03 2014 ). htps:\/\/doi.org\/10.1007\/s11786-014-0181-1","year":"2014","author":"Boldo Sylvie","key":"e_1_3_2_2_10_1","unstructured":"Sylvie Boldo , Catherine Lelay , and Guillaume Melquiond . 2014 . Coquelicot: A User-Friendly Library of Real Analysis for Coq. Mathematics in Computer Science 9 ( 03 2014 ). htps:\/\/doi.org\/10.1007\/s11786-014-0181-1 10.1007\/s11786-014-0181-1 Sylvie Boldo, Catherine Lelay, and Guillaume Melquiond. 2014. Coquelicot: A User-Friendly Library of Real Analysis for Coq. Mathematics in Computer Science 9 ( 03 2014 ). htps:\/\/doi.org\/10.1007\/s11786-014-0181-1"},{"key":"e_1_3_2_2_11_1","first-page":"80","volume-title":"ESOP 2015 (LNCS","volume":"9032","author":"Eberl Manuel","year":"2015","unstructured":"Manuel Eberl , Johannes H\u00f6lzl , and Tobias Nipkow . 2015 . A Veriifed Compiler for Probability Density Functions . In ESOP 2015 (LNCS , Vol. 9032 ), Jan Vitek (Ed.). Springer , 80 - 104 . htps:\/\/doi.org\/10.1007\/ 978-3-662-46669-8_4 Manuel Eberl, Johannes H\u00f6lzl, and Tobias Nipkow. 2015. A Veriifed Compiler for Probability Density Functions. In ESOP 2015 (LNCS, Vol. 9032 ), Jan Vitek (Ed.). Springer, 80-104. htps:\/\/doi.org\/10.1007\/ 978-3-662-46669-8_4"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-00389-0_6"},{"volume-title":"Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018 ), Sheila McIlraith and Kilian Weinberger (Eds.). AAAI Press, 6485-6492","year":"2018","author":"Fulton Nathan","key":"e_1_3_2_2_13_1","unstructured":"Nathan Fulton and Andr\u00e9 Platzer . 2018 . Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning . In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018 ), Sheila McIlraith and Kilian Weinberger (Eds.). AAAI Press, 6485-6492 . Nathan Fulton and Andr\u00e9 Platzer. 2018. Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018 ), Sheila McIlraith and Kilian Weinberger (Eds.). AAAI Press, 6485-6492."},{"volume-title":"Categorical Aspects of Topology and Analysis","author":"Giry Mich\u00e8le","key":"e_1_3_2_2_14_1","unstructured":"Mich\u00e8le Giry . 1982. A categorical approach to probability theory . In Categorical Aspects of Topology and Analysis , B. Banaschewski (Ed.). Springer Berlin Heidelberg , Berlin, Heidelberg , 68-85. Mich\u00e8le Giry. 1982. A categorical approach to probability theory. In Categorical Aspects of Topology and Analysis, B. Banaschewski (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 68-85."},{"key":"e_1_3_2_2_15_1","unstructured":"Shixiang Gu Ethan Holly Timothy P. Lillicrap and Sergey Levine. 2016. Deep Reinforcement Learning for Robotic Manipulation. CoRR abs\/1610.00633 ( 2016 ). arXiv: 1610.00633 htp:\/\/arxiv.org\/abs\/1610. 00633 Shixiang Gu Ethan Holly Timothy P. Lillicrap and Sergey Levine. 2016. Deep Reinforcement Learning for Robotic Manipulation. CoRR abs\/1610.00633 ( 2016 ). arXiv: 1610.00633 htp:\/\/arxiv.org\/abs\/1610. 00633"},{"volume-title":"Logically-Correct Reinforcement Learning. CoRR abs\/","year":"1801","author":"Hasanbeig Mohammadhosein","key":"e_1_3_2_2_16_1","unstructured":"Mohammadhosein Hasanbeig , Alessandro Abate , and Daniel Kroening . 2018. Logically-Correct Reinforcement Learning. CoRR abs\/ 1801 .08099 ( 2018 ). arXiv: 1801.08099 Mohammadhosein Hasanbeig, Alessandro Abate, and Daniel Kroening. 2018. Logically-Correct Reinforcement Learning. CoRR abs\/ 1801.08099 ( 2018 ). arXiv: 1801.08099"},{"key":"e_1_3_2_2_17_1","unstructured":"Johannes Hoelzl. 2017. Markov Chains and Markov Decision Processes in Isabelle\/HOL. Journal of Automated Reasoning ( 2017 ). htps:\/\/doi. org\/10.1007\/s10817-016-9401-5 Johannes Hoelzl. 2017. Markov Chains and Markov Decision Processes in Isabelle\/HOL. Journal of Automated Reasoning ( 2017 ). htps:\/\/doi. org\/10.1007\/s10817-016-9401-5"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018610.3018628"},{"volume-title":"Dynamic Programming and Markov Processes","author":"Howard R.A.","key":"e_1_3_2_2_19_1","unstructured":"R.A. Howard . 1960. Dynamic Programming and Markov Processes . Technology Press of Massachusetts Institute of Technology . htps: \/\/books.google.com\/books?id=fXJEAAAAIAAJ R.A. Howard. 1960. Dynamic Programming and Markov Processes. Technology Press of Massachusetts Institute of Technology. htps: \/\/books.google.com\/books?id=fXJEAAAAIAAJ"},{"volume-title":"Verifiably Safe Exploration for End-toEnd Reinforcement Learning. ArXiv abs\/","year":"2007","author":"Hunt Nathan","key":"e_1_3_2_2_20_1","unstructured":"Nathan Hunt , N. Fulton , Sara Magliacane , N. Ho\u00e0ng , Subhro Das , and Armando Solar-Lezama . 2020. Verifiably Safe Exploration for End-toEnd Reinforcement Learning. ArXiv abs\/ 2007 .01223 ( 2020 ). Nathan Hunt, N. Fulton, Sara Magliacane, N. Ho\u00e0ng, Subhro Das, and Armando Solar-Lezama. 2020. Verifiably Safe Exploration for End-toEnd Reinforcement Learning. ArXiv abs\/ 2007.01223 ( 2020 )."},{"volume-title":"From probability monads to commutative efectuses. Journal of Logical and Algebraic Methods in Programming 94 ( 2018 ), 200-237. htps:\/\/doi.org\/10.1016\/j.jlamp","year":"2016","author":"Jacobs Bart","key":"e_1_3_2_2_21_1","unstructured":"Bart Jacobs . 2018. From probability monads to commutative efectuses. Journal of Logical and Algebraic Methods in Programming 94 ( 2018 ), 200-237. htps:\/\/doi.org\/10.1016\/j.jlamp . 2016 . 11.006 10.1016\/j.jlamp Bart Jacobs. 2018. From probability monads to commutative efectuses. Journal of Logical and Algebraic Methods in Programming 94 ( 2018 ), 200-237. htps:\/\/doi.org\/10.1016\/j.jlamp. 2016. 11.006"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/LICS.1989.39173"},{"key":"e_1_3_2_2_23_1","unstructured":"Dexter Kozen. 2007. Coinductive Proof Principles for Stochastic Processes. CoRR abs\/0711.0194 ( 2007 ). arXiv: 0711.0194 htp:\/\/arxiv.org\/ abs\/0711.0194 Dexter Kozen. 2007. Coinductive Proof Principles for Stochastic Processes. CoRR abs\/0711.0194 ( 2007 ). arXiv: 0711.0194 htp:\/\/arxiv.org\/ abs\/0711.0194"},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"crossref","unstructured":"Dexter Kozen and Nicholas Ruozzi. 2009. Applications of Metric Coinduction. Log. Methods Comput. Sci. 5 3 ( 2009 ). htp:\/\/arxiv.org\/ abs\/0908.2793 Dexter Kozen and Nicholas Ruozzi. 2009. Applications of Metric Coinduction. Log. Methods Comput. Sci. 5 3 ( 2009 ). htp:\/\/arxiv.org\/ abs\/0908.2793","DOI":"10.2168\/LMCS-5(3:10)2009"},{"key":"e_1_3_2_2_25_1","unstructured":"F William Lawvere. 1962. The category of probabilistic mappings. preprint ( 1962 ). F William Lawvere. 1962. The category of probabilistic mappings. preprint ( 1962 )."},{"key":"e_1_3_2_2_26_1","unstructured":"OpenAI. 2018. OpenAI Five. htps:\/\/blog.openai.com\/openai-five\/. OpenAI. 2018. OpenAI Five. htps:\/\/blog.openai.com\/openai-five\/."},{"volume-title":"Notes on Category Theory with examples from basic mathematics. arXiv","year":"1912","author":"Perrone Paolo","key":"e_1_3_2_2_28_1","unstructured":"Paolo Perrone . 2019. Notes on Category Theory with examples from basic mathematics. arXiv : 1912 .10642 [math.CT] Paolo Perrone. 2019. Notes on Category Theory with examples from basic mathematics. arXiv: 1912.10642 [math.CT]"},{"edition":"1","volume-title":"Markov Decision Processes: Discrete Stochastic Dynamic Programming","author":"Puterman Martin L.","key":"e_1_3_2_2_29_1","unstructured":"Martin L. Puterman . 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming ( 1 st ed.). John Wiley and Sons, Inc. , USA. Martin L. Puterman. 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming (1st ed.). John Wiley and Sons, Inc., USA."},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/503272.503288"},{"key":"e_1_3_2_2_31_1","first-page":"165","volume-title":"Proceedings of the 8th ACM SIGPLAN Symposium on Haskell, Haskell 2015","author":"Adam","year":"2015","unstructured":"Adam ?cibior, Zoubin Ghahramani , and Andrew D. Gordon . 2015. Practical probabilistic programming with monads . In Proceedings of the 8th ACM SIGPLAN Symposium on Haskell, Haskell 2015 , Vancouver, BC, Canada , September 3-4, 2015 , Ben Lippmeier (Ed.). ACM, 165 - 176 . htps:\/\/doi.org\/10.1145\/2804302.2804317 10.1145\/2804302.2804317 Adam ?cibior, Zoubin Ghahramani, and Andrew D. Gordon. 2015. Practical probabilistic programming with monads. In Proceedings of the 8th ACM SIGPLAN Symposium on Haskell, Haskell 2015, Vancouver, BC, Canada, September 3-4, 2015, Ben Lippmeier (Ed.). ACM, 165-176. htps:\/\/doi.org\/10.1145\/2804302.2804317"},{"volume-title":"Proceedings of the 34th International Conference on Machine Learning, ICML 2017","year":"2017","author":"Selsam Daniel","key":"e_1_3_2_2_32_1","unstructured":"Daniel Selsam , Percy Liang , and David L. Dill . 2017. Developing Bug-Free Machine Learning Systems With Formal Mathematics . In Proceedings of the 34th International Conference on Machine Learning, ICML 2017 , Sydney, NSW, Australia , 6-11 August 2017 (Proceedings of Machine Learning Research, Vol. 70 ), Doina Precup and Yee Whye Teh (Eds.). PMLR, 3047-3056. htp:\/\/proceedings.mlr.press\/v70\/selsam17a. html Daniel Selsam, Percy Liang, and David L. Dill. 2017. Developing Bug-Free Machine Learning Systems With Formal Mathematics. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017 (Proceedings of Machine Learning Research, Vol. 70 ), Doina Precup and Yee Whye Teh (Eds.). PMLR, 3047-3056. htp:\/\/proceedings.mlr.press\/v70\/selsam17a. html"},{"key":"#cr-split#-e_1_3_2_2_33_1.1","unstructured":"Andrew Senior Richard Evans John Jumper James Kirkpatrick Laurent Sifre Tim Green Chongli Qin Augustin ?\u00eddek Alexander Nelson Alex Bridgland Hugo Penedones Stig Petersen Karen Simonyan Steve Crossan Pushmeet Kohli David Jones David Silver Koray Kavukcuoglu and Demis Hassabis. 2020. Improved protein structure prediction using potentials from deep learning. Nature 577 (01 2020 ) 1-5. htps:\/\/doi.org\/10.1038\/s41586-019-1923-7 10.1038\/s41586-019-1923-7"},{"key":"#cr-split#-e_1_3_2_2_33_1.2","doi-asserted-by":"crossref","unstructured":"Andrew Senior Richard Evans John Jumper James Kirkpatrick Laurent Sifre Tim Green Chongli Qin Augustin ?\u00eddek Alexander Nelson Alex Bridgland Hugo Penedones Stig Petersen Karen Simonyan Steve Crossan Pushmeet Kohli David Jones David Silver Koray Kavukcuoglu and Demis Hassabis. 2020. Improved protein structure prediction using potentials from deep learning. Nature 577 (01 2020 ) 1-5. htps:\/\/doi.org\/10.1038\/s41586-019-1923-7","DOI":"10.1038\/s41586-019-1923-7"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature16961"},{"volume-title":"Barto","year":"1998","author":"Sutton Richard S.","key":"e_1_3_2_2_35_1","unstructured":"Richard S. Sutton and Andrew G . Barto . 1998 . Reinforcement Learning : An Introduction. MIT Press , Cambridge, MA. Richard S. Sutton and Andrew G. Barto. 1998. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA."},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3290377"},{"volume-title":"A Formal Proof of PAC Learnability for Decision Stumps. CoRR abs\/","year":"1911","author":"Tassarotti Joseph","key":"e_1_3_2_2_37_1","unstructured":"Joseph Tassarotti , Jean-Baptiste Tristan , and Koundinya Vajjha . 2019. A Formal Proof of PAC Learnability for Decision Stumps. CoRR abs\/ 1911 .00385 ( 2019 ). arXiv: 1911.00385 htp:\/\/arxiv.org\/abs\/ 1911. 00385 Joseph Tassarotti, Jean-Baptiste Tristan, and Koundinya Vajjha. 2019. A Formal Proof of PAC Learnability for Decision Stumps. CoRR abs\/ 1911.00385 ( 2019 ). arXiv: 1911.00385 htp:\/\/arxiv.org\/abs\/ 1911. 00385"},{"key":"e_1_3_2_2_38_1","unstructured":"The Coq Development Team. 2004. The Coq Proof Assistant Reference Manual. LogiCal Project. Version 8.0. The Coq Development Team. 2004. The Coq Proof Assistant Reference Manual. LogiCal Project. Version 8.0."},{"volume-title":"Verification of ML Systems via Reparameterization. CoRR abs\/","year":"2007","author":"Tristan Jean-Baptiste","key":"e_1_3_2_2_39_1","unstructured":"Jean-Baptiste Tristan , Joseph Tassarotti , Koundinya Vajjha , Michael L. Wick , and Anindya Banerjee . 2020. Verification of ML Systems via Reparameterization. CoRR abs\/ 2007 .06776 ( 2020 ). arXiv: 2007.06776 htps:\/\/arxiv.org\/abs\/ 2007.06776 Jean-Baptiste Tristan, Joseph Tassarotti, Koundinya Vajjha, Michael L. Wick, and Anindya Banerjee. 2020. Verification of ML Systems via Reparameterization. CoRR abs\/ 2007.06776 ( 2020 ). arXiv: 2007.06776 htps:\/\/arxiv.org\/abs\/ 2007.06776"}],"event":{"name":"CPP '21: 10th ACM SIGPLAN International Conference on Certified Programs and Proofs","sponsor":["SIGPLAN ACM Special Interest Group on Programming Languages"],"location":"Virtual Denmark","acronym":"CPP '21"},"container-title":["Proceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3437992.3439927","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,8]],"date-time":"2023-01-08T19:49:29Z","timestamp":1673207369000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3437992.3439927"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,17]]},"references-count":39,"alternative-id":["10.1145\/3437992.3439927","10.1145\/3437992"],"URL":"https:\/\/doi.org\/10.1145\/3437992.3439927","relation":{},"subject":[],"published":{"date-parts":[[2021,1,17]]},"assertion":[{"value":"2021-01-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}