Link to original content: https://dblp.dagstuhl.de/pid/255/7051.ris

Provider: Schloss Dagstuhl - Leibniz Center for Informatics
Database: dblp computer science bibliography
Content:text/plain; charset="utf-8"

TY  - CPAPER
ID  - DBLP:conf/ijcai/MehrabianAKSBCB24
AU  - Mehrabian, Abbas
AU  - Anand, Ankit
AU  - Kim, Hyunjik
AU  - Sonnerat, Nicolas
AU  - Balog, Matej
AU  - Comanici, Gheorghe
AU  - Berariu, Tudor
AU  - Lee, Andrew
AU  - Ruoss, Anian
AU  - Bulanova, Anna
AU  - Toyama, Daniel
AU  - Blackwell, Sam
AU  - Romera-Paredes, Bernardino
AU  - Velickovic, Petar
AU  - Orseau, Laurent
AU  - Lee, Joonkyung
AU  - Naredla, Anurag Murty
AU  - Precup, Doina
AU  - Wagner, Adam Zsolt
TI  - Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search.
BT  - Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI 2024, Jeju, South Korea, August 3-9, 2024
SP  - 6985
EP  - 6993
PY  - 2024//
UR  - https://www.ijcai.org/proceedings/2024/772
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2405-14573
AU  - Rawles, Christopher
AU  - Clinckemaillie, Sarah
AU  - Chang, Yifan
AU  - Waltz, Jonathan
AU  - Lau, Gabrielle
AU  - Fair, Marybeth
AU  - Li, Alice
AU  - Bishop, William E.
AU  - Li, Wei
AU  - Campbell-Ajala, Folawiyo
AU  - Toyama, Daniel
AU  - Berry, Robert
AU  - Tyamagundlu, Divya
AU  - Lillicrap, Timothy P.
AU  - Riva, Oriana
TI  - AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents.
JO  - CoRR
VL  - abs/2405.14573
PY  - 2024//
DO  - 10.48550/ARXIV.2405.14573
UR  - https://doi.org/10.48550/arXiv.2405.14573
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2410-01748
AU  - Hosseini, Arian
AU  - Sordoni, Alessandro
AU  - Toyama, Daniel
AU  - Courville, Aaron C.
AU  - Agarwal, Rishabh
TI  - Not All LLM Reasoners Are Created Equal.
JO  - CoRR
VL  - abs/2410.01748
PY  - 2024//
DO  - 10.48550/ARXIV.2410.01748
UR  - https://doi.org/10.48550/arXiv.2410.01748
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2308-03526
AU  - Mathieu, Michaël
AU  - Ozair, Sherjil
AU  - Srinivasan, Srivatsan
AU  - Gülçehre, Çaglar
AU  - Zhang, Shangtong
AU  - Jiang, Ray
AU  - Paine, Tom Le
AU  - Powell, Richard
AU  - Zolna, Konrad
AU  - Schrittwieser, Julian
AU  - Choi, David H.
AU  - Georgiev, Petko
AU  - Toyama, Daniel
AU  - Huang, Aja
AU  - Ring, Roman
AU  - Babuschkin, Igor
AU  - Ewalds, Timo
AU  - Bordbar, Mahyar
AU  - Henderson, Sarah
AU  - Colmenarejo, Sergio Gómez
AU  - Oord, Aäron van den
AU  - Czarnecki, Wojciech Marian
AU  - Freitas, Nando de
AU  - Vinyals, Oriol
TI  - AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning.
JO  - CoRR
VL  - abs/2308.03526
PY  - 2023//
DO  - 10.48550/ARXIV.2308.03526
UR  - https://doi.org/10.48550/arXiv.2308.03526
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2311-03583
AU  - Mehrabian, Abbas
AU  - Anand, Ankit
AU  - Kim, Hyunjik
AU  - Sonnerat, Nicolas
AU  - Balog, Matej
AU  - Comanici, Gheorghe
AU  - Berariu, Tudor
AU  - Lee, Andrew
AU  - Ruoss, Anian
AU  - Bulanova, Anna
AU  - Toyama, Daniel
AU  - Blackwell, Sam
AU  - Romera-Paredes, Bernardino
AU  - Velickovic, Petar
AU  - Orseau, Laurent
AU  - Lee, Joonkyung
AU  - Naredla, Anurag Murty
AU  - Precup, Doina
AU  - Wagner, Adam Zsolt
TI  - Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search.
JO  - CoRR
VL  - abs/2311.03583
PY  - 2023//
DO  - 10.48550/ARXIV.2311.03583
UR  - https://doi.org/10.48550/arXiv.2311.03583
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2204-10374
AU  - Comanici, Gheorghe
AU  - Glaese, Amelia
AU  - Gergely, Anita
AU  - Toyama, Daniel
AU  - Ahmed, Zafarali
AU  - Jackson, Tyler
AU  - Hamel, Philippe
AU  - Precup, Doina
TI  - Learning how to Interact with a Complex Interface using Hierarchical Reinforcement Learning.
JO  - CoRR
VL  - abs/2204.10374
PY  - 2022//
DO  - 10.48550/ARXIV.2204.10374
UR  - https://doi.org/10.48550/arXiv.2204.10374
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2105-13231
AU  - Toyama, Daniel
AU  - Hamel, Philippe
AU  - Gergely, Anita
AU  - Comanici, Gheorghe
AU  - Glaese, Amelia
AU  - Ahmed, Zafarali
AU  - Jackson, Tyler
AU  - Mourad, Shibl
AU  - Precup, Doina
TI  - AndroidEnv: A Reinforcement Learning Platform for Android.
JO  - CoRR
VL  - abs/2105.13231
PY  - 2021//
UR  - https://arxiv.org/abs/2105.13231
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2106-13105
AU  - Barreto, André
AU  - Borsa, Diana
AU  - Hou, Shaobo
AU  - Comanici, Gheorghe
AU  - Aygün, Eser
AU  - Hamel, Philippe
AU  - Toyama, Daniel
AU  - Hunt, Jonathan J.
AU  - Mourad, Shibl
AU  - Silver, David
AU  - Precup, Doina
TI  - The Option Keyboard: Combining Skills in Reinforcement Learning.
JO  - CoRR
VL  - abs/2106.13105
PY  - 2021//
UR  - https://arxiv.org/abs/2106.13105
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2111-02767
AU  - Ramos, Sabela
AU  - Girgin, Sertan
AU  - Hussenot, Léonard
AU  - Vincent, Damien
AU  - Yakubovich, Hanna
AU  - Toyama, Daniel
AU  - Gergely, Anita
AU  - Stanczyk, Piotr
AU  - Marinier, Raphaël
AU  - Harmsen, Jeremiah
AU  - Pietquin, Olivier
AU  - Momchev, Nikola
TI  - RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning.
JO  - CoRR
VL  - abs/2111.02767
PY  - 2021//
UR  - https://arxiv.org/abs/2111.02767
ER  - 

TY  - Informal or Other Publication
ID  - DBLP:journals/corr/abs-2112-11446
AU  - Rae, Jack W.
AU  - Borgeaud, Sebastian
AU  - Cai, Trevor
AU  - Millican, Katie
AU  - Hoffmann, Jordan
AU  - Song, H. Francis
AU  - Aslanides, John
AU  - Henderson, Sarah
AU  - Ring, Roman
AU  - Young, Susannah
AU  - Rutherford, Eliza
AU  - Hennigan, Tom
AU  - Menick, Jacob
AU  - Cassirer, Albin
AU  - Powell, Richard
AU  - Driessche, George van den
AU  - Hendricks, Lisa Anne
AU  - Rauh, Maribeth
AU  - Huang, Po-Sen
AU  - Glaese, Amelia
AU  - Welbl, Johannes
AU  - Dathathri, Sumanth
AU  - Huang, Saffron
AU  - Uesato, Jonathan
AU  - Mellor, John
AU  - Higgins, Irina
AU  - Creswell, Antonia
AU  - McAleese, Nat
AU  - Wu, Amy
AU  - Elsen, Erich
AU  - Jayakumar, Siddhant M.
AU  - Buchatskaya, Elena
AU  - Budden, David
AU  - Sutherland, Esme
AU  - Simonyan, Karen
AU  - Paganini, Michela
AU  - Sifre, Laurent
AU  - Martens, Lena
AU  - Li, Xiang Lorraine
AU  - Kuncoro, Adhiguna
AU  - Nematzadeh, Aida
AU  - Gribovskaya, Elena
AU  - Donato, Domenic
AU  - Lazaridou, Angeliki
AU  - Mensch, Arthur
AU  - Lespiau, Jean-Baptiste
AU  - Tsimpoukelli, Maria
AU  - Grigorev, Nikolai
AU  - Fritz, Doug
AU  - Sottiaux, Thibault
AU  - Pajarskas, Mantas
AU  - Pohlen, Toby
AU  - Gong, Zhitao
AU  - Toyama, Daniel
AU  - d'Autume, Cyprien de Masson
AU  - Li, Yujia
AU  - Terzi, Tayfun
AU  - Mikulik, Vladimir
AU  - Babuschkin, Igor
AU  - Clark, Aidan
AU  - Casas, Diego de Las
AU  - Guy, Aurelia
AU  - Jones, Chris
AU  - Bradbury, James
AU  - Johnson, Matthew J.
AU  - Hechtman, Blake A.
AU  - Weidinger, Laura
AU  - Gabriel, Iason
AU  - Isaac, William
AU  - Lockhart, Edward
AU  - Osindero, Simon
AU  - Rimell, Laura
AU  - Dyer, Chris
AU  - Vinyals, Oriol
AU  - Ayoub, Kareem
AU  - Stanway, Jeff
AU  - Bennett, Lorrayne
AU  - Hassabis, Demis
AU  - Kavukcuoglu, Koray
AU  - Irving, Geoffrey
TI  - Scaling Language Models: Methods, Analysis & Insights from Training Gopher.
JO  - CoRR
VL  - abs/2112.11446
PY  - 2021//
UR  - https://arxiv.org/abs/2112.11446
ER  - 

TY  - CPAPER
ID  - DBLP:conf/nips/BarretoBHCAHTHM19
AU  - Barreto, André
AU  - Borsa, Diana
AU  - Hou, Shaobo
AU  - Comanici, Gheorghe
AU  - Aygün, Eser
AU  - Hamel, Philippe
AU  - Toyama, Daniel
AU  - Hunt, Jonathan J.
AU  - Mourad, Shibl
AU  - Silver, David
AU  - Precup, Doina
TI  - The Option Keyboard: Combining Skills in Reinforcement Learning.
BT  - Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada.
SP  - 13031
EP  - 13041
PY  - 2019//
UR  - https://proceedings.neurips.cc/paper/2019/hash/251c5ffd6b62cc21c446c963c76cf214-Abstract.html
UR  - http://papers.nips.cc/paper/9463-the-option-keyboard-combining-skills-in-reinforcement-learning
ER  - 