iBet uBet web content aggregator. Adding the entire web to your favor.

Link to original content: https://dblp.org/pid/159/1894.rss
dblp: Kelvin Xu
https://dblp.org/pid/159/1894.html
dblp person page RSS feed. Tue, 22 Oct 2024 20:16:14 +0200. en-US. Updated daily. Released under the CC0 1.0 license. dblp@dagstuhl.de (dblp team). Computers/Computer_Science/Publications/Bibliographies

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models. Trans. Mach. Learn. Res. 2024 (2024)
https://openreview.net/forum?id=lNAyUngGFK
https://dblp.org/rec/journals/tmlr/SinghCAAPGLH0XP24 (Mon, 01 Jan 2024 00:00:00 +0100)
ContMulti-objective Optimization Model for Momentum Change Based on Genetic Algorithm. ICIC (1) 2024: 134-145
https://doi.org/10.1007/978-981-97-5578-3_11
https://dblp.org/rec/conf/icic/ZhangKXSKLZ24 (Mon, 01 Jan 2024 00:00:00 +0100)
Small-scale proxies for large-scale Transformer training instabilities. ICLR 2024
https://openreview.net/forum?id=d8w0pmvXbZ
https://dblp.org/rec/conf/iclr/WortsmanLXEAACG24 (Mon, 01 Jan 2024 00:00:00 +0100)
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. CoRR abs/2408.03314 (2024)
https://doi.org/10.48550/arXiv.2408.03314
https://dblp.org/rec/journals/corr/abs-2408-03314 (Mon, 01 Jan 2024 00:00:00 +0100)
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability. CoRR abs/2408.07852 (2024)
https://doi.org/10.48550/arXiv.2408.07852
https://dblp.org/rec/journals/corr/abs-2408-07852 (Mon, 01 Jan 2024 00:00:00 +0100)
Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries. CoRR abs/2409.12640 (2024)
https://doi.org/10.48550/arXiv.2409.12640
https://dblp.org/rec/journals/corr/abs-2409-12640 (Mon, 01 Jan 2024 00:00:00 +0100)
Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance. ICRA 2023: 5938-5945
https://doi.org/10.1109/ICRA48891.2023.10161493
https://dblp.org/rec/conf/icra/XuHDRKGL23 (Sun, 01 Jan 2023 00:00:00 +0100)
Small-scale proxies for large-scale Transformer training instabilities. CoRR abs/2309.14322 (2023)
https://doi.org/10.48550/arXiv.2309.14322
https://dblp.org/rec/journals/corr/abs-2309-14322 (Sun, 01 Jan 2023 00:00:00 +0100)
Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5? CoRR abs/2311.07587 (2023)
https://doi.org/10.48550/arXiv.2311.07587
https://dblp.org/rec/journals/corr/abs-2311-07587 (Sun, 01 Jan 2023 00:00:00 +0100)
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models. CoRR abs/2311.18232 (2023)
https://doi.org/10.48550/arXiv.2311.18232
https://dblp.org/rec/journals/corr/abs-2311-18232 (Sun, 01 Jan 2023 00:00:00 +0100)
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models. CoRR abs/2312.06585 (2023)
https://doi.org/10.48550/arXiv.2312.06585
https://dblp.org/rec/journals/corr/abs-2312-06585 (Sun, 01 Jan 2023 00:00:00 +0100)
Towards Adaptive, Continual Embodied Agents. University of California, Berkeley, USA, 2022
https://www.escholarship.org/uc/item/3tk9g0b7
https://dblp.org/rec/phd/us/Xu22f (Sat, 01 Jan 2022 00:00:00 +0100)
Autonomous Reinforcement Learning: Formalism and Benchmarking. ICLR 2022
https://openreview.net/forum?id=nkaba3ND7B5
https://dblp.org/rec/conf/iclr/SharmaXS0HLF22 (Sat, 01 Jan 2022 00:00:00 +0100)
Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance. CoRR abs/2212.09902 (2022)
https://doi.org/10.48550/arXiv.2212.09902
https://dblp.org/rec/journals/corr/abs-2212-09902 (Sat, 01 Jan 2022 00:00:00 +0100)
Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention. ICRA 2021: 6664-6671
https://doi.org/10.1109/ICRA48506.2021.9561384
https://dblp.org/rec/conf/icra/0004YZKRXDL21 (Fri, 01 Jan 2021 00:00:00 +0100)
Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention. CoRR abs/2104.11203 (2021)
https://arxiv.org/abs/2104.11203
https://dblp.org/rec/journals/corr/abs-2104-11203 (Fri, 01 Jan 2021 00:00:00 +0100)
Autonomous Reinforcement Learning: Formalism and Benchmarking. CoRR abs/2112.09605 (2021)
https://arxiv.org/abs/2112.09605
https://dblp.org/rec/journals/corr/abs-2112-09605 (Fri, 01 Jan 2021 00:00:00 +0100)
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. ICLR 2020
https://openreview.net/forum?id=rkgAGAVKPr
https://dblp.org/rec/conf/iclr/TriantafillouZD20 (Wed, 01 Jan 2020 00:00:00 +0100)
Continual Learning of Control Primitives : Skill Discovery via Reset-Games. NeurIPS 2020
https://proceedings.neurips.cc/paper/2020/hash/3472ab80b6dff70c54758fd6dfc800c2-Abstract.html
https://dblp.org/rec/conf/nips/XuVFL20 (Wed, 01 Jan 2020 00:00:00 +0100)
Continual Learning of Control Primitives: Skill Discovery via Reset-Games. CoRR abs/2011.05286 (2020)
https://arxiv.org/abs/2011.05286
https://dblp.org/rec/journals/corr/abs-2011-05286 (Wed, 01 Jan 2020 00:00:00 +0100)
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning. ICML 2019: 6952-6962
http://proceedings.mlr.press/v97/xu19d.html
https://dblp.org/rec/conf/icml/XuRDLF19 (Tue, 01 Jan 2019 00:00:00 +0100)
Privacy-Preserving Fall Detection with Deep Learning on mmWave Radar Signal. VCIP 2019: 1-4
https://doi.org/10.1109/VCIP47243.2019.8965661
https://dblp.org/rec/conf/vcip/SunH0JX19 (Tue, 01 Jan 2019 00:00:00 +0100)
Meta-Dataset: A Dataset of Datasets for Learning to Learn from Few Examples. CoRR abs/1903.03096 (2019)
http://arxiv.org/abs/1903.03096
https://dblp.org/rec/journals/corr/abs-1903-03096 (Tue, 01 Jan 2019 00:00:00 +0100)
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control. ICLR (Poster) 2018
https://openreview.net/forum?id=HyrCWeWCb
https://dblp.org/rec/conf/iclr/Nachum0XS18 (Mon, 01 Jan 2018 00:00:00 +0100)
Probabilistic Model-Agnostic Meta-Learning. NeurIPS 2018: 9537-9548
https://proceedings.neurips.cc/paper/2018/hash/8e2c381d4dd04f1c55093f22c59c3a08-Abstract.html
https://dblp.org/rec/conf/nips/FinnXL18 (Mon, 01 Jan 2018 00:00:00 +0100)
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning. CoRR abs/1805.12573 (2018)
http://arxiv.org/abs/1805.12573
https://dblp.org/rec/journals/corr/abs-1805-12573 (Mon, 01 Jan 2018 00:00:00 +0100)
Probabilistic Model-Agnostic Meta-Learning. CoRR abs/1806.02817 (2018)
http://arxiv.org/abs/1806.02817
https://dblp.org/rec/journals/corr/abs-1806-02817 (Mon, 01 Jan 2018 00:00:00 +0100)
On integrating a language model into neural machine translation. Comput. Speech Lang. 45: 137-148 (2017)
https://doi.org/10.1016/j.csl.2017.01.014
https://dblp.org/rec/journals/csl/GulcehreFXCB17 (Sun, 01 Jan 2017 00:00:00 +0100)
An Actor-Critic Algorithm for Sequence Prediction. ICLR (Poster) 2017
https://openreview.net/forum?id=SJDaqqveg
https://dblp.org/rec/conf/iclr/BahdanauBXGLPCB17 (Sun, 01 Jan 2017 00:00:00 +0100)
Unsupervised Perceptual Rewards for Imitation Learning. ICLR (Workshop) 2017
https://openreview.net/forum?id=Byf3mmNFl
https://dblp.org/rec/conf/iclr/SermanetXL17 (Sun, 01 Jan 2017 00:00:00 +0100)
Bridging the Gap Between Value and Policy Based Reinforcement Learning. NIPS 2017: 2775-2785
https://proceedings.neurips.cc/paper/2017/hash/facf9f743b083008a894eee7baa16469-Abstract.html
https://dblp.org/rec/conf/nips/NachumNXS17 (Sun, 01 Jan 2017 00:00:00 +0100)
Unsupervised Perceptual Rewards for Imitation Learning. Robotics: Science and Systems 2017
http://www.roboticsproceedings.org/rss13/p50.html
https://dblp.org/rec/conf/rss/SermanetXL17 (Sun, 01 Jan 2017 00:00:00 +0100)
Bridging the Gap Between Value and Policy Based Reinforcement Learning. CoRR abs/1702.08892 (2017)
http://arxiv.org/abs/1702.08892
https://dblp.org/rec/journals/corr/NachumNXS17 (Sun, 01 Jan 2017 00:00:00 +0100)
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control. CoRR abs/1707.01891 (2017)
http://arxiv.org/abs/1707.01891
https://dblp.org/rec/journals/corr/NachumNXS17aa (Sun, 01 Jan 2017 00:00:00 +0100)
Theano: A Python framework for fast computation of mathematical expressions. CoRR abs/1605.02688 (2016)
http://arxiv.org/abs/1605.02688
https://dblp.org/rec/journals/corr/Al-RfouAAa16 (Fri, 01 Jan 2016 00:00:00 +0100)
An Actor-Critic Algorithm for Sequence Prediction. CoRR abs/1607.07086 (2016)
http://arxiv.org/abs/1607.07086
https://dblp.org/rec/journals/corr/BahdanauBXGLPCB16 (Fri, 01 Jan 2016 00:00:00 +0100)
Unsupervised Perceptual Rewards for Imitation Learning. CoRR abs/1612.06699 (2016)
http://arxiv.org/abs/1612.06699
https://dblp.org/rec/journals/corr/SermanetXL16 (Fri, 01 Jan 2016 00:00:00 +0100)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. ICML 2015: 2048-2057
http://proceedings.mlr.press/v37/xuc15.html
https://dblp.org/rec/conf/icml/XuBKCCSZB15 (Thu, 01 Jan 2015 00:00:00 +0100)
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. CoRR abs/1502.03044 (2015)
http://arxiv.org/abs/1502.03044
https://dblp.org/rec/journals/corr/XuBKCCSZB15 (Thu, 01 Jan 2015 00:00:00 +0100)
On Using Monolingual Corpora in Neural Machine Translation. CoRR abs/1503.03535 (2015)
http://arxiv.org/abs/1503.03535
https://dblp.org/rec/journals/corr/GulcehreFXCBLBS15 (Thu, 01 Jan 2015 00:00:00 +0100)
A Controller Recognizer Framework: How necessary is recognition for control? CoRR abs/1511.06428 (2015)
http://arxiv.org/abs/1511.06428
https://dblp.org/rec/journals/corr/MoczulskiXCC15 (Thu, 01 Jan 2015 00:00:00 +0100)