Zhiheng Xi
2020 – today
- 2024
- [c16] Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Wei Shen, Limao Xiong, Yuhao Zhou, Xiao Wang, Zhiheng Xi, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin. ACL (1) 2024: 1932-1945
- [c15] Shihan Dou, Yan Liu, Haoxiang Jia, Enyu Zhou, Limao Xiong, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang:
StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback. ACL (1) 2024: 4571-4585
- [c14] Yuhao Zhou, Wenxiang Chen, Rui Zheng, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang:
ORTicket: Let One Robust BERT Ticket Transfer across Different Tasks. LREC/COLING 2024: 12527-12538
- [c13] Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang, Xuanjing Huang:
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions. LREC/COLING 2024: 14186-14203
- [c12] Rui Zheng, Yuhao Zhou, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang:
Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals. LREC/COLING 2024: 15410-15421
- [c11] Binghai Wang, Rui Zheng, Lu Chen, Zhiheng Xi, Wei Shen, Yuhao Zhou, Dong Yan, Tao Gui, Qi Zhang, Xuanjing Huang:
Reward Modeling Requires Automatic Adjustment Based on Data Quality. EMNLP (Findings) 2024: 4041-4064
- [c10] Han Xia, Songyang Gao, Qiming Ge, Zhiheng Xi, Qi Zhang, Xuanjing Huang:
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data. EMNLP (Findings) 2024: 8178-8188
- [c9] Lu Chen, Rui Zheng, Binghai Wang, Senjie Jin, Caishuang Huang, Junjie Ye, Zhihao Zhang, Yuhao Zhou, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang:
Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning. EMNLP 2024: 15270-15283
- [c8] Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang:
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning. ICLR 2024
- [c7] Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang:
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning. ICML 2024
- [c6] Wei He, Shichun Liu, Jun Zhao, Yiwen Ding, Yi Lu, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang:
Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models. NAACL-HLT (Findings) 2024: 3829-3845
- [i23] Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang:
Secrets of RLHF in Large Language Models Part II: Reward Modeling. CoRR abs/2401.06080 (2024)
- [i22] Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang:
MouSi: Poly-Visual-Expert Vision-Language Models. CoRR abs/2401.17221 (2024)
- [i21] Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui:
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback. CoRR abs/2402.01391 (2024)
- [i20] Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang:
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning. CoRR abs/2402.05808 (2024)
- [i19] Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang, Xuanjing Huang:
RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions. CoRR abs/2402.16431 (2024)
- [i18] Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang:
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models. CoRR abs/2403.12171 (2024)
- [i17] Rui Zheng, Yuhao Zhou, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang:
Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals. CoRR abs/2403.16176 (2024)
- [i16] Wei He, Shichun Liu, Jun Zhao, Yiwen Ding, Yi Lu, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang:
Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models. CoRR abs/2404.00884 (2024)
- [i15] Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang:
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments. CoRR abs/2406.04151 (2024)
- [i14] Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Hang Li, Yang Liu:
Toward Optimal LLM Alignments Using Two-Player Games. CoRR abs/2406.10977 (2024)
- [i13] Han Xia, Songyang Gao, Qiming Ge, Zhiheng Xi, Qi Zhang, Xuanjing Huang:
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data. CoRR abs/2408.14874 (2024)
- [i12] Enyu Zhou, Guodong Zheng, Binghai Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment. CoRR abs/2410.09893 (2024)
- [i11] Shuo Li, Tao Ji, Xiaoran Fan, Linsheng Lu, Leyi Yang, Yuming Yang, Zhiheng Xi, Rui Zheng, Yuran Wang, Xiaohui Zhao, Tao Gui, Qi Zhang, Xuanjing Huang:
Have the VLMs Lost Confidence? A Study of Sycophancy in VLMs. CoRR abs/2410.11302 (2024)
- [i10] Wei He, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang:
Distill Visual Chart Reasoning Ability from LLMs to MLLMs. CoRR abs/2410.18798 (2024)
- 2023
- [c5] Rui Zheng, Zhiheng Xi, Qin Liu, Wenbin Lai, Tao Gui, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan, Weifeng Ge:
Characterizing the Impacts of Instances on Robustness. ACL (Findings) 2023: 2314-2332
- [c4] Zhiheng Xi, Rui Zheng, Yuansen Zhang, Xuanjing Huang, Zhongyu Wei, Minlong Peng, Mingming Sun, Qi Zhang, Tao Gui:
Connectivity Patterns are Task Embeddings. ACL (Findings) 2023: 11993-12013
- [c3] Enyu Zhou, Rui Zheng, Zhiheng Xi, Songyang Gao, Xiaoran Fan, Zichu Fei, Jingting Ye, Tao Gui, Qi Zhang, Xuanjing Huang:
RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms. EMNLP (Findings) 2023: 10262-10274
- [c2] Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Jia Liu, Tao Gui, Qi Zhang, Xuanjing Huang:
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement. EMNLP (Findings) 2023: 11383-11406
- [i9] Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Tao Gui, Qi Zhang, Xuanjing Huang:
Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement. CoRR abs/2305.14497 (2023)
- [i8] Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang:
Secrets of RLHF in Large Language Models Part I: PPO. CoRR abs/2307.04964 (2023)
- [i7] Shihan Dou, Junjie Shan, Haoxiang Jia, Wenhao Deng, Zhiheng Xi, Wei He, Yueming Wu, Tao Gui, Yang Liu, Xuanjing Huang:
Towards Understanding the Capability of Large Language Models on Code Clone Detection: A Survey. CoRR abs/2308.01191 (2023)
- [i6] Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Tao Gui:
The Rise and Potential of Large Language Model Based Agents: A Survey. CoRR abs/2309.07864 (2023)
- [i5] Xiao Wang, Yuansen Zhang, Tianze Chen, Songyang Gao, Senjie Jin, Xianjun Yang, Zhiheng Xi, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xuanjing Huang:
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models. CoRR abs/2310.06762 (2023)
- [i4] Enyu Zhou, Rui Zheng, Zhiheng Xi, Songyang Gao, Xiaoran Fan, Zichu Fei, Jingting Ye, Tao Gui, Qi Zhang, Xuanjing Huang:
RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms. CoRR abs/2310.11227 (2023)
- [i3] Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang:
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning. CoRR abs/2310.11971 (2023)
- [i2] Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment. CoRR abs/2312.09979 (2023)
- 2022
- [c1] Zhiheng Xi, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
Efficient Adversarial Training with Robust Early-Bird Tickets. EMNLP 2022: 8318-8331
- [i1] Zhiheng Xi, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
Efficient Adversarial Training with Robust Early-Bird Tickets. CoRR abs/2211.07263 (2022)
last updated on 2024-11-30 00:14 CET by the dblp team
all metadata released as open data under CC0 1.0 license