NAACL-HLT 2024: Mexico City, Mexico
- Kevin Duh, Helena Gómez-Adorno, Steven Bethard:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), NAACL 2024, Mexico City, Mexico, June 16-21, 2024. Association for Computational Linguistics 2024, ISBN 979-8-89176-114-8 - Frontmatter.
- Hongyi Liu, Qingyun Wang, Payam Karisani, Heng Ji:
Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences. 1-21 - Hongyi Yuan, Zheng Yuan, Chuanqi Tan, Fei Huang, Songfang Huang:
Text Diffusion Model with Encoder-Decoder Transformers for Sequence-to-Sequence Generation. 22-39 - Nikhil Mehta, Dan Goldwasser:
An Interactive Framework for Profiling News Media Sources. 40-58 - Yinghao Li, Haorui Wang, Chao Zhang:
Assessing Logical Puzzle Solving in Large Language Models: Insights from a Minesweeper Case Study. 59-81 - Taeyang Yun, Hyunkuk Lim, Jeonghwan Lee, Min Song:
TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in Conversation. 82-95 - Seanie Lee, Jianpeng Cheng, Joris Driesen, Alexandru Coca, Anders Johannsen:
Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries. 96-111 - Maitrey Mehta, Valentina Pyatkin, Vivek Srikumar:
Promptly Predicting Structures: The Return of Inference. 112-130 - Yutong Shao, Ndapa Nakashole:
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL. 131-156 - Thang Le, Anh Tuan Luu:
Extractive Summarization with Text Generator. 157-174 - Michele Resta, Davide Bacciu:
Self-generated Replay Memories for Continual Neural Machine Translation. 175-191 - Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran:
Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models. 192-210 - Shreya Havaldar, Salvatore Giorgi, Sunny Rai, Thomas Talhelm, Sharath Chandra Guntuku, Lyle H. Ungar:
Building Knowledge-Guided Lexica to Model Cultural Variation. 211-226 - Shangqian Gao, Ting Hua, Yen-Chang Hsu, Yilin Shen, Hongxia Jin:
Adaptive Rank Selections for Low-Rank Approximation of Language Models. 227-241 - Pengzhi Gao, Ruiqing Zhang, Zhongjun He, Hua Wu, Haifeng Wang:
An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation. 242-256 - Zhenhailong Wang, Shaoguang Mao, Wenshan Wu, Tao Ge, Furu Wei, Heng Ji:
Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration. 257-279 - Ziyang Wang, Sanwoo Lee, Hsiu-Yuan Huang, Yunfang Wu:
FPT: Feature Prompt Tuning for Few-shot Readability Assessment. 280-295 - Junlong Li, Jinyuan Wang, Zhuosheng Zhang, Hai Zhao:
Self-Prompting Large Language Models for Zero-Shot Open-Domain QA. 296-310 - Kai Sun, Yifan Ethan Xu, Hanwen Zha, Yue Liu, Xin Luna Dong:
Head-to-Tail: How Knowledgeable are Large Language Models (LLMs)? A.K.A. Will LLMs Replace Knowledge Graphs? 311-325 - Wenting Zhao, Ye Liu, Yao Wan, Yibo Wang, Qingyang Wu, Zhongfen Deng, Jiangshu Du, Shuaiqi Liu, Yunlong Xu, Philip S. Yu:
kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning. 326-337 - Jon Saad-Falcon, Omar Khattab, Christopher Potts, Matei Zaharia:
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems. 338-354 - Fan Zhang, Xian-Sheng Hua, Chong Chen, Xiao Luo:
DEMO: A Statistical Perspective for Efficient Image-Text Matching. 355-369 - Bin Wang, Zhengyuan Liu, Xin Huang, Fangkai Jiao, Yang Ding, AiTi Aw, Nancy Chen:
SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning. 370-390 - Seongyun Lee, Sue Hyun Park, Yongrae Jo, Minjoon Seo:
Volcano: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision. 391-404 - Samuel Cahyawijaya, Holy Lovenia, Pascale Fung:
LLMs Are Few-Shot In-Context Low-Resource Language Learners. 405-433 - Yuekun Yao, Alexander Koller:
Simple and effective data augmentation for compositional generalization. 434-449 - Tianyang Liu, Fei Wang, Muhao Chen:
Rethinking Tabular Data Understanding with Large Language Models. 450-482 - Qin Liu, Fei Wang, Chaowei Xiao, Muhao Chen:
From Shortcuts to Triggers: Backdoor Defense with Denoised PoE. 483-496 - Rahul Kumar, Amar Raja Dibbu, Shrutendra Harsola, Vignesh Subrahmaniam, Ashutosh Modi:
BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain. 497-516 - Shamik Roy, Sailik Sengupta, Daniele Bonadiman, Saab Mansour, Arshit Gupta:
FLAP: Flow-Adhering Planning with Constrained Decoding in LLMs. 517-539 - Yuxi Feng, Laks V. S. Lakshmanan:
DuRE: Dual Contrastive Self Training for Semi-Supervised Relation Extraction. 540-555 - Zhen Yu, Zhenhua Chen, Kun He:
Query-Efficient Textual Adversarial Example Generation for Black-Box Attacks. 556-569 - Kung-Hsiang Huang, Philippe Laban, Alexander R. Fabbri, Prafulla Kumar Choubey, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu:
Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles. 570-593 - Haoyi Qiu, Kung-Hsiang Huang, Jingnong Qu, Nanyun Peng:
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation. 594-608 - Lang Cao, Zifeng Wang, Cao Xiao, Jimeng Sun:
PILOT: Legal Case Outcome Prediction with Case Law. 609-621 - Zequan Liu, Jiawen Lyn, Wei Zhu, Xing Tian, Yvette Graham:
ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models. 622-641 - Heng-Jui Chang, James R. Glass:
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces. 642-662 - Yifan Wang, Yafei Liu, Chufan Shi, Haoling Li, Chen Chen, Haonan Lu, Yujiu Yang:
InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions. 663-677 - Saiteja Utpala, Alex Gu, Pin-Yu Chen:
Language Agnostic Code Embeddings. 678-691 - Teli Ma, Rong Li, Junwei Liang:
An Examination of the Compositionality of Large Generative Vision-Language Models. 692-705 - Victoria Graf, Qin Liu, Muhao Chen:
Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors. 706-718 - Jonathan Rusert:
VertAttack: Taking Advantage of Text Classifiers' Horizontal Vision. 719-732 - Cong-Duy Nguyen, Thong Nguyen, Xiaobao Wu, Anh Tuan Luu:
KDMCSE: Knowledge Distillation Multimodal Sentence Embeddings with Adaptive Angular margin Contrastive Learning. 733-749 - Jian Zhu, Changbing Yang, Farhan Samir, Jahurul Islam:
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language. 750-772 - Yunqi Zhang, Songda Li, Chunyuan Deng, Luyi Wang, Hui Zhao:
Think Before You Act: A Two-Stage Framework for Mitigating Gender Bias Towards Vision-Language Tasks. 773-791 - Xianming Li, Jing Li:
BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings. 792-804 - Weixuan Wang, Barry Haddow, Alexandra Birch, Wei Peng:
Assessing Factual Reliability of Large Language Model Knowledge. 805-819 - Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu:
Dial-MAE: ConTextual Masked Auto-Encoder for Retrieval-based Dialogue Systems. 820-830 - Cheng Qian, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu:
Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on Open-Source Model. 831-854 - Letian Wang, Xianggen Liu, Jiancheng Lv:
Create! Don't Repeat: A Paradigm Shift in Multi-Label Augmentation through Label Creative Generation. 855-869 - Ali Safaya, Deniz Yuret:
Neurocache: Efficient Vector Retrieval for Long-range Language Modeling. 870-883 - Haoran Yang, Yumeng Zhang, Jiaqi Xu, Hongyuan Lu, Pheng-Ann Heng, Wai Lam:
Unveiling the Generalization Power of Fine-Tuned Large Language Models. 884-899 - Ruixin Hong, Hongming Zhang, Xinyu Pang, Dong Yu, Changshui Zhang:
A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning. 900-925 - Fangkai Jiao, Zhiyang Teng, Bosheng Ding, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty:
Exploring Self-supervised Logic-enhanced Training for Large Language Models. 926-941 - Debrup Das, Debopriyo Banerjee, Somak Aditya, Ashish Kulkarni:
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning. 942-966 - Dawei Zhu, Wenhao Wu, Yifan Song, Fangwei Zhu, Ziqiang Cao, Sujian Li:
CoUDA: Coherence Evaluation via Unified Data Augmentation. 967-978 - Vipul Raheja, Dimitris Alikaniotis, Vivek Kulkarni, Bashar Alhafni, Dhruv Kumar:
mEdIT: Multilingual Text Editing via Instruction Tuning. 979-1001 - Yunchao Zhang, Zonglin Di, Kaiwen Zhou, Cihang Xie, Xin Wang:
Navigation as Attackers Wish? Towards Building Robust Embodied Agents under Federated Learning. 1002-1016 - Gilad Deutch, Nadav Magar, Tomer Bar Natan, Guy Dar:
In-context Learning and Gradient Descent Revisited. 1017-1028 - Olufunke Oluyemi Sarumi, Béla Neuendorf, Joan Plepi, Lucie Flek, Jörg Schlötterer, Charles Welch:
Corpus Considerations for Annotator Modeling and Scaling. 1029-1040 - Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou:
On Large Language Models' Hallucination with Regard to Known Facts. 1041-1053 - Li Lucy, Su Lin Blodgett, Milad Shokouhi, Hanna M. Wallach, Alexandra Olteanu:
"One-Size-Fits-All"? Examining Expectations around What Constitute "Fair" or "Good" NLG System Behaviors. 1054-1089 - Jian Guan, Jesse Dodge, David Wadden, Minlie Huang, Hao Peng:
Language Models Hallucinate, but May Excel at Fact Verification. 1090-1111 - Bowen Ding, Qingkai Min, Shengkun Ma, Yingjie Li, Linyi Yang, Yue Zhang:
A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution. 1112-1140 - Mengxin Zheng, Jiaqi Xue, Xun Chen, Yanshan Wang, Qian Lou, Lei Jiang:
TrojFSP: Trojan Insertion in Few-shot Prompt Tuning. 1141-1151 - Yi Luo, Zhenghao Lin, Yuhao Zhang, Jiashuo Sun, Chen Lin, Chengjin Xu, Xiangdong Su, Yelong Shen, Jian Guo, Yeyun Gong:
Ensuring Safe and High-Quality Outputs: A Guideline Library Approach for Language Models. 1152-1197 - Juan Diego Rodriguez, Katrin Erk, Greg Durrett:
X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across Paragraphs. 1198-1222 - Rajiv Movva, Sidhika Balachandar, Kenny Peng, Gabriel Agostini, Nikhil Garg, Emma Pierson:
Topics, Authors, and Institutions in Large Language Model Research: Trends from 17K arXiv Papers. 1223-1243 - Zhehao Zhang, Yan Gao, Jian-Guang Lou:
E⁵: Zero-shot Hierarchical Table Analysis using Augmented LLMs via Explain, Extract, Execute, Exhibit and Extrapolate. 1244-1258 - Fangyu Lei, Qian Liu, Yiming Huang, Shizhu He, Jun Zhao, Kang Liu:
S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Model. 1259-1286 - Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu:
MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning. 1287-1310 - Chengxu Zhuang, Evelina Fedorenko, Jacob Andreas:
Visual Grounding Helps Learn Word Meanings in Low-Data Regimes. 1311-1329 - Hendra Setiawan:
Accurate Knowledge Distillation via n-best Reranking. 1330-1345 - Zhaorun Chen, Zhuokai Zhao, Zhihong Zhu, Ruiqi Zhang, Xiang Li, Bhiksha Raj, Huaxiu Yao:
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition. 1346-1362 - Tal Schuster, Ádám D. Lelkes, Haitian Sun, Jai Gupta, Jonathan Berant, William W. Cohen, Donald Metzler:
SEMQA: Semi-Extractive Multi-Source Question Answering. 1363-1381 - Hao Lang, Fei Huang, Yongbin Li:
Fine-Tuning Language Models with Reward Learning on Policy. 1382-1392 - Robert Pugh, Francis M. Tyers:
A Universal Dependencies Treebank for Highland Puebla Nahuatl. 1393-1403 - Haryo Akbarianto Wibowo, Erland Hilman Fuadi, Made Nindyatama Nityasya, Radityo Eko Prasojo, Alham Fikri Aji:
COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances. 1404-1422 - Xiusi Chen, Hongzhi Wen, Sreyashi Nag, Chen Luo, Qingyu Yin, Ruirui Li, Zheng Li, Wei Wang:
IterAlign: Iterative Constitutional Alignment of Large Language Models. 1423-1433 - Chia-Hsuan Lee, Hao Cheng, Mari Ostendorf:
OrchestraLLM: Efficient Orchestration of Language Models for Dialogue State Tracking. 1434-1445 - Marco Valentino, Jordan Meadows, Lan Zhang, André Freitas:
Multi-Operational Mathematical Derivations in Latent Space. 1446-1458 - Chenglei Si, Navita Goyal, Tongshuang Wu, Chen Zhao, Shi Feng, Hal Daumé III, Jordan L. Boyd-Graber:
Large Language Models Help Humans Verify Truthfulness - Except When They Are Convincingly Wrong. 1459-1474 - Brendon Boldt, David R. Mortensen:
XferBench: a Data-Driven Benchmark for Emergent Language. 1475-1489 - Se-eun Yoon, Zhankui He, Jessica Maria Echterhoff, Julian J. McAuley:
Evaluating Large Language Models as Generative User Simulators for Conversational Recommendation. 1490-1504 - Jordan Meadows, Marco Valentino, Damien Teney, André Freitas:
A Symbolic Framework for Evaluating Mathematical Reasoning and Generalisation with Transformers. 1505-1523 - David Chanin, Anthony Hunter, Oana-Maria Camburu:
Identifying Linear Relational Concepts in Large Language Models. 1524-1535 - Venelin Kovatchev, Matthew Lease:
Benchmark Transparency: Measuring the Impact of Data on Evaluation. 1536-1551 - Jillian Fisher, Ximing Lu, Jaehun Jung, Liwei Jiang, Zaïd Harchaoui, Yejin Choi:
JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models. 1552-1581 - Zhenyu He, Zexuan Zhong, Tianle Cai, Jason D. Lee, Di He:
REST: Retrieval-Based Speculative Decoding. 1582-1595 - Sihao Chen, Hongming Zhang, Tong Chen, Ben Zhou, Wenhao Yu, Dian Yu, Baolin Peng, Hongwei Wang, Dan Roth, Dong Yu:
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations. 1596-1609 - Mobashir Sadat, Cornelia Caragea:
MSciNLI: A Diverse Benchmark for Scientific Natural Language Inference. 1610-1629 - Bohan Zhang, Yixin Wang, Paramveer Dhillon:
Causal Inference for Human-Language Model Collaboration. 1630-1647 - Zezhong Wang, Fangkai Yang, Lu Wang, Pu Zhao, Hongru Wang, Liang Chen, Qingwei Lin, Kam-Fai Wong:
SELF-GUARD: Empower the LLM to Safeguard Itself. 1648-1668 - Jinpeng Li, Hang Yu, Xiangfeng Luo, Qian Liu:
COSIGN: Contextual Facts Guided Generation for Knowledge Graph Completion. 1669-1682 - Zhewei Sun, Qian Hu, Rahul Gupta, Richard S. Zemel, Yang Xu:
Toward Informal Language Processing: Knowledge of Slang in Large Language Models. 1683-1701 - Vivek Verma, Eve Fleisig, Nicholas Tomlin, Dan Klein:
Ghostbuster: Detecting Text Ghostwritten by Large Language Models. 1702-1717 - Jiahao Zhang, Haiyang Zhang, Dongmei Zhang, Yong Liu, Shen Huang:
End-to-End Beam Retrieval for Multi-Hop Question Answering. 1718-1731 - Binghao Tang, Boda Lin, Haolong Yan, Si Li:
Leveraging Generative Large Language Models with Visual Instruction and Demonstration Retrieval for Multimodal Sarcasm Detection. 1732-1742 - Xiaojun Kuang, C. L. Philip Chen, Shuzhen Li, Tong Zhang:
Multi-Scale Prompt Memory-Augmented Model for Black-Box Scenarios. 1743-1757 - Chenming Tang, Fanyi Qu, Yunfang Wu:
Ungrammatical-syntax-based In-context Example Selection for Grammatical Error Correction. 1758-1770 - Akari Asai, Sneha Kudugunta, Xinyan Yu, Terra Blevins, Hila Gonen, Machel Reid, Yulia Tsvetkov, Sebastian Ruder, Hannaneh Hajishirzi:
BUFFET: Benchmarking Large Language Models for Few-shot Cross-lingual Transfer. 1771-1800 - Yanhe Fu, Yanan Cao, Qingyue Wang, Yi Liu:
TISE: A Tripartite In-context Selection Method for Event Argument Extraction. 1801-1818 - Zhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas, Yoon Kim:
Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks. 1819-1862 - Yucheng Wang, Bowen Yu, Yilin Liu, Shudong Lu:
TRUE-UIE: Two Universal Relations Unify Information Extraction Tasks. 1863-1876 - Zifeng Ding, Heling Cai, Jingpei Wu, Yunpu Ma, Ruotong Liao, Bo Xiong, Volker Tresp:
zrLLM: Zero-Shot Relational Learning on Temporal Knowledge Graphs with Large Language Models. 1877-1895 - Jielin Qiu, Mengdi Xu, William Han, Seungwhan Moon, Ding Zhao:
Embodied Executable Policy Learning with Language-based Scene Summarization. 1896-1913 - Yuqing Wang, Yun Zhao:
Metacognitive Prompting Improves Understanding in Large Language Models. 1914-1926 - Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao:
MART: Improving LLM Safety with Multi-round Automatic Red-Teaming. 1927-1937 - Young-Jun Lee, Byungsoo Ko, Han-Gyu Kim, Jonghwan Hyeon, Ho-Jin Choi:
DialogCC: An Automated Pipeline for Creating High-Quality Multi-Modal Dialogue Dataset. 1938-1963 - Keming Lu, Hongyi Yuan, Runji Lin, Junyang Lin, Zheng Yuan, Chang Zhou, Jingren Zhou:
Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models. 1964-1974 - Jiarui Liu, Wenkai Li, Zhijing Jin, Mona T. Diab:
Automatic Generation of Model and Data Cards: A Step Towards Responsible AI. 1975-1997 - Chen Liu, Jonas Pfeiffer, Ivan Vulic, Iryna Gurevych:
FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing. 1998-2015 - Chen Liu, Fajri Koto, Timothy Baldwin, Iryna Gurevych:
Are Multilingual LLMs Culturally-Diverse Reasoners? An Investigation into Multicultural Proverbs and Sayings. 2016-2039 - Shir Lissak, Nitay Calderon, Geva Shenkman, Yaakov Ophir, Eyal Fruchter, Anat Brunstein Klomek, Roi Reichart:
The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional Supporters for Queer Youth. 2040-2079 - Jianli Zhao, Changhao Xu, Bin Jiang:
IPED: An Implicit Perspective for Relational Triple Extraction based on Diffusion Model. 2080-2092 - Vishvak Murahari, Ameet Deshpande, Peter Clark, Tanmay Rajpurohit, Ashish Sabharwal, Karthik Narasimhan, Ashwin Kalyan:
QualEval: Qualitative Evaluation for Model Improvement. 2093-2111 - Kehuan Yan, Peichao Lai, Yilei Wang:
Quantum-inspired Language Model with Lindblad Master Equation and Interference Measurement for Sentiment Analysis. 2112-2121 - Dongsheng Zhu, Daniel Tang, Weidong Han, Jinghui Lu, Yukun Zhao, Guoliang Xing, Junfeng Wang, Dawei Yin:
VisLingInstruct: Elevating Zero-Shot Learning in Multi-Modal Language Models with Autonomous Instruction Optimization. 2122-2135 - Peng Ding, Jun Kuang, Dan Ma, Xuezhi Cao, Yunsen Xian, Jiajun Chen, Shujian Huang:
A Wolf in Sheep's Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily. 2136-2153 - Yuhan Liu, Shangbin Feng, Xiaochuang Han, Vidhisha Balachandran, Chan Young Park, Sachin Kumar, Yulia Tsvetkov:
P³Sum: Preserving Author's Perspective in News Summarization with Diffusion Language Models. 2154-2173 - Rose E. Wang, Qingyang Zhang, Carly Robinson, Susanna Loeb, Dorottya Demszky:
Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistakes. 2174-2199 - Dongqi Pu, Vera Demberg:
RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive Summarization. 2200-2220 - Yao Lu, Jiayi Wang, Raphael Tang, Sebastian Riedel, Pontus Stenetorp:
Strings from the Library of Babel: Random Sampling as a Strong Baseline for Prompt Optimisation. 2221-2231 - Jinhao Duan, Shiqi Wang, James Diffenderfer, Lichao Sun, Tianlong Chen, Bhavya Kailkhura, Kaidi Xu:
ReTA: Recursively Thinking Ahead to Improve the Strategic Reasoning of Large Language Models. 2232-2246 - Payam Karisani, Heng Ji:
Fact Checking Beyond Training Set. 2247-2261 - Anubha Kabra, Sanketh Rangreji, Yash Mathur, Aman Madaan, Emmy Liu, Graham Neubig:
Program-Aided Reasoners (Better) Know What They Know. 2262-2278 - Eve Fleisig, Su Lin Blodgett, Dan Klein, Zeerak Talat:
The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels. 2279-2292 - Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor:
Principles from Clinical Research for NLP Model Generalization. 2293-2309 - Naomi Saphra, Eve Fleisig, Kyunghyun Cho, Adam Lopez:
First Tragedy, then Parse: History Repeats Itself in the New Era of Large Language Models. 2310-2326 - Raphael Tang, Xinyu Crystina Zhang, Xueguang Ma, Jimmy Lin, Ferhan Ture:
Found in the Middle: Permutation Self-Consistency Improves Listwise Ranking in Large Language Models. 2327-2340 - Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu:
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning. 2341-2369 - Jerry Junyang Cheung, Yuchen Zhuang, Yinghao Li, Pranav Shetty, Wantian Zhao, Sanjeev Grampurohit, Rampi Ramprasad, Chao Zhang:
POLYIE: A Dataset of Information Extraction from Polymer Material Scientific Literature. 2370-2385 - Kai Zhang, Yangyang Kang, Fubang Zhao, Xiaozhong Liu:
LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination. 2386-2398 - Jacob Parnell, Inigo Jauregi Unanue, Massimo Piccardi:
SumTra: A Differentiable Pipeline for Few-Shot Cross-Lingual Summarization. 2399-2415 - Hanseok Oh, Haebin Shin, Miyoung Ko, Hyunji Lee, Minjoon Seo:
KTRL+F: Knowledge-Augmented In-Document Search. 2416-2436 - Hyunji Lee, Se June Joo, Chaeeun Kim, Joel Jang, Doyoung Kim, Kyoung-Woon On, Minjoon Seo:
How Well Do Large Language Models Truly Ground? 2437-2465 - Vasudha Varadarajan, Sverker Sikström, Oscar N. E. Kjell, H. Andrew Schwartz:
ALBA: Adaptive Language-Based Assessments for Mental Health. 2466-2478 - Wei Zhou, Mohsen Mesgar, Heike Adel, Annemarie Friedrich:
FREB-TQA: A Fine-Grained Robustness Evaluation Benchmark for Table Question Answering. 2479-2497 - Pengyue Jia, Yiding Liu, Xiangyu Zhao, Xiaopeng Li, Changying Hao, Shuaiqiang Wang, Dawei Yin:
MILL: Mutual Verification with Large Language Models for Zero-Shot Query Expansion. 2498-2518 - Yotam Perlitz, Elron Bandel, Ariel Gera, Ofir Arviv, Liat Ein-Dor, Eyal Shnarch, Noam Slonim, Michal Shmueli-Scheuer, Leshem Choshen:
Efficient Benchmarking (of Language Models). 2519-2536 - Dana Arad, Hadas Orgad, Yonatan Belinkov:
ReFACT: Updating Text-to-Image Models by Editing the Text Encoder. 2537-2558 - V. S. D. S. Mahesh Akavarapu, Arnab Bhattacharya:
A Likelihood Ratio Test of Genetic Relationship among Languages. 2559-2570 - Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, Bowen Zhou:
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning. 2571-2597 - Sanchit Ahuja, Divyanshu Aggarwal, Varun Gumma, Ishaan Watts, Ashutosh Sathe, Millicent Ochieng, Rishav Hada, Prachi Jain, Mohamed Ahmed, Kalika Bali, Sunayana Sitaram:
MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks. 2598-2637 - Zihan Qiu, Zeyu Huang, Jie Fu:
Unlocking Emergent Modularity in Large Language Models. 2638-2660 - Maja Stahl, Nadine Michel, Sebastian Kilsbach, Julian Schmidtke, Sara Rezat, Henning Wachsmuth:
A School Student Essay Corpus for Analyzing Interactions of Argumentative Structure and Quality. 2661-2674 - Katrin Erk, Marianna Apidianaki:
Adjusting Interpretable Dimensions in Embedding Space with Human Judgments. 2675-2686 - You Zuo, Kim Gerdes, Éric de la Clergerie, Benoît Sagot:
PatentEval: Understanding Errors in Patent Generation. 2687-2710 - Sai Koneru, Miriam Exel, Matthias Huck, Jan Niehues:
Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing. 2711-2725 - Kaidi Jia, Rongsheng Li:
Metaphor Detection with Context Enhancement and Curriculum Learning. 2726-2737 - Wei Liu, Stephen Wan, Michael Strube:
What Causes the Failure of Explicit to Implicit Discourse Relation Recognition? 2738-2753 - Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan S. Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Karen Livescu, Shinji Watanabe:
UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions. 2754-2774 - Lingbo Mo, Boshi Wang, Muhao Chen, Huan Sun:
How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities. 2775-2792 - Yue Zhou, Yada Zhu, Diego Antognini, Yoon Kim, Yang Zhang:
Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models. 2793-2804 - Pengcheng Jiang, Cao Xiao, Zifeng Wang, Parminder Bhatia, Jimeng Sun, Jiawei Han:
TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale. 2805-2819 - Pengcheng Jiang, Jiacheng Lin, Zifeng Wang, Jimeng Sun, Jiawei Han:
GenRES: Rethinking Evaluation for Generative Relation Extraction in the Era of Large Language Models. 2820-2837 - Andrés Lou, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Víctor M. Sánchez-Cartagena:
Curated Datasets and Neural Models for Machine Translation of Informal Registers between Mayan and Spanish Vernaculars. 2838-2850 - Zoey Liu, Bonnie J. Dorr:
The Effect of Data Partitioning Strategy on Model Generalizability: A Case Study of Morphological Segmentation. 2851-2864 - Debasmita Bhattacharya, Siying Ding, Alayna Nguyen, Julia Hirschberg:
Measuring Entrainment in Spontaneous Code-switched Speech. 2865-2876 - Zacchary Sadeddine, Juri Opitz, Fabian M. Suchanek:
A Survey of Meaning Representations - From Theory to Practical Utility. 2877-2892 - Haozhe Zhao, Zefan Cai, Shuzheng Si, Liang Chen, Yufeng He, Kaikai An, Baobao Chang:
Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation. 2893-2907 - Arkil Patel, Siva Reddy, Dzmitry Bahdanau, Pradeep Dasigi:
Evaluating In-Context Learning of Libraries for Code Generation. 2908-2926 - Tingyu Qu, Tinne Tuytelaars, Marie-Francine Moens:
Visually-Aware Context Modeling for News Image Captioning. 2927-2943 - Athul Paul Jacob, Gabriele Farina, Jacob Andreas:
Regularized Conventions: Equilibrium Computation as a Model of Pragmatic Reasoning. 2944-2955 - Chau Pham, Alexander Miserlis Hoyle, Simeng Sun, Philip Resnik, Mohit Iyyer:
TopicGPT: A Prompt-based Topic Modeling Framework. 2956-2984 - Jiazhao Li, Yijin Yang, Zhuofeng Wu, V. G. Vinod Vydiswaran, Chaowei Xiao:
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger. 2985-3004 - Naitian Zhou, David Jurgens, David Bamman:
Social Meme-ing: Measuring Linguistic Variation in Memes. 3005-3024 - Chaitanya Malaviya, Subin Lee, Sihao Chen, Elizabeth Sieber, Mark Yatskar, Dan Roth:
ExpertQA: Expert-Curated Questions and Attributed Answers. 3025-3045 - Chaitanya Malaviya, Subin Lee, Dan Roth, Mark Yatskar:
What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception. 3046-3065 - Weiyan Shi, Emily Dinan, Kurt Shuster, Jason Weston, Jing Xu:
When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels. 3066-3082 - Nathaniel R. Robinson, Raj Dabre, Ammon Shurtz, Rasul Dent, Onenamiyi Onesi, Claire Bizon Monroc, Loïc Grobol, Hasan Muhammad, Ashi Garg, Naome A. Etori, Vijay Murari Tiyyala, Olanrewaju Samuel, Matthew Dean Stutzman, Bismarck Bamfo Odoom, Sanjeev Khudanpur, Stephen D. Richardson, Kenton Murray:
Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole Languages. 3083-3110 - Jiashu Xu, Mingyu Derek Ma, Fei Wang, Chaowei Xiao, Muhao Chen:
Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models. 3111-3126 - Jiamin Yang, David Jurgens:
Modeling Empathetic Alignment in Conversation. 3127-3148 - Dhiman Goswami, Sharanya Thilagan, Kai North, Shervin Malmasi, Marcos Zampieri:
Native Language Identification in Texts: A Survey. 3149-3160 - Yifan Yang, Jiajun Zhou, Ngai Wong, Zheng Zhang:
LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models. 3161-3176 - Chancharik Mitra, Abrar Anwar, Rodolfo Corona, Dan Klein, Trevor Darrell, Jesse Thomason:
Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding. 3177-3189 - Ting-Yun Chang, Jesse Thomason, Robin Jia:
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks. 3190-3211 - Tianrong Zhang, Zhaohan Xi, Ting Wang, Prasenjit Mitra, Jinghui Chen:
PromptFix: Few-shot Backdoor Removal via Adversarial Prompt Tuning. 3212-3225 - Zhixue Zhao, Nikolaos Aletras:
Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models. 3226-3244 - Shayne Longpre, Gregory Yauney, Emily Reif, Katherine Lee, Adam Roberts, Barret Zoph, Denny Zhou, Jason Wei, Kevin Robinson, David Mimno, Daphne Ippolito:
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity. 3245-3276 - Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, Muhao Chen:
Instructional Fingerprinting of Large Language Models. 3277-3306 - Alireza Salkhordeh Ziabari, Ali Omrani, Parsa Hejabi, Preni Golazizian, Brendan Kennedy, Payam Piray, Morteza Dehghani:
Reinforced Multiple Instance Selection for Speaker Attribute Prediction. 3307-3321 - Shikhar Tuli, Chi-Heng Lin, Yen-Chang Hsu, Niraj K. Jha, Yilin Shen, Hongxia Jin:
DynaMo: Accelerating Language Model Inference with Dynamic Multi-Token Sampling. 3322-3345 - Haochen Liu, Song Wang, Chen Chen, Jundong Li:
Few-shot Knowledge Graph Relational Reasoning via Subgraph Adaptation. 3346-3356 - Chen Ling, Xujiang Zhao, Xuchao Zhang, Wei Cheng, Yanchi Liu, Yiyou Sun, Mika Oishi, Takao Osaki, Katsushi Matsuda, Jie Ji, Guangji Bai, Liang Zhao, Haifeng Chen:
Uncertainty Quantification for In-Context Learning of Large Language Models. 3357-3370 - Zhilin Wang, Yi Dong, Jiaqi Zeng, Virginia Adams, Makesh Narsimhan Sreedhar, Daniel Egert, Olivier Delalleau, Jane Polak Scowcroft, Neel Kant, Aidan Swope, Oleksii Kuchaiev:
HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM. 3371-3384 - Dawei Zhu, Sony Trenous, Xiaoyu Shen, Dietrich Klakow, Bill Byrne, Eva Hasler:
A Preference-driven Paradigm for Enhanced Translation with Large Language Models. 3385-3403 - Yusen Zhang, Nan Zhang, Yixin Liu, Alexander R. Fabbri, Junru Liu, Ryo Kamoi, Xiaoxin Lu, Caiming Xiong, Jieyu Zhao, Dragomir Radev, Kathleen R. McKeown, Rui Zhang:
Fair Abstractive Summarization of Diverse Perspectives. 3404-3426 - Anthony Meng Huat Tiong, Junqi Zhao, Boyang Li, Junnan Li, Steven C. H. Hoi, Caiming Xiong:
What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases. 3427-3454 - Nicholas Lourie, Kyunghyun Cho, He He:
Show Your Work with Confidence: Confidence Bands for Tuning Curves. 3455-3472 - Vinodkumar Prabhakaran, Christopher Homan, Lora Aroyo, Aida Mostafazadeh Davani, Alicia Parrish, Alex S. Taylor, Mark Diaz, Ding Wang, Gregory Serapio-García:
GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives. 3473-3492 - Yidan Sun, Qin Chao, Boyang Li:
Event Causality Is Key to Computational Story Understanding. 3493-3511 - Yoichi Ishibashi, Sho Yokoi, Katsuhito Sudoh, Satoshi Nakamura:
Subspace Representations for Soft Set Operations and Sentence Similarities. 3512-3524 - Yuan Zhuang, Tianyu Jiang, Ellen Riloff:
My Heart Skipped a Beat! Recognizing Expressions of Embodied Emotion in Natural Language. 3525-3537 - Bill Cai, Clarence Boon Liang Ng, Daniel Liang, Shelvia Hotama:
Low-Cost Generation and Evaluation of Dictionary Example Sentences. 3538-3549 - Shuofei Qiao, Honghao Gui, Chengfei Lv, Qianghuai Jia, Huajun Chen, Ningyu Zhang:
Making Language Models Better Tool Learners with Execution Feedback. 3550-3568 - Jifan Chen, Grace Kim, Aniruddh Sriram, Greg Durrett, Eunsol Choi:
Complex Claim Verification with Evidence Retrieved in the Wild. 3569-3587 - Zehui Wu, Ziwei Gong, Jaywon Koo, Julia Hirschberg:
Multimodal Multi-loss Fusion Network for Sentiment Analysis. 3588-3602 - Yanchen Liu, Srishti Gautam, Jiaqi Ma, Himabindu Lakkaraju:
Confronting LLMs with Traditional ML: Rethinking the Fairness of Large Language Models in Tabular Classifications. 3603-3620 - Meghdut Sengupta, Roxanne El Baff, Milad Alshomary, Henning Wachsmuth:
Analyzing the Use of Metaphors in News Editorials for Political Framing. 3621-3631 - Thanh-Thien Le, Viet Dao, Linh Nguyen, Thi-Nhung Nguyen, Linh Ngo, Thien Nguyen:
SharpSeq: Empowering Continual Event Detection through Sharpness-Aware Sequential-task Learning. 3632-3644 - Stephan Linzbach, Dimitar Dimitrov, Laura Kallmeyer, Kilian Evang, Hajira Jabeen, Stefan Dietze:
Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models. 3645-3655 - Ava Spataru, Eric Hambro, Elena Voita, Nicola Cancedda:
Know When To Stop: A Study of Semantic Drift in Text Generation. 3656-3671 - Kraig Tou, Zijun Sun:
Curriculum Masking in Vision-Language Pretraining to Maximize Cross Modal Interaction. 3672-3688 - Cristina España-Bonet, Alberto Barrón-Cedeño:
Elote, Choclo and Mazorca: on the Varieties of Spanish. 3689-3711 - Chonghua Wang, Haodong Duan, Songyang Zhang, Dahua Lin, Kai Chen:
Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks. 3712-3724 - Regina Ofori-Boateng, Magaly Aceves-Martins, Nirmalie Wiratunga, Carlos Francisco Moreno-García:
A Zero-Shot Monolingual Dual Stage Information Retrieval System for Spanish Biomedical Systematic Literature Reviews. 3725-3736 - Siyuan Huang, Yongping Xiong, Guibin Wu:
LayoutPointer: A Spatial-Context Adaptive Pointer Network for Visual Information Extraction. 3737-3748 - Domenic Rosati, Robie Gonzales, Jinkun Chen, Xuemin Yu, Yahya Kayani, Frank Rudzicz, Hassan Sajjad:
Long-form evaluation of model editing. 3749-3780 - Zhijing Jin, Yuen Chen, Fernando Gonzalez Adauto, Jiarui Liu, Jiayi Zhang, Julian Michael, Bernhard Schölkopf, Mona T. Diab:
Analyzing the Role of Semantic Representations in the Era of Large Language Models. 3781-3798 - Shuo Li, Sangdon Park, Insup Lee, Osbert Bastani:
TRAQ: Trustworthy Retrieval Augmented Question Answering via Conformal Prediction. 3799-3821 - Xinpei Zhao, Jingyuan Sun, Shaonan Wang, Jing Ye, Xiaohan Zhang, Chengqing Zong:
MapGuide: A Simple yet Effective Method to Reconstruct Continuous Language from Brain Activities. 3822-3832 - Monica Munnangi, Sergey Feldman, Byron C. Wallace, Silvio Amir, Tom Hope, Aakanksha Naik:
On-the-fly Definition Augmentation of LLMs for Biomedical NER. 3833-3854 - Bryan Li, Samar Haider, Chris Callison-Burch:
This Land is {Your, My} Land: Evaluating Geopolitical Bias in Language Models through Territorial Disputes. 3855-3871 - Xingwei Tan, Yuxiang Zhou, Gabriele Pergola, Yulan He:
Set-Aligning Framework for Auto-Regressive Event Temporal Graph Generation. 3872-3892 - Shujian Zhang, Lemeng Wu, Chengyue Gong, Xingchao Liu:
LanguageFlow: Advancing Diffusion Language Generation with Probabilistic Flows. 3893-3905 - Nilay Patel, Shivashankar Subramanian, Siddhant Garg, Pratyay Banerjee, Amita Misra:
Towards Improved Multi-Source Attribution for Long-Form Answer Generation. 3906-3919 - Aldo G. Carranza, Rezsa Farahani, Natalia Ponomareva, Alexey Kurakin, Matthew Jagielski, Milad Nasr:
Synthetic Query Generation for Privacy-Preserving Deep Retrieval Systems using Differentially Private Language Models. 3920-3930 - Abhijnan Nath, Shadi Manafi Avari, Avyakta Chelle, Nikhil Krishnaswamy:
Okay, Let's Do This! Modeling Event Coreference with Generated Rationales and Knowledge Distillation. 3931-3946 - Garima Agrawal, Tharindu Kumarage, Zeyad Alghamdi, Huan Liu:
Can Knowledge Graphs Reduce Hallucinations in LLMs? : A Survey. 3947-3960 - Brian D. Ondov, Kush Attal, Dina Demner-Fushman:
Pedagogically Aligned Objectives Create Reliable Automatic Cloze Tests. 3961-3972 - Kazuma Hashimoto, Karthik Raman, Michael Bendersky:
Take One Step at a Time to Know Incremental Utility of Demonstration: An Analysis on Reranking for Few-Shot In-Context Learning. 3973-3990 - Chi Han, Qifan Wang, Hao Peng, Wenhan Xiong, Yu Chen, Heng Ji, Sinong Wang:
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models. 3991-4008 - Albert Yu Sun, Varun Nair, Elliot Schumacher, Anitha Kannan:
CONSCENDI: A Contrastive and Scenario-Guided Distillation Approach to Guardrail Models for Virtual Assistants. 4009-4030 - KiYoon Yoo, Wonhyuk Ahn, Nojun Kwak:
Advancing Beyond Identification: Multi-bit Watermark for Large Language Models. 4031-4055 - Tingxuan Chen, Jun Long, Liu Yang, Zidong Wang, Yongheng Wang, Xiongnan Jin:
HTCCN: Temporal Causal Convolutional Networks with Hawkes Process for Extrapolation Reasoning in Temporal Knowledge Graphs. 4056-4066 - Abe Bohan Hou, Jingyu Zhang, Tianxing He, Yichen Wang, Yung-Sung Chuang, Hongwei Wang, Lingfeng Shen, Benjamin Van Durme, Daniel Khashabi, Yulia Tsvetkov:
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation. 4067-4082 - Iffat Maab, Edison Marrese-Taylor, Sebastian Padó, Yutaka Matsuo:
Media Bias Detection Across Families of Language Models. 4083-4098 - Aobo Kong, Shiwan Zhao, Hao Chen, Qicheng Li, Yong Qin, Ruiqi Sun, Xin Zhou, Enzhi Wang, Xiaohang Dong:
Better Zero-Shot Reasoning with Role-Play Prompting. 4099-4113 - Fenghua Cheng, Xue Li, Zi Huang, Jinxiang Wang, Sen Wang:
Event-Content-Oriented Dialogue Generation in Short Video. 4114-4124 - Yongrui Chen, Haiyun Jiang, Xinting Huang, Shuming Shi, Guilin Qi:
DoG-Instruct: Towards Premium Instruction-Tuning Data via Text-Grounded Instruction Wrapping. 4125-4135 - T. Y. S. S. Santosh, Vatsal Venkatkrishna, Saptarshi Ghosh, Matthias Grabmair:
Beyond Borders: Investigating Cross-Jurisdiction Transfer in Legal Case Summarization. 4136-4150 - Qifan Lu, Bhaskar Ramasubramanian, Radha Poovendran:
EDC: Effective and Efficient Dialog Comprehension For Dialog State Tracking. 4151-4165 - Sara Abedalmonem Mohammad Shatnawi, Sawsan Alqahtani, Hanan Aldarmaki:
Automatic Restoration of Diacritics for Speech Data Sets. 4166-4176 - Maite Heredia, Julen Etxaniz, Muitze Zulaika, Xabier Saralegi, Jeremy Barnes, Aitor Soroa:
XNLIeu: a dataset for cross-lingual NLI in Basque. 4177-4188 - Huazheng Wang, Jinming Wu, Haifeng Sun, Zixuan Xia, Daixuan Cheng, Jingyu Wang, Qi Qi, Jianxin Liao:
MDR: Model-Specific Demonstration Retrieval at Inference Time for In-Context Learning. 4189-4204 - Nayeon Lee, Chani Jung, Junho Myung, Jiho Jin, José Camacho-Collados, Juho Kim, Alice Oh:
Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis. 4205-4224 - Zheng Zhao, Emilio Monti, Jens Lehmann, Haytham Assem:
Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding. 4225-4237 - Hyewon Jang, Diego Frassinelli:
Generalizable Sarcasm Detection is Just Around the Corner, of Course! 4238-4249 - Gaofei Shen, Michaela Watkins, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupala:
Encoding of lexical tone in self-supervised models of spoken language. 4250-4261 - Francesco Periti, Nina Tahmasebi:
A Systematic Comparison of Contextualized Word Embeddings for Lexical Semantic Change. 4262-4282 - Xiancai Xu, Jia-Dong Zhang, Lei Xiong, Zhishang Liu:
iACOS: Advancing Implicit Sentiment Extraction with Informative and Adaptive Negative Examples. 4283-4293 - Joonwon Jang, Sanghwan Jang, Wonbin Kweon, Minjin Jeon, Hwanjo Yu:
Rectifying Demonstration Shortcut in In-Context Learning. 4294-4321 - Stephen Mayhew, Terra Blevins, Shuheng Liu, Marek Suppa, Hila Gonen, Joseph Marvin Imperial, Börje Karlsson, Peiqin Lin, Nikola Ljubesic, Lester James V. Miranda, Barbara Plank, Arij Riabi, Yuval Pinter:
Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark. 4322-4337 - Sunjae Kwon, Xun Wang, Weisong Liu, Emily Druhl, Minhee L. Sung, Joel I. Reisman, Wenjun Li, Robert D. Kerns, William Becker, Hong Yu:
ODD: A Benchmark Dataset for the Natural Language Processing Based Opioid Related Aberrant Behavior Detection. 4338-4359 - Xingmeng Zhao, Ali Niazi, Anthony Rios:
A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models. 4360-4374 - Paiheng Xu, Jing Liu, Nathan Jones, Julie Cohen, Wei Ai:
The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education. 4375-4389 - James Flemings, Meisam Razaviyayn, Murali Annavaram:
Differentially Private Next-Token Prediction of Large Language Models. 4390-4404 - Janis Goldzycher, Paul Röttger, Gerold Schneider:
Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset. 4405-4424 - Cícero Nogueira dos Santos, James Lee-Thorp, Isaac Noble, Chung-Ching Chang, David C. Uthus:
Memory Augmented Language Models through Mixture of Word Experts. 4425-4438 - Jaehun Jung, Peter West, Liwei Jiang, Faeze Brahman, Ximing Lu, Jillian Fisher, Taylor Sorensen, Yejin Choi:
Impossible Distillation for Paraphrasing and Summarization: How to Make High-quality Lemonade out of Small, Low-quality Model. 4439-4454 - Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yuan Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown:
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization. 4455-4480 - Xinliang Frederick Zhang, Winston Wu, Nicholas Beauchamp, Lu Wang:
MOKA: Moral Knowledge Augmentation for Moral Event Extraction. 4481-4502 - Paulo Cavalin, Pedro Henrique Domingues, Claudio S. Pinhanez, Julio Nogima:
Fixing Rogue Memorization in Many-to-One Multilingual Translators of Extremely-Low-Resource Languages by Rephrasing Training Samples. 4503-4514 - Jun Wang, Qiongkai Xu, Xuanli He, Benjamin I. P. Rubinstein, Trevor Cohn:
Backdoor Attacks on Multilingual Machine Translation. 4515-4534 - Yue Guo, Joseph Chee Chang, Maria Antoniak, Erin Bransom, Trevor Cohen, Lucy Lu Wang, Tal August:
Personalized Jargon Identification for Enhanced Interdisciplinary Communication. 4535-4550 - Kexin Huang, Xiangyang Liu, Qianyu Guo, Tianxiang Sun, Jiawei Sun, Yaru Wang, Zeyang Zhou, Yixu Wang, Yan Teng, Xipeng Qiu, Yingchun Wang, Dahua Lin:
Flames: Benchmarking Value Alignment of LLMs in Chinese. 4551-4591 - Mingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Tagyoung Chung, Wei Wang, Kai-Wei Chang, Nanyun Peng:
Mitigating Bias for Question Answering Models by Tracking Bias Influence. 4592-4610 - Seoyeon Kim, Minguk Kang, Dongwon Kim, Jaesik Park, Suha Kwak:
Extending CLIP's Image-Text Alignment to Referring Image Segmentation. 4611-4628 - Yu-Xiang Lin, Wei-Yun Ma:
Generating Attractive and Authentic Copywriting from Customer Reviews. 4629-4642 - Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma:
Effective Long-Context Scaling of Foundation Models. 4643-4663 - Zhujin Gao, Junliang Guo, Xu Tan, Yongxin Zhu, Fang Zhang, Jiang Bian, Linli Xu:
Empowering Diffusion Models on the Embedding Space for Text Generation. 4664-4683 - Yu Xia, Tong Yu, Zhankui He, Handong Zhao, Julian J. McAuley, Shuai Li:
Aligning as Debiasing: Causality-Aware Alignment via Reinforcement Learning with Interventional Feedback. 4684-4695 - Yixu Wang, Yan Teng, Kexin Huang, Chengqi Lyu, Songyang Zhang, Wenwei Zhang, Xingjun Ma, Yu-Gang Jiang, Yu Qiao, Yingchun Wang:
Fake Alignment: Are LLMs Really Aligned Well? 4696-4712 - Zhiming Mao, Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong:
Visually Guided Generative Text-Layout Pre-training for Document Intelligence. 4713-4730 - He Zhu, Junran Wu, Ruomei Liu, Yue Hou, Ze Yuan, Shangzhe Li, Yicheng Pan, Ke Xu:
HILL: Hierarchy-aware Information Lossless Contrastive Learning for Hierarchical Text Classification. 4731-4745 - Rao Ma, Adian Liusie, Mark J. F. Gales, Kate M. Knill:
Investigating the Emergent Audio Classification Ability of ASR Foundation Models. 4746-4760 - Aaron Mueller, Albert Webson, Jackson Petty, Tal Linzen:
In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax. 4761-4779 - Yongqi Wang, Ruofan Hu, Rongjie Huang, Zhiqing Hong, Ruiqi Li, Wenrui Liu, Fuming You, Tao Jin, Zhou Zhao:
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt. 4780-4794 - Dena F. Mujtaba, Nihar R. Mahapatra, Megan Arney, J. Scott Yaruss, Hope Gerlach-Houck, Caryn Herring, Jia Bin:
Lost in Transcription: Identifying and Quantifying the Accuracy Biases of Automatic Speech Recognition Systems Against Disfluent Speech. 4795-4809 - Chadi Helwe, Tom Calamai, Pierre-Henri Paris, Chloé Clavel, Fabian M. Suchanek:
MAFALDA: A Benchmark and Comprehensive Study of Fallacy Detection and Classification. 4810-4845 - Lihua Qian, Mingxuan Wang, Yang Liu, Hao Zhou:
Diffusion Glancing Transformer for Parallel Sequence-to-Sequence Learning. 4846-4862 - Kellen Cheng, Suma Bhat:
No Context Needed: Contextual Quandary In Idiomatic Reasoning With Pre-Trained Language Models. 4863-4880 - Xindi Wang, Robert E. Mercer, Frank Rudzicz:
Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation. 4881-4891 - Anemily Machina, Robert E. Mercer:
Anisotropy is Not Inherent to Transformers. 4892-4907 - Parker Riley, Daniel Deutsch, George F. Foster, Viresh Ratnakar, Ali Dabirmoghaddam, Markus Freitag:
Finding Replicable Human Evaluations via Stable Ranking Probability. 4908-4919 - Yuanpu Cao, Bochuan Cao, Jinghui Chen:
Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections. 4920-4935 - Sai Ashish Somayajula, Youwei Liang, Li Zhang, Abhishek Singh, Pengtao Xie:
Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts. 4936-4953 - Daeun Lee, Hyolim Jeon, Sejung Son, Chaewon Park, Ji Hyun An, Seungbae Kim, Jinyoung Han:
Detecting Bipolar Disorder from Misdiagnosed Major Depressive Disorder with Mood-Aware Multi-Task Learning. 4954-4970 - Ben Bogin, Shivanshu Gupta, Peter Clark, Ashish Sabharwal:
Leveraging Code to Improve In-Context Learning for Semantic Parsing. 4971-5012 - Micheal Abaho, Danushka Bollegala, Gary Leeming, Dan W. Joyce, Iain E. Buchan:
Improving Pre-trained Language Model Sensitivity via Mask Specific losses: A case study on Biomedical NER. 5013-5029 - Jack Merullo, Carsten Eickhoff, Ellie Pavlick:
Language Models Implement Simple Word2Vec-style Vector Arithmetic. 5030-5047 - Ruiyi Zhang, Rushi Qiang, Sai Ashish Somayajula, Pengtao Xie:
AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning. 5048-5060 - Haotian Xia, Zhengbang Yang, Yuqing Wang, Rhys Tracy, Yun Zhao, Dongdong Huang, Zezhi Chen, Yan Zhu, Yuan-Fang Wang, Weining Shen:
SportQA: A Benchmark for Sports Understanding in Large Language Models. 5061-5081 - Thinh Truong, Yulia Otmakhova, Karin Verspoor, Trevor Cohn, Timothy Baldwin:
Revisiting subword tokenization: A case study on affixal negation in large language models. 5082-5095 - Daniel Cabrera Lozoya, Alejandro Berazaluce, Juan Perches, Eloy Lúa, Mike Conway, Simon D'Alfonso:
Generating Mental Health Transcripts with SAPE (Spanish Adaptive Prompt Engineering). 5096-5113 - Patrick Foley, Matthew Wiesner, Bismarck Odoom, Leibny Paola García-Perera, Kenton Murray, Philipp Koehn:
Where are you from? Geolocating Speech and Applications to Language Identification. 5114-5126 - Xiao Yu, Baolin Peng, Michel Galley, Jianfeng Gao, Zhou Yu:
Teaching Language Models to Self-Improve through Interactive Demonstrations. 5127-5149 - Hossein Aboutalebi, Hwanjun Song, Yusheng Xie, Arshit Gupta, Lijia Sun, Hang Su, Igor Shalyminov, Nikolaos Pappas, Siffi Singh, Saab Mansour:
MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets. 5150-5167 - Ke Lin, Yiyang Luo, Zijian Zhang, Ping Luo:
Zero-shot Generative Linguistic Steganography. 5168-5182 - Cameron Jones, Ben Bergen:
Does GPT-4 pass the Turing test? 5183-5210 - Yuanyuan Lei, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Ruihong Huang, Dong Yu:
Polarity Calibration for Opinion Summarization. 5211-5224 - Yuanyuan Lei, Ruihong Huang:
Sentence-level Media Bias Analysis with Event Relation Graph. 5225-5238 - Yuanyuan Lei, Md Messal Monem Miah, Ayesha Qamar, Sai Ramana Reddy, Jonathan Tong, Haotian Xu, Ruihong Huang:
EMONA: Event-level Moral Opinions in News Articles. 5239-5251 - Beibei Gao, Yangsen Zhang, Ga Xiang, Yushan Jiang:
DLM: A Decoupled Learning Model for Long-tailed Polyphone Disambiguation in Mandarin. 5252-5262 - Bangzhao Shu, Lechen Zhang, Minje Choi, Lavinia Dunagan, Lajanugen Logeswaran, Moontae Lee, Dallas Card, David Jurgens:
You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments. 5263-5281 - Xiao Liu, Yansong Feng, Kai-Wei Chang:
CASA: Causality-driven Argument Sufficiency Assessment. 5282-5302 - Yufei Tian, Abhilasha Ravichander, Lianhui Qin, Ronan Le Bras, Raja Marjieh, Nanyun Peng, Yejin Choi, Thomas L. Griffiths, Faeze Brahman:
MacGyver: Are Large Language Models Creative Problem Solvers? 5303-5324 - Benedikt Ebing, Goran Glavas:
To Translate or Not to Translate: A Systematic Investigation of Translation-Based Cross-Lingual Transfer to Low-Resource Languages. 5325-5344 - Rui Wang, Hongru Wang, Fei Mi, Boyang Xue, Yi Chen, Kam-Fai Wong, Ruifeng Xu:
Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting. 5345-5363 - Urchade Zaratiana, Nadi Tomeh, Pierre Holat, Thierry Charnois:
GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer. 5364-5376 - Paul Röttger, Hannah Kirk, Bertie Vidgen, Giuseppe Attanasio, Federico Bianchi, Dirk Hovy:
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models. 5377-5400 - Yujin Kim, Jaehong Yoon, Seonghyeon Ye, Sangmin Bae, Namgyu Ho, Sung Ju Hwang, Se-Young Yun:
Carpe diem: On the Evaluation of World Knowledge in Lifelong Language Models. 5401-5415 - Minwoo Lee, Hyukhun Koh, Minsung Kim, Kyomin Jung:
Fine-grained Gender Control in Machine Translation with Large Language Models. 5416-5430 - Zefan Cai, Xin Zheng, Tianyu Liu, Haoran Meng, Jiaqi Han, Gang Yuan, Binghuai Lin, Baobao Chang, Yunbo Cao:
DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade. 5431-5452 - Xiaonan Li, Changtai Zhu, Linyang Li, Zhangyue Yin, Tianxiang Sun, Xipeng Qiu:
LLatrieval: LLM-Verified Retrieval for Verifiable Generation. 5453-5471 - Siyuan Chen, Meilin Wang, Minghao Lv, Zhiling Zhang, Juqianqian Juqianqian, Dejiyangla Dejiyangla, Yujia Peng, Kenny Q. Zhu, Mengyue Wu:
Mapping Long-term Causalities in Psychiatric Symptomatology and Life Events from Social Media. 5472-5487 - Averi Nowak, Francesco Piccinno, Yasemin Altun:
Multimodal Chart Retrieval: A Comparison of Text, Table and Image Based Approaches. 5488-5505 - Seiji Maekawa, Hayate Iso, Sairam Gurajada, Nikita Bhutani:
Retrieval Helps or Hurts? A Deeper Dive into the Efficacy of Retrieval Augmentation to Language Models. 5506-5521 - Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Ke Li, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
AudioChatLlama: Towards General-Purpose Speech Abilities for LLMs. 5522-5532 - Ashim Gupta, Rishanth Rajendhran, Nathan Stringham, Vivek Srikumar, Ana Marasovic:
Whispers of Doubt Amidst Echoes of Triumph in NLP Robustness. 5533-5590 - Semih Yagcioglu, Osman Batur Ince, Aykut Erdem, Erkut Erdem, Desmond Elliott, Deniz Yuret:
Sequential Compositional Generalization in Multimodal Models. 5591-5611 - Md Nayem Uddin, Enfa Rose George, Eduardo Blanco, Steven R. Corman:
Generating Uncontextualized and Contextualized Questions for Document-Level Event Argument Extraction. 5612-5627 - Zhenrui Yue, Huimin Zeng, Yimeng Lu, Lanyu Shang, Yang Zhang, Dong Wang:
Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation. 5628-5643 - Huimin Zeng, Zhenrui Yue, Dong Wang:
Open-Vocabulary Federated Learning with Multimodal Prototyping. 5644-5656 - Xiao Li, Yong Jiang, Shen Huang, Pengjun Xie, Gong Cheng, Fei Huang:
Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning. 5657-5667 - Siqi Shen, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Soujanya Poria, Rada Mihalcea:
Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense. 5668-5680 - Lajanugen Logeswaran, Sungryull Sohn, Yiwei Lyu, Anthony Z. Liu, Dong-Ki Kim, Dongsub Shim, Moontae Lee, Honglak Lee:
Code Models are Zero-shot Precondition Reasoners. 5681-5697 - Suyoung Kim, Jiyeon Hwang, Ho-Young Jung:
Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language Understanding. 5698-5711 - Yuan Wang, Xuyang Wu, Hsin-Tai Wu, Zhiqiang Tao, Yi Fang:
Do Large Language Models Rank Fairly? An Empirical Study on the Fairness of LLMs as Rankers. 5712-5724 - Md Mahadi Hasan Nahid, Davood Rafiei:
TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition. 5725-5737 - Tanmay Parekh, I-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, Nanyun Peng:
Contextual Label Projection for Cross-Lingual Structured Prediction. 5738-5757 - Tanmay Parekh, Anh Mac, Jiarui Yu, Yuxuan Dong, Syed Shahriar, Bonnie Liu, Eric Yang, Kuan-Hao Huang, Wei Wang, Nanyun Peng, Kai-Wei Chang:
Event Detection from Social Media for Epidemic Prediction. 5758-5783 - Song Jiang, Zahra Shakeri, Aaron Chan, Maziar Sanjabi, Hamed Firooz, Yinglong Xia, Bugra Akyildiz, Yizhou Sun, Jinchao Li, Qifan Wang, Asli Celikyilmaz:
RESPROMPT: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models. 5784-5809 - Thomas Bauwens, Pieter Delobelle:
BPE-knockout: Pruning Pre-existing BPE Tokenisers with Backwards-compatible Morphological Semi-supervision. 5810-5832 - Sheng Lu, Hendrik Schuff, Iryna Gurevych:
How are Prompts Different in Terms of Sensitivity? 5833-5856 - Guanghui Ye, Huan Zhao, Zixing Zhang, Xupeng Zha, Zhihua Jiang:
LSTDial: Enhancing Dialogue Generation via Long- and Short-Term Measurement Feedback. 5857-5871 - Kumar Shridhar, Koustuv Sinha, Andrew Cohen, Tianlu Wang, Ping Yu, Ramakanth Pasunuru, Mrinmaya Sachan, Jason Weston, Asli Celikyilmaz:
The ART of LLM Refinement: Ask, Refine, and Trust. 5872-5883 - Sungjun Lim, Yoonjung Choi, Sangha Kim:
Modularized Multilingual NMT with Fine-grained Interlingua. 5884-5899 - Oren Sultan, Yonatan Bitton, Ron Yosef, Dafna Shahaf:
ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies. 5900-5924 - Shuyang Cao, Lu Wang:
AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content. 5925-5941 - Kristina Gligoric, Myra Cheng, Lucia Zheng, Esin Durmus, Dan Jurafsky:
NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps. 5942-5959 - Enze Shi, Lei Ding, Linglong Kong, Bei Jiang:
Debiasing with Sufficient Projection: A General Theoretical Framework for Vector Representations. 5960-5975 - Jianfeng He, Hang Su, Jason Cai, Igor Shalyminov, Hwanjun Song, Saab Mansour:
Semi-Supervised Dialogue Abstractive Summarization via High-Quality Pseudolabel Selection. 5976-5996 - Jiayi Wang, David Ifeoluwa Adelani, Sweta Agrawal, Marek Masiak, Ricardo Rei, Eleftheria Briakou, Marine Carpuat, Xuanli He, Sofia Bourhim, Andiswa Bukula, Muhidin Mohamed, Temitayo Olatoye, Tosin P. Adewumi, Hamam Mokayed, Christine Mwase, Wangui Kimotho, Foutse Yuehgoh, Anuoluwapo Aremu, Jessica Ojo, Shamsuddeen Hassan Muhammad, Salomey Osei, Abdul-Hakeem Omotayo, Chiamaka Chukwuneke, Perez Ogayo, Oumaima Hourrane, Salma El Anigri, Lolwethu Ndolela, Thabiso Mangwana, Shafie Abdi Mohamed, Ayinde Hassan, Oluwabusayo Olufunke Awoyomi, Lama Alkhaled, Sana Sabah Al-Azzawi, Naome A. Etori, Millicent Ochieng, Clemencia Siro, Njoroge Kiragu, Eric Muchiri, Wangari Kimotho, Sakayo Toadoum Sari, Lyse Naomi Wamba Momo, Daud Abolade, Simbiat Ajao, Iyanuoluwa Shode, Ricky Macharm, Ruqayya Nasir Iro, Saheed S. Abdullahi, Stephen E. Moore, Bernard Opoku, Zainab Akinjobi, Afolabi Abeeb, Nnaemeka C. Obiefuna, Onyekachi Raphael Ogbu, Sam Ochieng', Verrah Otiende, Chinedu E. Mbonu, Yao Lu, Pontus Stenetorp:
AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages. 5997-6023 - Tianshu Zhang, Xiang Yue, Yifei Li, Huan Sun:
TableLlama: Towards Open Large Generalist Models for Tables. 6024-6044 -