Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

Cattan, Arie; Jacovi, Alon; Fabrikant, Alex; Herzig, Jonathan; Aharoni, Roee; Rashkin, Hannah; Marcus, Dror; Hassidim, Avinatan; Matias, Yossi; Szpektor, Idan; Caciularu, Avi

arXiv:2406.13632 (cs)

[Submitted on 19 Jun 2024 (v1), last revised 18 Oct 2024 (this version, v3)]

Abstract: Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in this scenario; however, naïvely adding ICL examples with long context introduces challenges, including substantial token overhead added for each few-shot example and context mismatch between the demonstrations and the target query. In this work, we propose to automatically generate few-shot examples for long-context QA tasks by recycling contexts. Specifically, given a long input context (1-3k tokens) and a query, we generate additional query-output pairs from the given context as few-shot examples, while introducing the context only once. This ensures that the demonstrations leverage the same context as the target query while adding only a small number of tokens to the prompt. We further enhance each demonstration by instructing the model to explicitly identify the relevant paragraphs before the answer, which improves performance while providing fine-grained attribution to the answer source. We apply our method to multiple LLMs and obtain substantial improvements (+16 absolute points on average across models) on various QA datasets with long context, especially when the answer lies in the middle of the context. Surprisingly, despite introducing only single-hop ICL examples, LLMs also successfully generalize to multi-hop long-context QA using our approach.
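The prompt layout described in the abstract can be illustrated with a minimal sketch. The field labels ("Context:", "Question:", "Relevant paragraphs:", "Answer:"), the `Demo` class, and the `build_prompt` helper are all hypothetical names chosen for illustration; they are not the paper's actual template or code. The sketch also assumes the demonstrations have already been generated from the context (in the paper this is done automatically), so they are supplied by hand here.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One single-hop demonstration derived from the shared context (hypothetical)."""
    question: str
    paragraph_ids: list[int]  # 1-based indices of the supporting paragraphs
    answer: str


def build_prompt(paragraphs: list[str], demos: list[Demo], query: str) -> str:
    """Place the context once, then the demonstrations, then the target query."""
    lines = ["Context:"]
    # Number the paragraphs so demonstrations and the answer can cite them.
    for i, text in enumerate(paragraphs, start=1):
        lines.append(f"[{i}] {text}")
    lines.append("")
    # Each demonstration names its supporting paragraphs before giving the answer.
    for demo in demos:
        ids = ", ".join(f"[{i}]" for i in demo.paragraph_ids)
        lines.append(f"Question: {demo.question}")
        lines.append(f"Relevant paragraphs: {ids}")
        lines.append(f"Answer: {demo.answer}")
        lines.append("")
    # The target query ends with an open slot so the model cites evidence first.
    lines.append(f"Question: {query}")
    lines.append("Relevant paragraphs:")
    return "\n".join(lines)


paragraphs = [
    "The Eiffel Tower was completed in 1889.",
    "It is located on the Champ de Mars in Paris.",
    "Gustave Eiffel's company designed and built the tower.",
]
demos = [Demo("When was the Eiffel Tower completed?", [1], "1889")]
prompt = build_prompt(paragraphs, demos, "Whose company built the tower?")
print(prompt)
```

Because the demonstrations reuse the context that is already in the prompt, each one costs only a question, a paragraph citation, and a short answer, rather than a full additional context.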

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2406.13632 [cs.CL]
	(or arXiv:2406.13632v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.13632

Submission history

From: Arie Cattan
[v1] Wed, 19 Jun 2024 15:28:29 UTC (9,015 KB)
[v2] Sun, 23 Jun 2024 07:19:22 UTC (9,016 KB)
[v3] Fri, 18 Oct 2024 09:07:53 UTC (10,192 KB)
