Abstract
In spoken Task-Oriented Dialogue (TOD) systems, the choice of the semantic representation used to describe users' requests is key to a smooth interaction: the system reasons over this representation, together with a database and its domain knowledge, to choose its next action. The course of the dialogue therefore depends on the information this representation conveys. While textual datasets provide fine-grained semantic representations, spoken dialogue datasets lag behind. This paper provides insights into the automatic enhancement of the semantic representations of spoken dialogue datasets. Our contributions are threefold: (1) assess the relevance of Large Language Model fine-tuning, (2) evaluate the knowledge captured by the produced annotations, and (3) highlight the implications of semi-automatic annotation.
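To make the stakes concrete, the sketch below contrasts a coarse annotation with a fine-grained one for the same spoken turn. The domain, slot names, and values are illustrative only, not taken from the paper's datasets:

```python
# Hypothetical example of the kind of semantic representation a TOD system
# reasons over. Slot names and values are illustrative only.
utterance = "I'd like a cheap Italian restaurant in the city centre for two."

# Coarse annotation, typical of many spoken dialogue datasets:
coarse = {"intent": "find_restaurant"}

# Fine-grained annotation, typical of textual TOD datasets:
fine_grained = {
    "intent": "find_restaurant",
    "slots": {
        "food": "italian",
        "pricerange": "cheap",
        "area": "centre",
        "people": "2",
    },
}

def build_query(frame):
    """The system can only constrain its database search with the slots
    present in the annotation; a coarse frame yields no constraints."""
    return dict(frame.get("slots", {}))

print(build_query(coarse))        # no usable constraints
print(build_query(fine_grained))  # four usable constraints
```

With the coarse frame the system cannot narrow its database query at all, which is why enriching spoken datasets with finer representations matters for the dialogue course.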
Notes
- 1.
- 2. In practice, the model is only fed the five previous turns to limit the number of tokens.
- 3. Obtained with the model at https://huggingface.co/thenlper/gte-large.
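The context truncation described in note 2 can be sketched as follows. The function and variable names are illustrative, not from the paper:

```python
# Sketch of the truncation in note 2: only the five most recent turns of the
# dialogue history are fed to the model, bounding the prompt's token count.
def truncate_history(turns, max_turns=5):
    """Keep only the last `max_turns` turns of the dialogue history."""
    return turns[-max_turns:]

history = [f"turn {i}" for i in range(1, 9)]  # an 8-turn dialogue
context = truncate_history(history)
print(context)  # ['turn 4', 'turn 5', 'turn 6', 'turn 7', 'turn 8']
```

A fixed turn window is a simple token budget; an alternative would be truncating by token count, but a turn-level cut keeps each retained turn intact.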
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Druart, L., Vielzeuf, V., Estève, Y. (2024). Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets. In: Nöth, E., Horák, A., Sojka, P. (eds) Text, Speech, and Dialogue. TSD 2024. Lecture Notes in Computer Science, vol 15049. Springer, Cham. https://doi.org/10.1007/978-3-031-70566-3_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70565-6
Online ISBN: 978-3-031-70566-3
eBook Packages: Computer Science (R0)