
Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets

  • Conference paper
  • Text, Speech, and Dialogue (TSD 2024)

Abstract

In spoken Task-Oriented Dialogue (TOD) systems, the choice of the semantic representation describing the users' requests is key to a smooth interaction. Indeed, the system uses this representation to reason over a database and its domain knowledge to choose its next action. The course of the dialogue thus depends on the information provided by this semantic representation. While textual datasets provide fine-grained semantic representations, spoken dialogue datasets lag behind. This paper provides insights into the automatic enhancement of the semantic representations of spoken dialogue datasets. Our contributions are threefold: (1) assessing the relevance of Large Language Model fine-tuning, (2) evaluating the knowledge captured by the produced annotations, and (3) highlighting the implications of semi-automatic annotation.
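To make the annotation setting concrete, here is a minimal sketch of how an annotation prompt for a spoken dialogue turn might be assembled, with the dialogue context truncated to the most recent turns (as in note 2 below). The function name, prompt wording, and frame format are illustrative assumptions, not the paper's actual setup.

```python
# Hypothetical sketch: building a prompt that asks an LLM to annotate a
# dialogue turn with a task-oriented semantic frame (intent + slots).
# All names and the prompt format are illustrative, not the paper's setup.

def build_annotation_prompt(history: list[str], turn: str, max_turns: int = 5) -> str:
    """Assemble an annotation prompt from the dialogue context and the
    current user turn, keeping only the most recent turns."""
    context = history[-max_turns:]  # truncate context to limit token count
    lines = ["Annotate the user's request as a semantic frame (intent + slots)."]
    lines += [f"Context: {t}" for t in context]
    lines.append(f"User turn: {turn}")
    lines.append("Frame:")
    return "\n".join(lines)

prompt = build_annotation_prompt(
    ["Hi, I need a hotel.", "Sure, which area?", "Somewhere central.",
     "Any price range?", "Cheap, please.", "Noted."],
    "And I also need the phone number.",
)
```

With a six-turn history and `max_turns=5`, the oldest turn is dropped before the prompt is built, bounding the number of tokens sent to the model regardless of dialogue length.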


Notes

  1. https://dstc11.dstc.community/.

  2. In practice, the model is fed only the 5 previous turns to limit the number of tokens.

  3. Obtained with the model at https://huggingface.co/thenlper/gte-large.
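Note 3 mentions sentence embeddings from gte-large; a common way to compare such embeddings is cosine similarity. The sketch below uses toy vectors standing in for real sentence embeddings; the function and vectors are illustrative assumptions.

```python
import math

def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors standing in for sentence embeddings (e.g. from gte-large).
ref = [0.2, 0.1, 0.7]
hyp = [0.2, 0.0, 0.7]
score = cosine_similarity(ref, hyp)  # close to 1.0 for near-identical vectors
```

In practice the vectors would come from encoding two annotations (or an annotation and a reference) with the embedding model, and the score would measure how semantically close they are.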


Author information

Correspondence to Lucas Druart.

Ethics declarations

Disclosure of Interests

The authors have no competing interests to declare that are relevant to the content of this article.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Druart, L., Vielzeuf, V., Estève, Y. (2024). Investigating Low-Cost LLM Annotation for Spoken Dialogue Understanding Datasets. In: Nöth, E., Horák, A., Sojka, P. (eds) Text, Speech, and Dialogue. TSD 2024. Lecture Notes in Computer Science, vol 15049. Springer, Cham. https://doi.org/10.1007/978-3-031-70566-3_18

  • DOI: https://doi.org/10.1007/978-3-031-70566-3_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70565-6

  • Online ISBN: 978-3-031-70566-3

  • eBook Packages: Computer Science, Computer Science (R0)
