Abstract
Common prompt-based neural machine translation methods for LLMs rely on discrete prompt words and fixed template styles, which hinder LLM fine-tuning and contextual feature extraction. In addition, the selection of prompt instances is a major factor affecting Prompt-NMT performance. We therefore propose a flexible prompting method. Specifically, we construct a dual-encoder-based soft prototype that combines spatial clustering with maximum-margin constraints to generate prompt instances. We also present a virtual template generation method that uses a pseudo-prompt encoder to adapt to the current translation episode and build a virtual prompt template, which alleviates the instance selection problem of in-context learning (ICL) and mitigates the rigidity of fixed templates. On the CCMT translation tasks, the BLEU scores of our model improve significantly over the baseline system, verifying the effectiveness of the proposed method.
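To make the idea of embedding-based prompt-instance selection concrete, the sketch below clusters candidate translation-memory pairs by sentence embedding and, from each cluster, keeps the candidate closest to the test source sentence. This is an illustrative assumption rather than the paper's released implementation: the LaBSE checkpoint name, the k-means clustering, and the function `select_prompt_examples` are placeholders standing in for the dual-encoder soft prototype with maximum-margin constraints described in the abstract.

```python
# Illustrative sketch (assumption, not the authors' code): pick diverse yet
# relevant in-context examples for prompt-based MT via sentence embeddings.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def select_prompt_examples(candidates, src_sentence, n_clusters=4):
    """candidates: list of (source, target) pairs, e.g. from a translation memory."""
    encoder = SentenceTransformer("sentence-transformers/LaBSE")
    cand_emb = encoder.encode([s for s, _ in candidates], normalize_embeddings=True)
    query_emb = encoder.encode(src_sentence, normalize_embeddings=True)

    # Spatial clustering: group candidates so the selected demonstrations are diverse.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(cand_emb)

    selected = []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        if len(idx) == 0:
            continue
        # Within each cluster, keep the candidate most similar to the query sentence.
        sims = cand_emb[idx] @ query_emb
        selected.append(candidates[int(idx[np.argmax(sims)])])
    return selected
```

Clustering encourages diverse demonstrations, while the within-cluster similarity to the query keeps each selected example relevant; the maximum-margin constraint in the paper would additionally push prototypes of different clusters apart.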
Notes
We improve LaBSE to obtain soft prototypes based on sentence embeddings.
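As a rough illustration of how a sentence-level soft prototype might be turned into a virtual prompt template, the sketch below maps a sentence embedding to a short sequence of virtual token embeddings that can be prepended to the LLM input. The class name `PseudoPromptEncoder`, the dimensions, and the prompt length are assumptions for illustration, not the paper's actual configuration.

```python
# Illustrative sketch (assumption): a pseudo-prompt encoder that turns a
# sentence embedding (e.g. from LaBSE) into virtual prompt vectors.
import torch
import torch.nn as nn

class PseudoPromptEncoder(nn.Module):
    def __init__(self, sent_dim=768, model_dim=1024, prompt_len=8):
        super().__init__()
        self.prompt_len = prompt_len
        self.model_dim = model_dim
        # Project the soft prototype into prompt_len virtual token embeddings.
        self.mlp = nn.Sequential(
            nn.Linear(sent_dim, model_dim),
            nn.Tanh(),
            nn.Linear(model_dim, prompt_len * model_dim),
        )

    def forward(self, sent_emb):                       # (batch, sent_dim)
        virtual = self.mlp(sent_emb)                   # (batch, prompt_len * model_dim)
        return virtual.view(-1, self.prompt_len, self.model_dim)

proto = torch.randn(2, 768)                 # stand-in for LaBSE sentence embeddings
virtual_prompt = PseudoPromptEncoder()(proto)   # shape: (2, 8, 1024)
```

Because the virtual prompt is generated per input, the template adapts to the current translation episode instead of relying on a hand-written, fixed discrete template.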
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Wu, N., Wu, W., Ji, Y., Liu, Y., Lu, M., Liu, N. (2024). ICLFP-NMT: Neural Machine Translation for ICL Flexible Prompt. In: Huang, DS., Si, Z., Zhang, Q. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science(), vol 14877. Springer, Singapore. https://doi.org/10.1007/978-981-97-5669-8_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5668-1
Online ISBN: 978-981-97-5669-8