Abstract
In this paper, we address the task of information extraction from meeting transcripts. Meeting documents are usually poorly structured and lack formatting and punctuation. In addition, the relevant information is distributed over multiple sentences. We experimentally investigate the usefulness of numerical statistics and topic modelling methods on a real dataset containing multi-party dialogue texts. Such information extraction can be used for different tasks, of which we consider two: contrasting thematically related but distinct meetings with each other, and contrasting meetings involving the same participants with those involving other participants. In addition to demonstrating the difference between counting and topic modelling results, we also evaluate our experiments with respect to the gold standards provided for the dataset.
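As a rough illustration of the counting side of this comparison, the sketch below assumes that the numerical statistics are TF-IDF-style term weights; the abstract does not state which statistics are used, so this is an assumption and not the authors' method. The function names `tfidf` and `top_terms` and the three toy transcripts are hypothetical. Terms shared by every meeting receive a weight of zero, so the highest-weighted terms are those that set one meeting apart from the others.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumption: lowercase whitespace tokenization and the plain
    tf * log(N / df) weighting; a real pipeline would likely differ.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weight, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Invented toy transcripts, for illustration only.
meetings = [
    "we need a simple design with few buttons for the device",
    "the device budget is too high so we drop the voice feature",
    "the survey says users want the device in bright colours",
]

for index, weight in enumerate(tfidf(meetings)):
    print(index, top_terms(weight))
```

A topic model would instead group co-occurring terms into latent topics, which is the contrast between counting and topic modelling that the abstract refers to.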
Notes
- 1. Tool example: http://www.vocapia.com.
- 2. Available here: http://groups.inf.ed.ac.uk/ami/download/.
- 3.
- 4.
- 5. For full results, we refer the readers to: https://github.com/pegahani/Event_detection/blob/master/result/result_4_4.txt.
- 6. For full results, we refer the readers to: https://github.com/pegahani/Event_detection/blob/master/result/result_4_block_scen.txt.
- 7. For more results you can visit: https://github.com/pegahani/Event_detection/blob/master/result/Topic_modeling_nmf_block_34_topics.txt.
Acknowledgement
This work is supported by the FUI 22 (REUs project) and the ANR (French National Research Agency) funded project NARECA ANR-13-CORD-0015.
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Alizadeh, P., Cellier, P., Charnois, T., Crémilleux, B., Zimmermann, A. (2023). An Experimental Approach for Information Extraction in Multi-party Dialogue Discourse. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2018. Lecture Notes in Computer Science, vol 13396. Springer, Cham. https://doi.org/10.1007/978-3-031-23793-5_16
DOI: https://doi.org/10.1007/978-3-031-23793-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23792-8
Online ISBN: 978-3-031-23793-5
eBook Packages: Computer Science (R0)