Link to original content: https://doi.org/10.1145/3592813.3592886
Assis: Online Semi-Automatic Dialog Annotation Tool | Proceedings of the XIX Brazilian Symposium on Information Systems
research-article

Assis: Online Semi-Automatic Dialog Annotation Tool

Published: 26 June 2023

Abstract

Context: Task-oriented conversational systems demand a high volume of data to understand human language. One of the major challenges of Natural Language Processing (NLP) is the lack of structured annotated data to improve and refine language models. As a result, institutions often generate or mine their own data and have to annotate it themselves.
Problem: Annotation is a time-consuming and costly process that is prone to errors caused by human fatigue, and it often becomes the blocking phase for smaller teams developing AI. Companies frequently report data scarcity and poor data quality when developing these systems.
Solution: This paper presents Assis, a modular, adaptable tool for semi-automatic annotation (manual and AI annotation). The tool automates and organizes the annotation of intentions and entities in task-oriented conversations. Our proposal combines components that facilitate the visual assimilation of the annotation process. Assis can embed language models that are continuously refined on previously annotated sentences.
IS theory: Assis was developed with Design Theory in mind, using its knowledge base to evaluate existing and proposed tools against the goal of facilitating annotation.
Method: We collected empirical results from real-life case studies, using a feedback form completed after use to measure satisfaction with both the annotation results and the user experience. These results were compared with the same study groups annotating without tools or with other software.
Results: During one of the case studies, the tool was used to annotate more than 800 messages, with user feedback reporting high satisfaction with the reduction in required time.
Contributions and Impact in the IS area: The tool innovates with its deployless architecture, modularity and adaptability, while introducing two new concepts for text annotation: dialogue topics and entity propagation.

Cited By

  • (2024) Identifying intentions in conversational tools: a systematic mapping. Proceedings of the 20th Brazilian Symposium on Information Systems (1-10). 10.1145/3658271.3658286. Online publication date: 20-May-2024

Published In

SBSI '23: Proceedings of the XIX Brazilian Symposium on Information Systems
May 2023
490 pages

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. NLP
  2. active learning
  3. annotation
  4. online
  5. tool

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SBSI '23

Acceptance Rates

Overall Acceptance Rate 181 of 557 submissions, 32%
