Abstract
Artificial intelligence has been applied to simulate many human activities in Quantitative Ethnography (QE). This paper evaluates the creation of an intelligent co-rater for coding qualitative (text) data in QE research. In this study, the computer agent's task is to help human researchers identify patterns by selectively sampling items that contain patterns of interest the researcher has yet to identify. The study compares the performance of an existing bidirectional LSTM model, bLSTM, a new nearest neighbor model, weNN, and a newly proposed combination of the two. It focuses on learning data collected from implementations of an epistemic game and the associated qualitative coding data, coded with regular expressions (regexes). The contributions of this paper include: 1) a newly proposed combination of bLSTM and weNN, referred to as bwInter, which performed best among the three models, with efficiency approximately 5.8 (lower recall band) to 10.3 (upper recall band) times that of random searching, compared with 4.8 (lower recall band) to 5.8 (upper recall band) for the existing bLSTM; 2) an examination of the effectiveness of bwInter at five different phases of automated classifier development, which showed that its performance relative to random searching improved from earlier to later phases; and 3) an investigation of performance across different qualitative codes, which showed that, while effectiveness varies from code to code, bwInter always performed significantly better than the other models, with a minimum efficiency 3.20 times that of random searching. Overall, this paper suggests that the newly identified model bwInter could be used to create highly effective intelligent co-raters that help identify missing text patterns when coding qualitative data in QE research.
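The abstract reports efficiency as a multiple of random searching but does not define the measure. As a minimal sketch of one plausible reading, the Python below assumes that efficiency is the hit rate among the k items a model ranks highest, divided by the base rate of target items in the whole data set (the expected hit rate of random sampling). The function name, parameters, and toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def sampling_efficiency(scores: Sequence[float],
                        is_target: Sequence[bool],
                        k: int) -> float:
    """Hit rate among the k top-scored items, divided by the base rate.

    ASSUMPTION: this ratio is only one plausible reading of "efficiency
    relative to random searching"; the abstract does not give a formula.
    """
    if len(scores) != len(is_target):
        raise ValueError("scores and is_target must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")

    # Expected hit rate when items are drawn uniformly at random.
    base_rate = sum(is_target) / len(is_target)
    if base_rate == 0:
        raise ValueError("no target items: the ratio is undefined")

    # Indices of the items ordered from highest to lowest model score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(is_target[i] for i in ranked[:k])
    return (hits / k) / base_rate


# Hypothetical toy data: 10 items, 2 of which contain a missed pattern.
toy_scores = [0.9, 0.8, 0.1, 0.2, 0.7, 0.3, 0.05, 0.4, 0.6, 0.15]
toy_targets = [True, False, False, False, True,
               False, False, False, False, False]

top2_efficiency = sampling_efficiency(toy_scores, toy_targets, k=2)
print(f"efficiency at k=2: {top2_efficiency:.2f}")
```

Under this reading, a value of 1.0 means the model's sample is no better than random, and a value of 5.8 would mean its sampled items contain target patterns 5.8 times as often as randomly drawn items.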
Acknowledgment
This work was funded in part by the National Science Foundation (DRL-2100320, DRL-2201723, DRL-2225240), the Wisconsin Alumni Research Foundation, and the Office of the Vice Chancellor for Research and Graduate Education at the University of Wisconsin-Madison. The opinions, findings, and conclusions do not reflect the views of the funding agencies, cooperating institutions, or other individuals.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cai, Z., Eagan, B., Williamson Shaffer, D. (2023). Negative Reversion: Toward Intelligent Co-raters for Coding Qualitative Data in Quantitative Ethnography. In: Arastoopour Irgens, G., Knight, S. (eds) Advances in Quantitative Ethnography. ICQE 2023. Communications in Computer and Information Science, vol 1895. Springer, Cham. https://doi.org/10.1007/978-3-031-47014-1_29
DOI: https://doi.org/10.1007/978-3-031-47014-1_29
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47013-4
Online ISBN: 978-3-031-47014-1
eBook Packages: Computer Science, Computer Science (R0)