Abstract
With the expansion of the digital sphere and the advancement of technology, cyberbullying has become increasingly common, especially among teenagers. In this work, we create a benchmark Hindi-English code-mixed corpus called BullySent, annotated with bully and sentiment labels, to investigate how sentiment label information helps to detect cyberbullying. For a large portion of India, these two languages are the primary means of communication, and language mixing is common in everyday speech. A multi-task framework called MT-BERT+VecMap, based on two different embedding schemes for the efficient representation of code-mixed data, has been developed. The proposed multi-task framework outperforms all the single-task baselines, with the highest accuracy values of 81.12 (±1.65)% and 77.46 (±0.99)% for the cyberbullying detection and sentiment analysis tasks, respectively.
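To illustrate the general idea of a multi-task set-up in which a sentiment task and a cyberbullying task share one input representation, a minimal sketch follows. It is not the MT-BERT+VecMap architecture. Everything here is an assumption made for illustration: the concatenation of two embedding vectors, the two linear task heads, the class counts (2 bully classes, 3 sentiment classes), the weighting parameter `alpha`, and all function names.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """One dense layer: one output score per row of `weights`."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def cross_entropy(probs, label):
    """Negative log-likelihood of the gold class index."""
    return -math.log(probs[label])


def multitask_loss(shared, bully_head, sentiment_head,
                   bully_label, sentiment_label, alpha=0.5):
    """Weighted sum of the two task losses over one shared representation.

    `shared` stands in for a sentence vector built from two embedding
    schemes (hypothetical: simple concatenation). Each head is a
    (weights, bias) pair. `alpha` balances the two tasks (assumed value).
    """
    bully_probs = softmax(linear(bully_head[0], bully_head[1], shared))
    sent_probs = softmax(linear(sentiment_head[0], sentiment_head[1], shared))
    bully_loss = cross_entropy(bully_probs, bully_label)
    sent_loss = cross_entropy(sent_probs, sentiment_label)
    return alpha * bully_loss + (1.0 - alpha) * sent_loss


# Toy example: two 2-dimensional embeddings concatenated into one vector.
embedding_a = [0.2, -0.1]   # placeholder for a contextual embedding
embedding_b = [0.4, 0.3]    # placeholder for a cross-lingual embedding
shared = embedding_a + embedding_b

bully_head = ([[0.5, -0.2, 0.1, 0.0], [-0.5, 0.2, -0.1, 0.0]], [0.0, 0.0])
sentiment_head = ([[0.1, 0.0, 0.2, 0.1],
                   [0.0, 0.1, -0.1, 0.0],
                   [-0.1, 0.0, 0.0, 0.2]], [0.0, 0.0, 0.0])

loss = multitask_loss(shared, bully_head, sentiment_head,
                      bully_label=1, sentiment_label=2)
print(f"joint loss: {loss:.4f}")
```

Because both heads read the same vector, gradients from the sentiment loss would also shape the representation used for bully detection during training, which is the mechanism by which an auxiliary task can help the main one.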
Acknowledgement
The authors would like to acknowledge the support of the Ministry of Home Affairs (MHA), India, for conducting this research.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Maity, K., Saha, S. (2021). A Multi-task Model for Sentiment Aided Cyberbullying Detection in Code-Mixed Indian Languages. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13111. Springer, Cham. https://doi.org/10.1007/978-3-030-92273-3_36
DOI: https://doi.org/10.1007/978-3-030-92273-3_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92272-6
Online ISBN: 978-3-030-92273-3
eBook Packages: Computer Science, Computer Science (R0)