Abstract
Contemporary unsupervised word-representation methods have been successful in capturing semantic statistics for a variety of Natural Language Processing tasks. However, they struggle with phenomena such as polysemy and homonymy, which are pervasive in natural language. Recently, a number of state-of-the-art transfer learning techniques have emerged that leverage language models pre-trained on large, general-purpose corpora. Motivated by these techniques, this paper proposes an effective transfer-learning-based ensemble model. Inspired by ULMFiT, the model applies contextualization and regularization techniques to challenging sentiment analysis tasks. We empirically validate the proposed model on three conventional datasets for sentiment classification, where it achieves state-of-the-art classification accuracy compared with acknowledged baselines.
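As an illustrative sketch (not the authors' implementation), the two fine-tuning techniques that ULMFiT, which this model builds on, is best known for, slanted triangular learning rates and discriminative per-layer learning rates, can be written down in a few lines. The function names and default hyper-parameters below are assumptions taken from the original ULMFiT description (peak ratio 32, cut fraction 0.1, per-layer decay factor 2.6):

```python
import math

def stlr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Slanted triangular learning rate: short linear warm-up to lr_max,
    then a long linear decay down to lr_max / ratio."""
    cut = math.floor(total_steps * cut_frac)
    if t < cut:
        p = t / cut                                   # warm-up phase
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # decay phase
    return lr_max * (1 + p * (ratio - 1)) / ratio

def discriminative_lrs(lr_top, n_layers, factor=2.6):
    """Discriminative fine-tuning: each lower layer trains with the
    learning rate of the layer above it divided by `factor`."""
    return [lr_top / factor ** (n_layers - 1 - layer)
            for layer in range(n_layers)]
```

For example, with 1000 training steps the schedule starts at `lr_max / 32`, peaks at `lr_max` after step 100, and decays thereafter, while `discriminative_lrs(0.01, 3)` assigns progressively smaller rates to the lower, more general layers.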
Acknowledgements
This work was supported by free academic credits from Google Cloud Platform.
Cite this article
Malhotra, S., Kumar, V. & Agarwal, A. Bidirectional transfer learning model for sentiment analysis of natural language. J Ambient Intell Human Comput 12, 10267–10287 (2021). https://doi.org/10.1007/s12652-020-02800-7