Abstract
Images play a vital role on social media platforms because they vividly reflect people’s inner emotions and preferences, which has made visual sentiment analysis an important research topic. In this paper, we propose a supervised contrastive learning-based model for image emotion classification. The model consists of two modules, low-level feature extraction and deep emotional feature extraction, whose outputs are fused to enhance the overall perception of image emotions. In the low-level feature extraction module, the LBP-U (Local Binary Patterns with Uniform Patterns) algorithm extracts texture features that help differentiate images belonging to different emotion categories. In the deep emotional feature extraction module, we introduce a supervised contrastive learning approach that improves the extraction of deep emotional features by narrowing the intra-class distance among images of the same emotion category while expanding the inter-class distance between images of different emotion categories. By fusing the low-level and deep emotional features, our model draws on features at different levels, thereby enhancing overall emotion classification performance. To assess the classification performance and generalization capability of the proposed model, we conduct experiments on the publicly available FI (Flickr and Instagram) Emotion dataset. Comparative analysis of the experimental results demonstrates that the proposed model performs well on image emotion classification. Additionally, we conduct ablation experiments to analyze the impact of different levels of features and various loss functions on the model’s performance, which supports the effectiveness of the proposed approach.
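To make the contrastive objective concrete, the following is a minimal pure-Python sketch of a supervised contrastive loss in the form given by Khosla et al. (listed in the references). It is an illustration only, not the authors' implementation: the function name `supcon_loss`, the temperature of 0.1, the averaging over anchors that have at least one positive, and the toy 2-D embeddings are all assumptions.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss, averaged over anchors that have a positive.

    For each anchor i, every other sample with the same label is a positive;
    all other samples in the batch form the contrast set (the denominator).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled similarity of anchor i to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_sim = max(sims.values())
        log_denominator = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += -sum(sims[p] - log_denominator for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy batch (hypothetical values): two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supcon_loss(toy_embeddings, [0, 0, 1, 1])    # labels follow clusters
loss_mismatched = supcon_loss(toy_embeddings, [0, 1, 0, 1])  # labels cut across clusters
print(loss_matching, loss_mismatched)
```

When the labels agree with the geometric clusters, the loss is small. When they cut across the clusters, the loss is large. That gap is the pressure that pulls same-emotion images together and pushes different-emotion images apart.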
Data Availability
No datasets were generated or analysed during the current study.
Availability of data and materials
The experimental dataset used in our study is the FI Emotion Dataset established by You et al. [52], which is publicly available at https://qzyou.github.io/.
References
Zhang, Q., Sun, J., Yuan, K., Jiang, Y.: An image emotion classification method based on supervised contrastive learning. In: 2023 8th International Conference on Data Science in Cyberspace (DSC), pp. 313–320. IEEE (2023)
Yin, H., Song, X., Yang, S., Li, J.: Sentiment analysis and topic modeling for COVID-19 vaccine discussions. World Wide Web 25(3), 1067–1083 (2022)
Kalimeri, K., G. Beiró, M., Urbinati, A., Bonanomi, A., Rosina, A., Cattuto, C.: Human values and attitudes towards vaccination in social media. In: Companion Proceedings of the 2019 World Wide Web Conference, pp. 248–254 (2019)
Poria, S., Gelbukh, A., Hussain, A., Howard, N., Das, D., Bandyopadhyay, S.: Enhanced SenticNet with affective labels for concept-based opinion mining. IEEE Intell. Syst. 28(2), 31–38 (2013)
Mäntylä, M.V., Graziotin, D., Kuutila, M.: The evolution of sentiment analysis-a review of research topics, venues, and top cited papers. Comput. Sci. Rev. 27, 16–32 (2018)
Parlar, T., Ozel, S., Song, F.: Analysis of data pre-processing methods for sentiment analysis of reviews. Comput. Sci. 20 (2019)
Manek, A.S., Shenoy, P.D., Mohan, M.C.: Aspect term extraction for sentiment analysis in large movie reviews using Gini index feature selection method and SVM classifier. World Wide Web 20, 135–154 (2017)
Feng, J., Rao, Y., Xie, H., Wang, F.L., Li, Q.: User group based emotion detection and topic discovery over short text. World Wide Web 23, 1553–1587 (2020)
Khosla, A., Das Sarma, A., Hamid, R.: What makes an image popular? In: Proceedings of the 23rd International Conference on World Wide Web, pp. 867–876 (2014)
Lin, Z., Huang, F., Li, Y., Yang, Z., Liu, W.: A layer-wise deep stacking model for social image popularity prediction. World Wide Web 22, 1639–1655 (2019)
Xu, B., Fu, Y., Jiang, Y.-G., Li, B., Sigal, L.: Video emotion recognition with transferred deep feature encodings. In: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, pp. 15–22 (2016)
Liu, S., Zhang, X., Yang, J.: SER30K: A large-scale dataset for sticker emotion recognition. In: Proceedings of the 30th ACM International Conference on Multimedia, pp. 33–41 (2022)
You, Q.: Sentiment and emotion analysis for social multimedia: Methodologies and applications. In: Proceedings of the 24th ACM International Conference on Multimedia, pp. 1445–1449 (2016)
Wang, S., Wang, Y., Tang, J., Shu, K., Ranganath, S., Liu, H.: What your images reveal: Exploiting visual contents for point-of-interest recommendation. In: Proceedings of the 26th International Conference on World Wide Web, pp. 391–400 (2017)
Gelli, F., Uricchio, T., Bertini, M., Del Bimbo, A., Chang, S.-F.: Image popularity prediction in social media using sentiment and context features. In: Proceedings of the 23rd ACM International Conference on Multimedia, pp. 907–910 (2015)
Won, D., Steinert-Threlkeld, Z.C., Joo, J.: Protest activity detection and perceived violence estimation from social media images. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 786–794 (2017)
Rao, T., Li, X., Xu, M.: Learning multi-level deep representations for image emotion classification. Neural Process. Lett. 51, 2043–2061 (2020)
Yang, J., She, D., Sun, M.: Joint image emotion classification and distribution learning via deep convolutional neural network. In: IJCAI, pp. 3266–3272 (2017)
Chen, M., Zhang, L., Allebach, J.P.: Learning deep features for image emotion classification. In: 2015 IEEE International Conference on Image Processing (ICIP), pp. 4491–4495. IEEE (2015)
Zhao, S., Yao, X., Yang, J., Jia, G., Ding, G., Chua, T.-S., Schuller, B.W., Keutzer, K.: Affective image content analysis: Two decades review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 44(10), 6729–6751 (2021)
Kim, H.-R., Kim, Y.-S., Kim, S.J., Lee, I.-K.: Building emotional machines: Recognizing image emotions through deep neural networks. IEEE Trans. Multimedia 20(11), 2980–2992 (2018)
Zhao, S., Gao, Y., Jiang, X., Yao, H., Chua, T.-S., Sun, X.: Exploring principles-of-art features for image emotion recognition. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 47–56 (2014)
Yanulevskaya, V., Gemert, J.C., Roth, K., Herbold, A.-K., Sebe, N., Geusebroek, J.-M.: Emotional valence categorization using holistic image features. In: 2008 15th IEEE International Conference on Image Processing, pp. 101–104. IEEE (2008)
Jia, J., Wu, S., Wang, X., Hu, P., Cai, L., Tang, J.: Can we understand van Gogh’s mood? learning to infer affects from images in social networks. In: Proceedings of the 20th ACM International Conference on Multimedia, pp. 857–860 (2012)
Patterson, G., Hays, J.: Sun attribute database: Discovering, annotating, and recognizing scene attributes. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2751–2758. IEEE (2012)
Datta, R., Joshi, D., Li, J., Wang, J.Z.: Studying aesthetics in photographic images using a computational approach. In: Computer vision–ECCV 2006: 9th European conference on Computer Vision, Graz, Austria, May 7-13, 2006, Proceedings, Part III 9, pp. 288–301. Springer (2006)
Colombo, C., Del Bimbo, A., Pala, P.: Semantics in visual information retrieval. IEEE Multimedia 6(3), 38–53 (1999)
Matthews, T., Nixon, M.S., Niranjan, M.: Enriching texture analysis with semantic data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1248–1255 (2013)
Yanulevskaya, V., Uijlings, J., Bruni, E., Sartori, A., Zamboni, E., Bacci, F., Melcher, D., Sebe, N.: In the eye of the beholder: Employing statistical analysis and eye tracking for analyzing abstract paintings. In: Proceedings of the 20th ACM International Conference on Multimedia, pp. 349–358 (2012)
Ojala, T., Pietikainen, M., Harwood, D.: Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. In: Proceedings of 12th International Conference on Pattern Recognition, vol. 1, pp. 582–585. IEEE (1994)
Machajdik, J., Hanbury, A.: Affective image classification using features inspired by psychology and art theory. In: Proceedings of the 18th ACM International Conference on Multimedia, pp. 83–92 (2010)
Lu, X., Suryanarayan, P., Adams Jr, R.B., Li, J., Newman, M.G., Wang, J.Z.: On shape and the computability of emotions. In: Proceedings of the 20th ACM International Conference on Multimedia, pp. 229–238 (2012)
Borth, D., Ji, R., Chen, T., Breuel, T., Chang, S.-F.: Large-scale visual sentiment ontology and detectors using adjective noun pairs. In: Proceedings of the 21st ACM International Conference on Multimedia, pp. 223–232 (2013)
Yuan, J., Mcdonough, S., You, Q., Luo, J.: Sentribute: Image sentiment analysis from a mid-level perspective. In: Proceedings of the Second International Workshop on Issues of Sentiment Discovery and Opinion Mining, pp. 1–8 (2013)
Liu, N., Dellandrea, E., Tellez, B., Chen, L.: Associating textual features with visual ones to improve affective image classification. In: Affective Computing and Intelligent Interaction: 4th International Conference, ACII 2011, Memphis, TN, USA, October 9–12, 2011, Proceedings, Part I 4, pp. 195–204. Springer (2011)
Chen, T., Yu, F.X., Chen, J., Cui, Y., Chen, Y.-Y., Chang, S.-F.: Object-based visual sentiment concept analysis and application. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 367–376 (2014)
Chen, T., Borth, D., Darrell, T., Chang, S.-F.: Deepsentibank: Visual sentiment concept classification with deep convolutional neural networks. arXiv:1410.8586 (2014)
Rawat, W., Wang, Z.: Deep convolutional neural networks for image classification: A comprehensive review. Neural Comput. 29(9), 2352–2449 (2017)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inform. Process. Syst. 25 (2012)
Xu, C., Cetintas, S., Lee, K.-C., Li, L.-J.: Visual sentiment prediction with deep convolutional neural networks (2014)
Islam, J., Zhang, Y.: Visual sentiment analysis for social images using transfer learning approach. In: 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom), pp. 124–130. IEEE (2016)
You, Q., Luo, J., Jin, H., Yang, J.: Robust image sentiment analysis using progressively trained and domain transferred deep networks. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 29 (2015)
Campos, V., Jou, B., Giro-i-Nieto, X.: From pixels to sentiment: Fine-tuning CNNs for visual sentiment prediction. Image Vis. Comput. 65, 15–22 (2017)
Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
Song, K., Yao, T., Ling, Q., Mei, T.: Boosting image sentiment analysis with visual attention. Neurocomputing 312, 218–228 (2018)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inform. Process. Syst. 28 (2015)
Zhu, Y., Zhang, W., Zhang, M., Zhang, K., Zhu, Y.: Image emotion distribution learning based on enhanced fuzzy KNN algorithm with sparse learning. J. Intell. Fuzzy Syst. 41(6), 6443–6460 (2021)
Ojala, T., Pietikainen, M., Maenpaa, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 24(7), 971–987 (2002)
He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Computer Vision–ECCV 2016: 14th European conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part IV 14, pp. 630–645. Springer (2016)
Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., Maschinot, A., Liu, C., Krishnan, D.: Supervised contrastive learning. Adv. Neural Inform. Process. Syst. 33, 18661–18673 (2020)
Yu, Z., Yu, J., Fan, J., Tao, D.: Multi-modal factorized bilinear pooling with co-attention learning for visual question answering. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1821–1830 (2017)
You, Q., Luo, J., Jin, H., Yang, J.: Building a large scale dataset for image emotion recognition: The fine print and the benchmark. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 30 (2016)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60, 91–110 (2004)
Karayev, S., Trentacoste, M., Han, H., Agarwala, A., Darrell, T., Hertzmann, A., Winnemoeller, H.: Recognizing image style. arXiv:1311.3715 (2013)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014)
Funding
This work is supported by the National Natural Science Foundation of China (72271083, 72342011, 72271084, 72101076, and 72101072), the Fundamental Research Funds for the Central Universities (JZ2023YQTD0075), and the National Engineering Laboratory for Big Data Distribution and Exchange Technologies.
Author information
Contributions
All authors contributed to the study conception and design. Jianshan Sun: Methodology, Writing - Review & Editing, Supervision. Qing Zhang: Data collection and analysis, Methodology, Visualization, Writing - Original Draft. Kun Yuan: Data collection and analysis, Methodology. Yuanchun Jiang: Writing - Review & Editing, Supervision. Xinran Chen: Writing - Review & Editing. All authors commented on previous versions of the manuscript, and all authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We note that a shorter conference version of this article appeared in the 8th IEEE International Conference on Data Science in Cyberspace (IEEE DSC 2023) [1].
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, J., Zhang, Q., Yuan, K. et al. A supervised contrastive learning-based model for image emotion classification. World Wide Web 27, 29 (2024). https://doi.org/10.1007/s11280-024-01260-9
DOI: https://doi.org/10.1007/s11280-024-01260-9