Abstract
Online social networks have gone mainstream: millions of users have come to rely on the wide range of services provided by social networks. However, the ease of use of social networks for communicating information also makes them particularly vulnerable to social spammers, i.e., ill-intentioned users whose main purpose is to degrade the information quality of social networks through the proliferation of different types of malicious data (e.g., social spam, malware downloads, and phishing) that are collectively called low-quality content. Since Twitter is also rife with low-quality content, several researchers have devised various low-quality content detection strategies that inspect tweets for the existence of this kind of content. We carried out a brief literature survey of these detection strategies, examining which strategies are still applicable in the current scenario, taking into account that Twitter has undergone a lot of changes in the last few years. To gather some evidence of the usefulness of the attributes used by these detection strategies, we carried out a preliminary evaluation of these attributes.
Notes
1. As of this writing, the tweet limit is 2,400 per day. It is worth emphasizing that there is also a limit per half-hour period, so users are not able to tweet all 2,400 tweets at one time.
Acknowledgments
This work was partially funded by the Brazilian National Institute of Science and Technology for the Web - INWeb, MASWeb, CAPES, CNPq, Finep, Fapesp and Fapemig.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Resende, J., Durelli, V.H.S., Moraes, I., Silva, N., Dias, D.R.C., Rocha, L. (2020). An Evaluation of Low-Quality Content Detection Strategies: Which Attributes Are Still Relevant, Which Are Not? In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2020. ICCSA 2020. Lecture Notes in Computer Science, vol 12249. Springer, Cham. https://doi.org/10.1007/978-3-030-58799-4_42
DOI: https://doi.org/10.1007/978-3-030-58799-4_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58798-7
Online ISBN: 978-3-030-58799-4
eBook Packages: Computer Science, Computer Science (R0)