An Evaluation of Low-Quality Content Detection Strategies: Which Attributes Are Still Relevant, Which Are Not?

Resende, Júlio; Durelli, Vinicius H. S.; Moraes, Igor; Silva, Nícollas; Dias, Diego R. C.; Rocha, Leonardo

doi:10.1007/978-3-030-58799-4_42

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12249)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

Online social networks have gone mainstream: millions of users have come to rely on the wide range of services provided by social networks. However, the ease of use of social networks for communicating information also makes them particularly vulnerable to social spammers, i.e., ill-intentioned users whose main purpose is to degrade the information quality of social networks through the proliferation of different types of malicious data (e.g., social spam, malware downloads, and phishing) that are collectively called low-quality content. Since Twitter is also rife with low-quality content, several researchers have devised various low-quality content detection strategies that inspect tweets for the existence of this kind of content. We carried out a brief literature survey of these detection strategies, examining which strategies are still applicable in the current scenario, taking into account that Twitter has undergone many changes in the last few years. To gather some evidence of the usefulness of the attributes used by these strategies, we carried out a preliminary evaluation of these attributes.

Notes

1. As of this writing, the tweet limit is 2,400 per day. It is worth emphasizing that there is also a limit per half-hour period, so users are not able to tweet all 2,400 tweets at one time.

Acknowledgments

This work was partially funded by the Brazilian National Institute of Science and Technology for the Web - INWeb, MASWeb, CAPES, CNPq, Finep, Fapesp and Fapemig.

Author information

Authors and Affiliations

Universidade Federal de São João del-Rei, São João del-Rei, Brazil
Júlio Resende, Vinicius H. S. Durelli, Igor Moraes, Diego R. C. Dias & Leonardo Rocha
Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
Nícollas Silva

Corresponding author

Correspondence to Diego R. C. Dias.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Potenza, Italy
Beniamino Murgante
Chair- Center of ICT/ICE, Covenant University, Ota, Nigeria
Sanjay Misra
University of Cagliari, Cagliari, Italy
Chiara Garau
University of Cagliari, Cagliari, Italy
Ivan Blečić
Clayton School of Information Technology, Monash University, Clayton, VIC, Australia
David Taniar
Department of Information Science, Kyushu Sangyo University, Fukuoka, Japan
Bernady O. Apduhan
University of Minho, Braga, Portugal
Ana Maria A.C. Rocha
Polytechnic University of Bari, Bari, Italy
Eufemia Tarantino
Polytechnic University of Bari, Bari, Italy
Carmelo Maria Torre
Department of Neurology, University of Massachusetts Medical School, Worcester, MA, USA
Yeliz Karaca

About this paper

Cite this paper

Resende, J., Durelli, V.H.S., Moraes, I., Silva, N., Dias, D.R.C., Rocha, L. (2020). An Evaluation of Low-Quality Content Detection Strategies: Which Attributes Are Still Relevant, Which Are Not? In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2020. Lecture Notes in Computer Science, vol 12249. Springer, Cham. https://doi.org/10.1007/978-3-030-58799-4_42

DOI: https://doi.org/10.1007/978-3-030-58799-4_42
Published: 01 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58798-7
Online ISBN: 978-3-030-58799-4
eBook Packages: Computer Science, Computer Science (R0)
