Link to original content: https://doi.org/10.1007/978-3-030-58799-4_42

An Evaluation of Low-Quality Content Detection Strategies: Which Attributes Are Still Relevant, Which Are Not?

  • Conference paper
Computational Science and Its Applications – ICCSA 2020 (ICCSA 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12249)

Abstract

Online social networks have gone mainstream: millions of users have come to rely on the wide range of services they provide. However, the ease of use of social networks for communicating information also makes them particularly vulnerable to social spammers, i.e., ill-intentioned users whose main purpose is to degrade the information quality of social networks through the proliferation of different types of malicious data (e.g., social spam, malware downloads, and phishing), collectively called low-quality content. Since Twitter is also rife with low-quality content, several researchers have devised low-quality content detection strategies that inspect tweets for this kind of content. We carried out a brief literature survey of these detection strategies, examining which of them are still applicable in the current scenario, taking into account that Twitter has undergone many changes in the last few years. To gather evidence of the usefulness of the attributes used by these detection strategies, we carried out a preliminary evaluation of these attributes.

Notes

  1. As of this writing, the tweet limit is 2,400 per day. It is worth emphasizing that there is also a limit per half-hour period, so users are not able to post all 2,400 tweets at one time.

Acknowledgments

This work was partially funded by the Brazilian National Institute of Science and Technology for the Web - INWeb, MASWeb, CAPES, CNPq, Finep, Fapesp and Fapemig.

Author information

Corresponding author

Correspondence to Diego R. C. Dias.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Resende, J., Durelli, V.H.S., Moraes, I., Silva, N., Dias, D.R.C., Rocha, L. (2020). An Evaluation of Low-Quality Content Detection Strategies: Which Attributes Are Still Relevant, Which Are Not? In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2020. ICCSA 2020. Lecture Notes in Computer Science, vol 12249. Springer, Cham. https://doi.org/10.1007/978-3-030-58799-4_42

  • DOI: https://doi.org/10.1007/978-3-030-58799-4_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58798-7

  • Online ISBN: 978-3-030-58799-4

  • eBook Packages: Computer Science, Computer Science (R0)
