Social network mining of requester communities in crowdsourcing markets

Schall, Daniel; Skopik, Florian

doi:10.1007/s13278-012-0080-x

Original Article
Published: 09 August 2012

Social Network Analysis and Mining, Volume 2, pages 329–344 (2012)

Daniel Schall¹ &
Florian Skopik²

Abstract

Crowdsourcing is a new computing approach in which human tasks are outsourced to a large number of human workers. It has attracted attention not only from industry but also from various academic communities. Amazon Mechanical Turk (AMT) was the first commercial platform to offer crowdsourcing services to its customers, and it is often described as a platform supplying ‘artificial artificial intelligence’. Recent research efforts have not addressed the community structure of large-scale crowdsourcing platforms. In this work, we discuss detailed statistics of the popular AMT marketplace to provide insights into task properties and requester behavior. We present a model that automatically infers requester communities based on task keywords. Hierarchical clustering is used to identify relations between keywords associated with tasks. We present novel techniques to rank communities and requesters using a graph-based algorithm. Furthermore, we introduce models and methods for discovering relevant crowdsourcing brokers who can act as intermediaries between requesters and platforms such as AMT.
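
To make the keyword-based community idea more concrete, the following is a minimal sketch of how task keywords might be clustered hierarchically and how requesters might then be grouped by cluster. It is not the authors' implementation: the abstract does not state the similarity measure or linkage criterion, so Jaccard similarity over shared tasks and average-link merging are assumptions, as are the toy data, the merge threshold, and all function names. The graph-based ranking step is not sketched here.

```python
from itertools import combinations

# Hypothetical toy data (not from the article): task id -> (requester, keywords).
TASKS = {
    "t1": ("req_a", {"transcription", "audio"}),
    "t2": ("req_a", {"transcription", "audio", "english"}),
    "t3": ("req_b", {"survey", "opinion"}),
    "t4": ("req_c", {"survey", "opinion", "research"}),
    "t5": ("req_d", {"audio", "english"}),
}


def keyword_index(tasks):
    """Map each keyword to the set of task ids that carry it."""
    index = {}
    for task_id, (_requester, keywords) in tasks.items():
        for keyword in keywords:
            index.setdefault(keyword, set()).add(task_id)
    return index


def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_link(c1, c2, index):
    """Mean pairwise Jaccard similarity between the keywords of two clusters."""
    total = sum(jaccard(index[x], index[y]) for x in c1 for y in c2)
    return total / (len(c1) * len(c2))


def cluster_keywords(tasks, threshold=0.3):
    """Agglomerative clustering of keywords.

    Starts with one cluster per keyword and repeatedly merges the most
    similar pair until no pair reaches `threshold` (an assumed cut-off).
    """
    index = keyword_index(tasks)
    clusters = [frozenset([keyword]) for keyword in sorted(index)]
    while len(clusters) > 1:
        left, right = max(
            combinations(clusters, 2),
            key=lambda pair: average_link(pair[0], pair[1], index),
        )
        if average_link(left, right, index) < threshold:
            break
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left | right)
    return clusters


def requester_communities(tasks, clusters):
    """Assign each requester to the keyword cluster its tasks overlap most."""
    used = {}
    for _task_id, (requester, keywords) in tasks.items():
        used.setdefault(requester, set()).update(keywords)
    communities = {}
    for requester, keywords in used.items():
        best = max(clusters, key=lambda cluster: len(cluster & keywords))
        communities.setdefault(best, set()).add(requester)
    return communities


clusters = cluster_keywords(TASKS)
communities = requester_communities(TASKS, clusters)
for cluster, requesters in communities.items():
    print(sorted(cluster), "->", sorted(requesters))
```

On the toy data, keywords that co-occur on the same tasks end up in the same cluster, and requesters whose tasks draw on the same cluster fall into the same community. A real analysis would need to choose the similarity measure, linkage, and cut-off based on the marketplace data.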

Notes

http://www.mturk.com/mturk/welcome.

Author information

Authors and Affiliations

Siemens Corporate Technology, Siemensstrasse 90, 1211, Wien, Austria
Daniel Schall
Safety and Security Department, AIT Austrian Institute of Technology, 2444, Seibersdorf, Austria
Florian Skopik

Corresponding author

Correspondence to Daniel Schall.

About this article

Cite this article

Schall, D., Skopik, F. Social network mining of requester communities in crowdsourcing markets. Soc. Netw. Anal. Min. 2, 329–344 (2012). https://doi.org/10.1007/s13278-012-0080-x

Received: 18 May 2012
Revised: 10 July 2012
Accepted: 17 July 2012
Published: 09 August 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s13278-012-0080-x
