Abstract
We propose a new mechanism to preserve privacy while leveraging user profiles in distributed recommender systems. Our mechanism relies on two contributions: (i) an original obfuscation scheme, and (ii) a randomized dissemination protocol. We show that our obfuscation scheme hides the exact profiles of users without significantly decreasing their utility for recommendation. In addition, we precisely characterize the conditions that make our randomized dissemination protocol differentially private. We compare our mechanism with a non-private as well as with a fully private alternative. We consider a real dataset from a user survey and report on simulations as well as planetlab experiments. We dissect our results in terms of accuracy and privacy trade-offs, bandwidth consumption, as well as resilience to a censorship attack. In short, our extensive evaluation shows that our twofold mechanism provides a good trade-off between privacy and accuracy, with little overhead and high resilience.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
we use the terms ‘node’ and ‘user’ interchangeably to refer to the pair ‘user/machine’.
References
Agrawal D, Aggarwal CC (2001) On the design and quantification of privacy preserving data mining algorithms. In: PODS, ACM, New York, NY, pp 247–255
Agrawal R, Srikant R (2000) Privacy-preserving data mining. In: SIGMOD, ACM, New York, NY, pp 439–450
Ahmad W, Khokhar A (2007) An architecture for privacy preserving collaborative filtering on web portals. In: IAS, Manchester, pp 273–278
Alaggan M, Gambs S, Kermarrec A-M (2012) BLIP: non-interactive differentially-private similarity computation on bloom filters. In: Richa AW, Scheideler C (eds) Stabilization, safety, and security of distributed systems. Lecture notes in computer science, vol 7596. Springer, Berlin, pp 202–216
Boutet A, Frey D, Guerraoui R, Jégou A, Kermarrec A-M (2013) WHATSUP: a decentralized instant news recommender. In: IEEE 27th international symposium on parallel distributed processing (IPDPS), Boston, MA, pp 741–752
Canny J (2002) Collaborative filtering with privacy via factor analysis. In: SIGIR, ACM, New York, NY, pp 238–245
Das AS, Datar M, Garg A, Rajaram S (2007) Google news personalization: scalable online collaborative filtering. In: WWW, ACM, New York, NY, pp 271–280
Dwork C (2008) Differential privacy: a survey of results. In: Theory and applications of models of computation. Springer, Berlin, pp 1–19
Dwork C, McSherry F, Nissim K, Smith A (2006) Calibrating noise to sensitivity in private data analysis. In: Halevi S, Rabin T (eds) Theory of cryptography. Lecture notes in computer science, vol 3876. Springer, Berlin, pp 265–284
Goldreich O (2003) Cryptography and cryptographic protocols. Distrib Comput 16(2–3):177–199. doi:10.1007/s00446-002-0077-1
Haeberlen A, Pierce BC, Narayan A (2011) Differential privacy under fire. In: SEC, USENIX Association, Berkeley, CA, p 33
Huang Z, Du W, Chen B (2005) Deriving private information from randomized data. In: SIGMOD, ACM, New York, NY, pp 37–48
Kanerva P, Kristoferson J, Holst A (2000) Random indexing of text samples for latent semantic analysis. In: CCSS, University of Pennsylvania, Philadelphia, PA, p 1036
Kargupta H, Datta S, Wang Q, Sivakumar K (2003) On the privacy preserving properties of random data perturbation techniques. In: ICDM, pp 99–106
Machanavajjhala A, Korolova A, Sarma AD (2011) Personalized social recommendations: accurate or private. Proc VLDB Endow 4(7):440–450
Polat H, Du W (2003) Privacy-preserving collaborative filtering using randomized perturbation techniques. In: ICDM, pp 625–628
Polat H, Du W (2005) SVD-based collaborative filtering with privacy. In: SAC, ACM, New York, NY, pp 791–795
Singh A, Castro M, Druschel P, Rowstron A (2004) Defending against eclipse attacks on overlay networks. In: SIGOPS, ACM, New York, NY
Su X, Khoshgoftaar TM (2009) A survey of collaborative filtering techniques. Adv Artif Intell 2009:4
Tarkoma S, Rothenberg CE, Lagerspetz E (2012) Theory and practice of bloom filters for distributed systems. IEEE Commun Surv Tutor 14(1):131–155
van Rijsbergen CJ (1979) Information retrieval. Butterworth, Oxford
Voulgaris S, Gavidia D, van Steen M (2005) CYCLON: inexpensive membership management for unstructured P2P overlays. J Netw Syst Manag 13(2):197–217. doi:10.1007/s10922-005-4441-x
Voulgaris S, van Steen M (2005) Epidemic-style management of semantic overlays for content-based searching. In: Euro-Par 2005 parallel processing. Springer, Berlin, pp 1143–1152
Wan M, Jönsson A, Wang C, Li L, Yang Y (2012) A random indexing approach for web user clustering and web prefetching. In: New frontiers in applied data mining. Springer, Berlin, pp 40–52
Warner SL (1965) Randomized response: a survey technique for eliminating evasive answer bias. J Am Stat Assoc 60(309):63–69
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Boutet, A., Frey, D., Guerraoui, R. et al. Privacy-preserving distributed collaborative filtering. Computing 98, 827–846 (2016). https://doi.org/10.1007/s00607-015-0451-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00607-015-0451-z