Abstract
In this paper, we propose an efficient KNN service, called KPS (KNN-Peer-Sampling). The KPS service can be used in various contexts e.g. recommendation systems, information retrieval and data mining. KPS borrows concepts from P2P gossip-based clustering protocols to provide a localized and efficient KNN computation in large-scale systems. KPS is a sampling-based iterative approach, combining randomness, to provide serendipity and avoid local minimum, and clustering, to ensure fast convergence. We compare KPS against the state of the art KNN centralized computation algorithm NNDescent, on multiple datasets. The experiments confirm the efficiency of KPS over NNDescent: KPS improves significantly on the computational cost while converging quickly to a close to optimal KNN graph. For instance, the cost, expressed in number of pairwise similarity computations, is reduced by \(\approx \)23 % and \(\approx \)49 % to construct high quality KNN graphs for Jester and MovieLens datasets, respectively. In addition, the randomized nature of KPS ensures eventual convergence, not always achieved with NNDescent.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Note that the strength of KPS is to achieve good results with less information so this way of comparing is not in our favor.
References
Jester dataset. http://grouplens.org/datasets/jester/
Jester joke recommender. http://shadow.ieor.berkeley.edu/humor/
Movielens dataset. http://grouplens.org/datasets/movielens/
Agrawal, D., Das, S., El Abbadi, A.: Big data, cloud computing: current state and future opportunities. In: Proceedings of the 14th International Conference on Extending Database Technology, EDBT/ICDT 2011, pp. 530–533. ACM (2011)
Amato, G., Falchi, F.: KNN based image classification relying on local feature similarity. In: Proceedings of the Third International Conference on SImilarity Search and APplications, SISAP 2010, pp. 101–108. ACM (2010)
Bai, X., Guerraoui, R., Kermarrec, A.-M., Leroy, V.: Collaborative personalized top-k processing. ACM Trans. Database Syst. 36(4), 26 (2011)
Bertier, M., Frey, D., Guerraoui, R., Kermarrec, A.-M., Leroy, V.: The gossple anonymous social network. In: Gupta, I., Mascolo, C. (eds.) Middleware 2010. LNCS, vol. 6452, pp. 191–211. Springer, Heidelberg (2010)
Boiman, O., Shechtman, E., Irani, M.: In defense of nearest-neighbor based image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008, pp. 1–8 (2008)
Boutet, A., Frey, D., Guerraoui, R., Jegou, A., Kermarrec, A.-M.: WhatsUp: a decentralized instant news recommender. In: Proceedings of the 27th IEEE International Symposium on Parallel Distributed Processing, IPDPS 2013, pp. 741–752 (2013)
Chen, J., Fang, H.-R., Saad, Y.: Fast approximate KNN graph construction for high dimensional data via recursive Lanczos bisection. J. Mach. Learn. Res. 10, 1989–2012 (2009)
Connor, M., Kumar, P.: Fast construction of k-nearest neighbor graphs for point clouds. IEEE Trans. Vis. Comput. Graph. 16(4), 599–608 (2010)
Desrosiers, C., Karypis, G.: A comprehensive survey of neighborhood-based recommendation methods. In: Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (eds.) Recommender Systems Handbook, pp. 107–144. Springer, Heidelberg (2011)
Dong, W., Moses, C., Li, K.: Efficient k-nearest neighbor graph construction for generic similarity measures. In: Proceedings of the 20th International Conference on World Wide Web, WWW 2011, pp. 577–586. ACM (2011)
Giakkoupis, G., Kermarrec, A.-M., Woelfel, P.: Gossip protocols for renaming and sorting. In: Afek, Y. (ed.) DISC 2013. LNCS, vol. 8205, pp. 194–208. Springer, Heidelberg (2013)
Guo, G., Wang, H., Bell, D.J., Bi, Y., Greer, K.: KNN model-based approach in classification. In: Meersman, R., Schmidt, D.C. (eds.) CoopIS 2003, DOA 2003, and ODBASE 2003. LNCS, vol. 2888, pp. 986–996. Springer, Heidelberg (2003)
Hajebi, K., Abbasi-Yadkori, Y., Shahbazi, H., Zhang, H.: Fast approximate nearest-neighbor search with k-nearest neighbor graph. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence, IJCAI 2011, pp. 1312–1317. AAAI Press (2011)
Huiskes, M.J., Lew, M.S.: The MIR flickr retrieval evaluation. In: Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval, MIR 2008, pp. 39–43. ACM (2008)
Jelasity, M., Montresor, A., Babaoglu, O.: T-Man: gossip-based fast overlay topology construction. Comput. Netw.: Int. J. Comput. Telecommun. Netw. 53(13), 2321–2339 (2009)
Jelasity, M., Voulgaris, S., Guerraoui, R., Kermarrec, A.-M., van Steen, M.: Gossip-based peer sampling. ACM Trans. Comput. Syst. 25(3) (2007)
Olman, V., Mao, F., Wu, H., Xu, Y.: Parallel clustering algorithm for large data sets with applications in bioinformatics. The IEEE/ACM Trans. Comput. Biol. Bioinform. 6(2), 344–352 (2009)
Ormándi, R., Hegedűs, I., Jelasity, M.: Overlay management for fully distributed user-based collaborative filtering. In: D’Ambra, P., Guarracino, M., Talia, D. (eds.) Euro-Par 2010, Part I. LNCS, vol. 6271, pp. 446–457. Springer, Heidelberg (2010)
Pan, R., Dolog, P., Xu, G.: KNN-based clustering for improving social recommender systems. In: Cao, L., Zeng, Y., Symeonidis, A.L., Gorodetsky, V.I., Yu, P.S., Singh, M.P. (eds.) ADMI. LNCS, vol. 7607, pp. 115–125. Springer, Heidelberg (2013)
Paredes, R., Chávez, E., Figueroa, K., Navarro, G.: Practical construction of k-nearest neighbor graphs in metric spaces. In: Àlvarez, C., Serna, M. (eds.) WEA 2006. LNCS, vol. 4007, pp. 85–97. Springer, Heidelberg (2006)
Voulgaris, S., van Steen, M.: VICINITY: a pinch of randomness brings out the structure. In: Eyers, D., Schwan, K. (eds.) Middleware 2013. LNCS, vol. 8275, pp. 21–40. Springer, Heidelberg (2013)
Wang, J., Wang, J., Zeng, G., Tu, Z., Gan, R., Li, S.: Scalable k-NN graph construction for visual descriptors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012, pp. 1106–1113 (2012)
Zhong, R., Li, G., Tan, K.-L., Zhou, L.: G-tree: an efficient index for knn search on road networks. In: Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013, pp. 39–48. ACM (2013)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Benkaouz, Y., Erradi, M., Kermarrec, AM. (2016). Nearest Neighbors Graph Construction: Peer Sampling to the Rescue. In: Abdulla, P., Delporte-Gallet, C. (eds) Networked Systems. NETYS 2016. Lecture Notes in Computer Science(), vol 9944. Springer, Cham. https://doi.org/10.1007/978-3-319-46140-3_4
Download citation
DOI: https://doi.org/10.1007/978-3-319-46140-3_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46139-7
Online ISBN: 978-3-319-46140-3
eBook Packages: Computer ScienceComputer Science (R0)