Abstract
Similarity search is a fundamental task in applications such as recommender systems, image retrieval, and text retrieval. Graph-based indexes for similarity search traverse a graph constructed on the dataset to retrieve the query’s neighbors, using edges to navigate to and explore the query’s local neighborhood. Edge selection techniques are crucial for the performance of graph-based indexes: they enhance accuracy and efficiency by preventing local minima, reducing graph diameter, and improving sparsity. The Half-Space Proximal (HSP) Graph is an edge-minimal monotonic graph defined by a geometric edge selection that ensures a diverse yet sparse set of edges. Unfortunately, the quadratic construction complexity of the HSP Graph renders it impractical for large-scale search scenarios. This work investigates an approximation of the HSP Graph that aims to preserve the monotonic property locally. By leveraging a hierarchical partitioning of the dataset, it proposes a top-down, distributed graph construction in which a coarse-scale graph on pivots facilitates the construction of the layer below. The effectiveness of this approach is investigated as a submission to the SISAP 2024 Indexing Challenge.
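To make the geometric edge selection concrete, the following is a minimal brute-force sketch of the standard HSP test (Chavez et al., 2005), not the approximate top-down construction proposed in this work. The function name `hsp_neighbors`, the toy point set, and the use of Euclidean distance are assumptions made for illustration only.

```python
import math

def hsp_neighbors(query_idx, points, dist=math.dist):
    """Return the indices of the HSP neighbors of points[query_idx].

    Brute-force sketch: every other point in the dataset is a candidate.
    """
    q = points[query_idx]
    # Pair each candidate with its distance to the query point.
    candidates = [(dist(q, p), i) for i, p in enumerate(points) if i != query_idx]
    neighbors = []
    while candidates:
        # The closest surviving candidate becomes an HSP neighbor.
        _, v = min(candidates)
        neighbors.append(v)
        # Discard v itself and every candidate lying in v's half-space,
        # i.e. every point strictly closer to v than to the query point.
        candidates = [
            (d, i) for d, i in candidates
            if i != v and d <= dist(points[i], points[v])
        ]
    return neighbors

# Hypothetical toy dataset in the plane.
points = [(0, 0), (1, 0), (2, 0), (0, 1), (-1, 0)]
print(hsp_neighbors(0, points))  # -> [1, 3, 4]

# Building the full graph repeats the test for every point, so each
# node scans the whole dataset.
hsp_graph = {i: hsp_neighbors(i, points) for i in range(len(points))}
```

In the toy example, the point (2, 0) is not linked to (0, 0) because it falls in the half-space of the closer neighbor (1, 0); this is how the test keeps edges sparse yet directionally diverse. Running the test for every point against the whole dataset is what makes exact construction at least quadratic in the dataset size.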
Notes
1. Source code: https://github.com/cole-foster/sisap-2024.git
Acknowledgements
We gratefully acknowledge the support of NSF award 1910530.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Foster, C., Chávez, E., Kimia, B. (2025). Top-Down Construction of Locally Monotonic Graphs for Similarity Search. In: Chávez, E., Kimia, B., Lokoč, J., Patella, M., Sedmidubsky, J. (eds) Similarity Search and Applications. SISAP 2024. Lecture Notes in Computer Science, vol 15268. Springer, Cham. https://doi.org/10.1007/978-3-031-75823-2_25
DOI: https://doi.org/10.1007/978-3-031-75823-2_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-75822-5
Online ISBN: 978-3-031-75823-2
eBook Packages: Computer Science (R0)