Authors:
Naoya Higuchi
1
;
Yasunobu Imamura
1
;
Tetsuji Kuboyama
2
;
Kouichi Hirata
1
and
Takeshi Shinohara
1
Affiliations:
1
Kyushu Institute of Technology, Kawazu 680-4, Iizuka 820-8502 and Japan
;
2
Gakushuin University, Mejiro 1-5-1, Toshima, Tokyo 171-8588 and Japan
Keyword(s):
Similarity Search, Sketch, Ball Partitioning, Hamming Distance, Dimension Reduction, Distance Lower Bound.
Abstract:
We discuss the nearest neighbor search using sketch which is a kind of locality sensitive hash (LSH). Nearest neighbor search using sketch is done in two stages. In the first stage, the top K candidates, which have close sketches to a query, are selected, where K ≥ 1. In the second stage, the nearest object to the query from K candidates is selected by performing actual distance calculations. Conventionally, higher accurate search requires wider sketches than 32-bit. In this paper, we propose search methods using narrow 16-bit sketch, which enables efficient data management by buckets and implement a faster first stage. To keep accuracy, search using 16-bit sketch requires larger K than using 32-bit sketch. By sorting the data objects according to sketch’s values, cost influence due to the increase in the number of candidates K can be reduced by improving memory locality in the second stage search. The proposed method achieves about 10 times faster search speed while maintaining accu
racy.
(More)