Abstract
We present a compact semi-dynamic text index which allows us to find short patterns efficiently. For parameters \(k\le q \le \log _\sigma n - \log _\sigma \log _\sigma n\) and alphabet size \(\sigma = O(\mathrm {polylog}(n))\), all \( occ \) occurrences of a pattern of length at most \(q-k+1\) can be obtained in \(O(k \times occ + \log _\sigma n)\) time, where \(n\) is the length of the text. Adding characters to the end of the text is supported in amortized constant time. Our index requires \((n/k) \log (n/k) + n \log \sigma + o(n)\) bits of space, which is compact (i.e., \(O(n \log \sigma )\)) when \(k = \varTheta (\log _{\sigma } n)\). As a byproduct, we present a succinct van Emde Boas tree which supports insertion, deletion, predecessor, and successor on a dynamic set of integers over the universe \([0, m-1]\) in \(O(\log \log m)\) time and requires only \(m + o(m)\) bits of space.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Note that they addressed more general edit operations such as insertion of a string to an arbitrary position and deletion/substitution of a substring of the text.
References
Claude, F., Farina, A., MartÃnez-Prieto, M.A., Navarro, G.: Compressed \(q\)-gram indexing for highly repetitive biological sequences. In: Proceedings of the BIBE 2010, pp. 86–91 (2010)
van Emde Boas, P.: Preserving order in a forest in less than logarithmic time. In: FOCS, pp. 75–84. IEEE Computer Society (1975)
Ferragina, P., Grossi, R.: Optimal on-line search and sublinear time update in string matching. SIAM J. Comput. 27(3), 713–736 (1998)
Ferragina, P., Manzini, G.: Opportunistic data structures with applications. In: FOCS, pp. 390–398. IEEE Computer Society (2000)
Gupta, A., Hon, W.K., Shah, R., Vitter, J.S.: Dynamic rank/select dictionaries with applications to XML indexing. Technical report 06–014, Purdue University (2006)
Hon, W.K., Lam, T.W., Sadakane, K., Sung, W.K., Yiu, S.M.: Compressed index for dynamic text. In: Data Compression Conference, pp. 102–111 (2004)
Leiserson, C.E., Prokop, H., Randall, K.H.: Using de Bruijn sequences to index a 1 in a computer word (1998) (unpublished manuscript)
Mäkinen, V., Navarro, G.: Succinct suffix arrays based on run-length encoding. Nord. J. Comput. 12(1), 40–66 (2005)
Manber, U., Myers, G.: Suffix arrays: a new method for on-line string searches. SIAM J. Comput. 22(5), 935–948 (1993)
Navarro, G.: Indexing text using the Ziv-Lempel trie. J. Discret. Algorithms 2(1), 87–114 (2004)
Navarro, G., Nekrich, Y.: Optimal dynamic sequence representations. In: Proceedings of the SODA 2013, pp. 865–876 (2013)
Rasmussen, K.R., Stoye, J., Myers, E.W.: Efficient \(q\)-gram filters for finding all \(\epsilon \)-matches over a given length. J. Comput. Biol. 13(2), 296–308 (2006)
Salson, M., Lecroq, T., Léonard, M., Mouchard, L.: Dynamic extended suffix arrays. J. Discret. Algorithms 8(2), 241–257 (2010)
Ukkonen, E.: On-line construction of suffix trees. Algorithmica 14(3), 249–260 (1995)
Weiner, P.: Linear pattern-matching algorithms. In: Proceedings of 14th IEEE Annual Symposium on Switching and Automata Theory, pp. 1–11 (1973)
Willard, D.E.: Log-logarithmic worst-case range queries are possible in space \(\Theta (N)\). Inf. Process. Lett. 17(2), 81–84 (1983)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Matsuoka, Y., I, T., Inenaga, S., Bannai, H., Takeda, M. (2015). Semi-dynamic Compact Index for Short Patterns and Succinct van Emde Boas Tree. In: Cicalese, F., Porat, E., Vaccaro, U. (eds) Combinatorial Pattern Matching. CPM 2015. Lecture Notes in Computer Science(), vol 9133. Springer, Cham. https://doi.org/10.1007/978-3-319-19929-0_30
Download citation
DOI: https://doi.org/10.1007/978-3-319-19929-0_30
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19928-3
Online ISBN: 978-3-319-19929-0
eBook Packages: Computer ScienceComputer Science (R0)