Abstract
The approximate period recovery problem asks to compute all approximate word-periods of a given word S of length n: all primitive words P (\(|P|=p\)) which have a periodic extension at edit distance smaller than \(\tau _p\) from S, where \(\tau _p = \lfloor \frac{n}{(3.75+\epsilon )\cdot p} \rfloor \) for some \(\epsilon >0\). Here, the set of periodic extensions of P consists of all finite prefixes of \(P^\infty \).
We improve the time complexity of the fastest known algorithm for this problem of Amir et al. [Theor. Comput. Sci., 2018] from \({\mathcal {O}}(n^{4/3})\) to \({\mathcal {O}}(n \log n)\). Our tool is a fast algorithm for Approximate Pattern Matching in Periodic Text. We consider only verification for the period recovery problem when the candidate approximate word-period P is explicitly given up to cyclic rotation; the algorithm of Amir et al. reduces the general problem in \({\mathcal {O}}(n)\) time to a logarithmic number of such more specific instances.
J. Radoszewski and J. Straszyński—Supported by the “Algorithms for text processing with errors and uncertainties” project carried out within the HOMING programme of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund.
W. Rytter—Supported by the Polish National Science Center, grant no. 2014/13/B/ST6/00770.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Amir, A., Amit, M., Landau, G.M., Sokol, D.: Period recovery of strings over the Hamming and edit distances. Theor. Comput. Sci. 710, 2–18 (2018). https://doi.org/10.1016/j.tcs.2017.10.026
Amir, A., Eisenberg, E., Levy, A., Porat, E., Shapira, N.: Cycle detection and correction. ACM Trans. Algorithms 9(1), 13:1–13:20 (2012). https://doi.org/10.1145/2390176.2390189
Amit, M., Crochemore, M., Landau, G.M., Sokol, D.: Locating maximal approximate runs in a string. Theor. Comput. Sci. 700, 45–62 (2017). https://doi.org/10.1016/j.tcs.2017.07.021
Apostolico, A., Ehrenfeucht, A.: Efficient detection of quasiperiodicities in strings. Theor. Comput. Sci. 119(2), 247–265 (1993). https://doi.org/10.1016/0304-3975(93)90159-Q
Apostolico, A., Farach, M., Iliopoulos, C.S.: Optimal superprimitivity testing for strings. Inf. Process. Lett. 39(1), 17–20 (1991). https://doi.org/10.1016/0020-0190(91)90056-N
Breslauer, D.: An on-line string superprimitivity test. Inf. Process. Lett. 44(6), 345–347 (1992). https://doi.org/10.1016/0020-0190(92)90111-8
Crochemore, M., Hancart, C., Lecroq, T.: Algorithms on Strings. Cambridge University Press, Cambridge (2007). https://doi.org/10.1017/cbo9780511546853
Crochemore, M., Rytter, W.: Jewels of Stringology. World Scientific, Singapore (2003). https://doi.org/10.1142/4838
Fischetti, V.A., Landau, G.M., Sellers, P.H., Schmidt, J.P.: Identifying periodic occurrences of a template with applications to protein structure. Inf. Process. Lett. 45(1), 11–18 (1993). https://doi.org/10.1016/0020-0190(93)90245-5
Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997). https://doi.org/10.1017/cbo9780511574931
Kociumaka, T., Kubica, M., Radoszewski, J., Rytter, W., Waleń, T.: A linear time algorithm for seeds computation. In: Rabani, Y. (ed.) 23rd Annual ACM-SIAM Symposium on Discrete Algorithms. SODA 2012, pp. 1095–1112. SIAM (2012). https://doi.org/10.1137/1.9781611973099
Kolpakov, R.M., Kucherov, G.: Finding approximate repetitions under Hamming distance. Theor. Comput. Sci. 303(1), 135–156 (2003). https://doi.org/10.1016/s0304-3975(02)00448-6
Landau, G.M., Vishkin, U.: Fast parallel and serial approximate string matching. J. Algorithms 10(2), 157–169 (1989). https://doi.org/10.1016/0196-6774(89)90010-2
Li, Y., Smyth, W.F.: Computing the cover array in linear time. Algorithmica 32(1), 95–106 (2002). https://doi.org/10.1007/s00453-001-0062-2
Popov, V.Y.: The approximate period problem for DNA alphabet. Theor. Comput. Sci. 304(1–3), 443–447 (2003). https://doi.org/10.1016/S0304-3975(03)00211-1
Sim, J.S., Iliopoulos, C.S., Park, K., Smyth, W.F.: Approximate periods of strings. Theor. Comput. Sci. 262(1), 557–568 (2001). https://doi.org/10.1016/S0304-3975(00)00365-0
Sokol, D., Benson, G., Tojeira, J.: Tandem repeats over the edit distance. Bioinformatics 23(2), 30–35 (2007). https://doi.org/10.1093/bioinformatics/btl309
Sokol, D., Tojeira, J.: Speeding up the detection of tandem repeats over the edit distance. Theor. Comput. Sci. 525, 103–110 (2014). https://doi.org/10.1016/j.tcs.2013.04.021
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kociumaka, T., Radoszewski, J., Rytter, W., Straszyński, J., Waleń, T., Zuba, W. (2018). Faster Recovery of Approximate Periods over Edit Distance. In: Gagie, T., Moffat, A., Navarro, G., Cuadros-Vargas, E. (eds) String Processing and Information Retrieval. SPIRE 2018. Lecture Notes in Computer Science(), vol 11147. Springer, Cham. https://doi.org/10.1007/978-3-030-00479-8_19
Download citation
DOI: https://doi.org/10.1007/978-3-030-00479-8_19
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00478-1
Online ISBN: 978-3-030-00479-8
eBook Packages: Computer ScienceComputer Science (R0)