Abstract
We study algorithms for solving a problem of constructing a text (a long string) from a dictionary (a sequence of small strings). The problem has an application in bioinformatics and has a connection with the sequence assembly method for reconstructing a long DNA sequence from small fragments. Our problem is the construction a string t of length n using strings \(s^1,\dots , s^m\) with possible overlapping. Firstly, we provide a classical (randomized) algorithm with running time \(O\left( n+L +m(\log n)^2\right) =\tilde{O}(n+L)\) where L is the sum of lengths of \(s^1,\dots ,s^m\). Secondly, we provide a quantum algorithm with running time \(O\left( n +\log n\cdot (\log m+\log \log n)\cdot \sqrt{m\cdot L}\right) =\tilde{O}\left( n +\sqrt{m\cdot L}\right) \). Additionally, we show that the lower bound for a classical (randomized or deterministic) algorithm is \(\varOmega (n+L)\). Thus, our classical algorithm is optimal up to a log factor, and our quantum algorithm shows a speed-up when compared with any classical (randomized or deterministic) algorithm in the case of non-constant length of strings in the dictionary.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Ablayev F, Ablayev M, Huang JZ, Khadiev K, Salikhova N, Wu D (2019) Big Data Min Anal 3(1):41
Ambainis A (2018) In: Proc. int. conf. of math. 2018, vol 4, pp 3283–3304
Ambainis A, Balodis K, Iraids J, Khadiev K, Kļevickis V, Prūsis K, Shen Y, Smotrovs J, Vihrovs J (2020) In: 45th International symposium on mathematical foundations of computer science (MFCS 2020), Leibniz international proceedings in informatics (LIPIcs), vol 170, pp 8:1–8:14
Baichoo S, Ouzounis CA (2017) Biosystems 156:72
Bennett CH, Bernstein E, Brassard G, Vazirani U (1997) SIAM J Comput 26(5):1510
Bleidorn C (2016) Syst Biodivers 14(1):1
Boyer M, Brassard G, Høyer P, Tapp A (1998) Fortschr Phys 46(4–5):493
Cormen TH, Leiserson CE, Rivest RL, Stein C (2001) Introduction to algorithms. McGraw-Hill
de Wolf R (2001) Quantum computing and communication complexity
Freivalds R (1979) In: Mathematical foundations of computer science, LNCS, vol 74 (1979). LNCS 74:57–69
Glos A, Nahimovs N, Balakirev K, Khadiev K (2021) Quantum Inf Process 20(1):1
Grover LK (1996) In: Proceedings of the twenty-eighth annual ACM symposium on theory of computing. ACM, pp 212–219
Jordan S Quantum algorithms zoo (2021). http://www.quantumalgorithmzoo.org/
Kapralov R, Khadiev K, Mokut J, Shen Y, Yagafarov M (2020) arXiv preprint arXiv:2008.00270
Karp RM, Rabin MO (1987) IBM J Res Dev 31(2):249
Khadiev K, Ilikaev A (2019) International conference on theory and practice of. natural computing, pp 234–245
Khadiev K, Kravchenko D, Serov D (2019) In: Proceedings of CSR 2019, LNCS, vol 11532, pp 228–236
Khadiev K, Mannapov I, Safina L (2019) CEUR workshop proceedings 2500
Khadiev K, Safina L (2019) In: Proceedings of UCNC 2019. LNCS, vol 4362, pp 150–163
Kothari R (2014) In: 31st International symposium on theoretical aspects of computer science, p 482
Kravchenko D, Khadiev K, Serov D, Kapralov R (2020) Lect Notes Comput Sci 12448:83
Laaksonen A (2017) Guide to competitive programming. Springer
Li Z, Li J, Huo H (2018) In: String Processing and information retrieval. Springer International Publishing, Cham, pp 268–284
Lin CYY, Lin HH (2016) Theory Comput 12(18):1
Manber U, Myers G (1990) In: Proceedings of the first annual ACM-SIAM symposium on discrete algorithms (Society for Industrial and Applied Mathematics,) SODA ’90, pp 319–327
Montanaro A (2017) Algorithmica 77(1):16
Myers EW, Sutton GG, Delcher AL, Dew IM, Fasulo DP, Flanigan MJ, Kravitz SA, Mobarry CM, Reinert KH, Remington KA et al (2000) Science 287(5461):2196
Nielsen MA, Chuang IL (2010) Quantum computation and quantum information. Cambridge University Press
Pop M, Phillippy A, Delcher AL, Salzberg SL (2004) Brief Bioinform 5(3):237
Ramesh H, Vinay V (2003) J Discrete Algorithms 1(1):103
Shmilovici A, Ben-Gal I (2007) Comput Stat 22(1):49
Shor PW (1994) In: FOCS (IEEE Computer Society), pp 124–134
Voelkerding KV, Dames SA, Durtschi JD (2009) Clin Chem 55(4):641
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work was supported by Russian Science Foundation Grant 19-71-00149. Section 5 was funded by the subsidy allocated to Kazan Federal University for the state assignment in the sphere of scientific activities, Project No. 0671-2020-0065; and it has been supported by the Kazan Federal University Strategic Academic Leadership Program.and it has been supported by the Kazan Federal University Strategic Academic.
Appendices
Appendix A: Implementation of segment tree’s operations
Assume that the following elements associated with each vertex v of the segment tree:
-
g(v) is the target value for a segment tree. If it is not assigned, then \(g(v)=-\infty \)
-
h(v) is an additional value.
-
l(v) is the left border of the segment that is associated with the vertex v.
-
r(v) is the right border of the segment that is associated with the vertex v.
-
LeftC(v) is the left child of the vertex.
-
RightC(v) is the right child of the vertex
If a vertex v is a leaf, then \(LeftC(v)=RightC(v)=NULL\).
Let st is associated with the root of the tree.
Let us present the algorithm for Update operation. It is recursive procedure.
Let us present the algorithm for Push operation. It is recursive procedure.
Appendix B: Quantum algorithm for comparing two strings with different length
Let us present a quantum algorithm for Comparing two strings with possibly different length. There is a quantum algorithm for comparing two strings of the same length l:
Lemma 4
(Khadiev and Ilikaev 2019) There is a quantum algorithm that compares two strings of length l in the lexicographical order in \(O(\sqrt{l}\log \gamma )\) running time and \(O\left( \frac{1}{ \gamma ^3}\right) \) error probability for a positive integer \(\gamma \).
This algorithm is based on modification of Grover’s search algorithm (Grover 1996; Boyer et al. 1998) that finds the minimal argument satisfying the required property (Kothari 2014; Lin and Lin 2016; Kapralov et al. 2020). Similar idea was used, for example in Ambainis et al. (2020).
Let QC\(\textsc {ompare\_strings\_base}(u,v,l)\) be a quantum subroutine for comparing two strings u and v of length l in the lexicographical order. We choose \(\gamma = m\log n\). In fact, the procedure compares prefixes of u and v of the length l. QC\(\textsc {ompare\_strings\_base}(u,v,l)\) returns 1 if \(u>v\); it returns \(-1\) if \(u<v\); and it returns 0 if \(u=v\).
Next, we use a QC\(\textsc {ompare}(u,v)\) that compares u and v strings in the lexicographical order. Assume that \(\vert u\vert <\vert v\vert \). Then, if u is a prefix of v, then \(u<v\). If u is not a prefix of v, then the result is the same as for QC\(\textsc {ompare\_strings\_base}(u,v,\vert u\vert )\). In the case of \(\vert u\vert >\vert v\vert \), the algorithm is similar. The idea is presented in Algorithm 8.
Rights and permissions
About this article
Cite this article
Khadiev, K., Remidovskii, V. Classical and quantum algorithms for constructing text from dictionary problem. Nat Comput 20, 713–724 (2021). https://doi.org/10.1007/s11047-021-09863-1
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11047-021-09863-1