iBet uBet web content aggregator. Adding the entire web to your favor.
iBet uBet web content aggregator. Adding the entire web to your favor.



Link to original content: https://link.springer.com/doi/10.1007/3-540-44676-1_14
Finding Approximate Repetitions under Hamming Distance | SpringerLink
Skip to main content

Finding Approximate Repetitions under Hamming Distance

  • Conference paper
  • First Online:
Algorithms — ESA 2001 (ESA 2001)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2161))

Included in the following conference series:

Abstract

The problem of computing tandem repetitions with K possible mismatches is studied. Two main definitions are considered, and for both of them an O(nK log K + S) algorithm is proposed (S the size of the output). This improves, in particular, the bound obtained in [17].

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. G. Benson. An algorithm for finding tandem repeats of unspecified pattern size. In S. Istrail, P. Pevzner, and M. Waterman, editors, Proceedings of the 2nd Annual International Conference on Computational Molecular Biology (RECOMB 98), pages 20–29. ACM Press, 1998.

    Google Scholar 

  2. G. Benson. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research, 27(2):573–580, 1999.

    Article  MathSciNet  Google Scholar 

  3. G. Benson and M. Waterman. A method for fast database search for all k-nucleotide repeats. Nucleic Acids Research, 22:4828–4836, 1994.

    Article  Google Scholar 

  4. R. Cole and R. Hariharan. Approximate string matching: A simpler faster algorithm. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 463–472, San Francisco, California, 25–27 January 1998.

    Google Scholar 

  5. M. Crochemore and W. Rytter. Squares, cubes, and time-space efficient string searching. Algorithmica, 13:405–425, 1995.

    Article  MATH  MathSciNet  Google Scholar 

  6. M. Crochemore. An optimal algorithm for computing the repetitions in a word. Information Processing Letters, 12:244–250, 1981.

    Article  MATH  MathSciNet  Google Scholar 

  7. M. Crochemore. Recherche linéaire d’un carré dans un mot. Comptes Rendus Acad. Sci. Paris Sér. I Math., 296:781–784, 1983.

    MATH  MathSciNet  Google Scholar 

  8. M. Giraud and G. Kucherov. Maximal repetitions and application to DNA sequences. In Proceedings of the Journées Ouvertes: Biologie, Informatique et Mathématiques, pages 165–172, Montpelier, 3–5 mai 2000.

    Google Scholar 

  9. Z. Galil and J. Seiferas. Time-space optimal string matching. Journal of Computer and System Sciences, 26(3):280–294, 1983.

    Article  MathSciNet  Google Scholar 

  10. D. Gusfield. Algorithms on Strings, Trees, and Sequences. Computer Science and Computational Biology. Cambridge University Press, 1997.

    Google Scholar 

  11. C.S. Iliopoulos, D. Moore, and W.F. Smyth. A characterization of the squares in a Fibonacci string. Theoretical Computer Science, 172:281–291, 1997.

    Article  MATH  MathSciNet  Google Scholar 

  12. R. Kolpakov and G. Kucherov. Finding maximal repetitions in a word in linear time. In Proceedings of the 1999 Symposium on Foundations of Computer Science, New York (USA). IEEE Computer Society, October 17–19 1999.

    Google Scholar 

  13. R. Kolpakov and G. Kucherov. On maximal repetitions in words. In Proceedings of the 12-th International Symposium on Fundamentals of Computation Theory, 1999, Iasi (Romania), Lecture Notes in Computer Science, August 30–September 3 1999.

    Google Scholar 

  14. R. Kolpakov and G. Kucherov. Finding approximate repetitions under hamming distance. Technical Report RR-4163, INRIA, avril 2001. available at http://www.inria.fr/rrrt/rr-4163.html.

  15. S. R. Kosaraju. Computation of squares in string. In M. Crochemore and D. Gusfield, editors, Proceedings of the 5th Annual Symposium on Combinatorial Pattern Matching, number 807 in Lecture Notes in Computer Science, pages 146–150. Springer Verlag, 1994.

    Google Scholar 

  16. M. Lothaire. Combinatorics on Words, volume 17 of Encyclopedia of Mathematics and Its Applications. Addison Wesley, 1983.

    Google Scholar 

  17. G. Landau and J. Schmidt. An algorithm for approximate tandem repeats. In A. Apostolico, M. Crochemore, Z. Galil, and U. Manber, editors, Proceedings of the 4th Annual Symposium on Combinatorial Pattern Matching, number 684 in Lecture Notes in Computer Science, pages 120–133, Padova, Italy, 1993. Springer-Verlag, Berlin.

    Chapter  Google Scholar 

  18. G. Landau, J. Schmidt, and D. Sokol. An algorithm for approximate tandem repeats. Journal of Computational Biology, 8(1):1–18, 2001.

    Article  Google Scholar 

  19. M. G. Main. Detecting leftmost maximal periodicities. Discrete Applied Mathematics, 25:145–153, 1989.

    Article  MathSciNet  MATH  Google Scholar 

  20. M.G. Main and R.J. Lorentz. An O(n log n) algorithm for finding all repetitions in a string. Journal of Algorithms, 5(3):422–432, 1984.

    Article  MATH  MathSciNet  Google Scholar 

  21. E. Rivals, O. Delgrange, J-P. Delahaye, and M. Dauchet. A first step towards chromosome analysis by compression algorithms. In N.G. Bourbakis, editor, Proceedings of the 1st IEEE Symposium on Intelligence in Neural and Biological Systems (INBS), Herndon, VA. IEEE Computer Society, May 29–31 1995.

    Google Scholar 

  22. M. Rodeh, V.R. Pratt, and S. Even. Linear algorithm for data compression via string matching. Journal of the ACM, 28(1):16–24, Jan 1981.

    Google Scholar 

  23. J. Stoye and D. Gusfield. Linear time algorithms for finding and representing all the tandem repeats in a string. Technical Report CSE-98-4, Computer Science Department, University of California, Davis, 1998.

    Google Scholar 

  24. A.O. Slisenko. Detection of periodicities and string matching in real time. Journal of Soviet Mathematics, 22:1316–1386, 1983.

    Article  MATH  Google Scholar 

  25. M.-F. Sagot and E.W. Myers. Identifying satellites in nucleic acid sequences. In fS. Istrail, P. Pevzner, and M. Waterman, editors, Proceedings of the 2nd Annual International Conference on Computational Molecular Biology (RECOMB 98), pages 234–242. ACM Press, 1998.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kolpakov, R., Kucherov, G. (2001). Finding Approximate Repetitions under Hamming Distance. In: auf der Heide, F.M. (eds) Algorithms — ESA 2001. ESA 2001. Lecture Notes in Computer Science, vol 2161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44676-1_14

Download citation

  • DOI: https://doi.org/10.1007/3-540-44676-1_14

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42493-2

  • Online ISBN: 978-3-540-44676-7

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics