iBet uBet web content aggregator. Adding the entire web to your favor.
iBet uBet web content aggregator. Adding the entire web to your favor.



Link to original content: https://doi.org/10.1007/978-3-319-46049-9_12
Parallel Computation for the All-Pairs Suffix-Prefix Problem | SpringerLink
Skip to main content

Parallel Computation for the All-Pairs Suffix-Prefix Problem

  • Conference paper
  • First Online:
String Processing and Information Retrieval (SPIRE 2016)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 9954))

Included in the following conference series:

Abstract

We show how to parallelize the optimal algorithm proposed by Tustumi et al. [19] to solve the all-pairs suffix-prefix matching problem for general alphabets. We compared our parallel algorithm with \(\mathsf {SOF}\)  [17], a practical solution for DNA sequences that exhibits good time and space performance in multithreading environments. The experimental results showed that our parallel algorithm achieves a consistent speedup when compared with the sequential algorithm, and it is competitive with \(\mathsf {SOF}\) when the minimum overlap length is small.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    sdsl-lite library is available at https://github.com/simongog/sdsl-lite.

  2. 2.

    https://github.com/felipelouza/apsp.

  3. 3.

    http://confluence.qu.edu.qa/download/attachments/9240580/Prefix.tgz.

  4. 4.

    http://www.uni-ulm.de/in/theo/research/seqana.html.

  5. 5.

    ftp://ftp.bioinfo.wsu.edu/www.citrusgenomedb.org/Citrus_sinensis/C.sinesis_unigene_v1.0/.

  6. 6.

    malloc_count library is available at http://panthema.net/2013/malloc_count.

References

  1. Abouelhoda, M.I., Kurtz, S., Ohlebusch, E.: Replacing suffix trees with enhanced suffix arrays. J. Discrete Algorithms 2(1), 53–86 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  2. Dinh, H., Rajasekaran, S.: A memory-efficient data structure representing exact-match overlap graphs with application for next-generation DNA assembly. Bioinformatics 27(14), 1901–1907 (2011)

    Article  Google Scholar 

  3. El-Metwally, S., Hamza, T., Zakaria, M., Helmy, M.: Next-generation sequence assembly: four stages of data processing and computational challenges. PLoS Comput. Biol. 9(12), e1003345 (2013)

    Article  Google Scholar 

  4. Gog, S., Beller, T., Moffat, A., Petri, M.: From theory to practice: plug and play with succinct data structures. In: Gudmundsson, J., Katajainen, J. (eds.) SEA 2014. LNCS, vol. 8504, pp. 326–337. Springer, Heidelberg (2014)

    Google Scholar 

  5. Gonnella, G., Kurtz, S.: Readjoiner: a fast and memory efficient string graph-based sequence assembler. BMC Bioinform. 13(1), 82 (2012)

    Article  Google Scholar 

  6. Gonnet, G.H., Baeza-Yates, R.A., Snider, T.: New indices for text: pat trees and pat arrays. In: Information Retrieval, pp. 66–82. Prentice-Hall Inc, Upper Saddle River (1992)

    Google Scholar 

  7. Gusfield, D.: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge University Press, New York (1997)

    Book  MATH  Google Scholar 

  8. Gusfield, D., Landau, G.M., Schieber, B.: An efficient algorithm for the all pairs suffix-prefix problem. Inf. Process. Lett. 41(4), 181–185 (1992)

    Article  MathSciNet  MATH  Google Scholar 

  9. Kalyanaraman, A., Aluru, S.: Handbook of computational molecular biology, chap. In: Expressed Sequence Tags: Clustering and applications. CRC Press, Boca Raton (2005)

    Google Scholar 

  10. Kasai, T., Lee, G.H., Arimura, H., Arikawa, S., Park, K.: Linear-time longest-common-prefix computation in suffix arrays and its applications. In: Amir, A., Landau, G.M. (eds.) CPM 2001. LNCS, vol. 2089, pp. 181–192. Springer, Heidelberg (2001)

    Chapter  Google Scholar 

  11. Louza, F.A., Gog, S., Telles, G.P.: Induced suffix sorting for string collections. In: Proceeding DCC, pp. 43–52. IEEE, Snowbird (2016)

    Google Scholar 

  12. Louza, F.A., Telles, G.P., Ciferri, C.D.D.A.: External memory generalized suffix and LCP arrays construction. In: Fischer, J., Sanders, P. (eds.) CPM 2013. LNCS, vol. 7922, pp. 201–210. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  13. Manber, U., Myers, E.W.: Suffix arrays: a new method for on-line string searches. SIAM J. Comput. 22(5), 935–948 (1993)

    Article  MathSciNet  MATH  Google Scholar 

  14. Ohlebusch, E.: Bioinformatics Algorithms: Sequence Analysis, Genome Rearrangements, and Phylogenetic Reconstruction. Verlag, Oldenbusch (2013)

    MATH  Google Scholar 

  15. Ohlebusch, E., Gog, S.: Efficient algorithms for the all-pairs suffix-prefix problem and the all-pairs substring-prefix problem. Inf. Process. Lett. 110(3), 123–128 (2010)

    Article  MathSciNet  MATH  Google Scholar 

  16. Puglisi, S.J., Smyth, W.F., Turpin, A.H.: A taxonomy of suffix array construction algorithms. ACM Comp. Surv. 39(2), 1–31 (2007)

    Article  Google Scholar 

  17. Rachid, M.H., Malluhi, Q.: A practical and scalable tool to find overlaps between sequences. BioMed Res. Int. 2015, 1–12 (2015)

    Google Scholar 

  18. Simpson, J.T., Durbin, R.: Efficient construction of an assembly string graph using the FM-index. Bioinformatics 26(12), i367–i373 (2010)

    Article  Google Scholar 

  19. Tustumi, W.H., Gog, S., Telles, G.P., Louza, F.A.: An improved algorithm for the all-pairs suffix-prefix problem. J. Discrete Algorithms 47, 34–43 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  20. Weiner, P.: Linear pattern matching algorithms. In: Proceeding Annual Symposium on Switching and Automata Theory, pp. 1–11. IEEE Computer Society, Washington, DC (1973)

    Google Scholar 

Download references

Acknowledgments

FAL acknowledges the financial support CAPES and CNPq (grant No. 162338/2015-5). GPT acknowledges the support of CNPq. The authors thank Prof. Nalvo Almeida for granting access to the machine used for the experiments.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Felipe A. Louza .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Louza, F.A., Gog, S., Zanotto, L., Araujo, G., Telles, G.P. (2016). Parallel Computation for the All-Pairs Suffix-Prefix Problem. In: Inenaga, S., Sadakane, K., Sakai, T. (eds) String Processing and Information Retrieval. SPIRE 2016. Lecture Notes in Computer Science(), vol 9954. Springer, Cham. https://doi.org/10.1007/978-3-319-46049-9_12

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-46049-9_12

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46048-2

  • Online ISBN: 978-3-319-46049-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics