Abstract
In this article, we develop DTAST, a novel radical framework for de novo transcriptome assembly based on suffix trees. DTAST extends each contig with the reads that have the longest overlaps with the contig's ends. A purpose-built suffix tree structure finds these reads in time linear in the read length. DTAST also offers two strategies for extracting transcript-representing paths: a depth-first enumeration strategy and a hybrid strategy based on length and coverage. In our experiments, DTAST was more competitive than the other state-of-the-art de novo assemblers it was compared with. The software, which supports either strategy, is available at https://github.com/Jane110111107/DTAST.
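The contig-extension idea in the abstract can be illustrated with a minimal sketch. It is not the DTAST implementation: a plain dictionary of read prefixes stands in for the suffix tree, so the lookup here does not achieve the linear-time bound claimed above. The function names, the `min_overlap` parameter, and the toy reads are all assumptions made for illustration.

```python
from collections import defaultdict


def build_prefix_index(reads, min_overlap):
    """Map each proper read prefix of length >= min_overlap to the reads starting with it.

    Assumption: this dictionary is a simple stand-in for a suffix tree.
    """
    index = defaultdict(list)
    for read in reads:
        for k in range(min_overlap, len(read)):
            index[read[:k]].append(read)
    return index


def longest_overlap_read(contig, index, max_read_len, min_overlap, used):
    """Return (read, overlap) for an unused read whose prefix matches the longest contig suffix."""
    longest = min(max_read_len - 1, len(contig))
    for k in range(longest, min_overlap - 1, -1):
        for read in index.get(contig[-k:], []):
            if read not in used:
                return read, k
    return None, 0


def extend_right(seed, reads, min_overlap=3):
    """Greedily extend the seed to the right with the longest-overlapping unused read."""
    index = build_prefix_index(reads, min_overlap)
    max_read_len = max(len(r) for r in reads)
    contig = seed
    used = {seed}
    while True:
        read, k = longest_overlap_read(contig, index, max_read_len, min_overlap, used)
        if read is None:
            break
        used.add(read)
        contig += read[k:]  # append only the part of the read past the overlap
    return contig


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGG", "ACGGTT"]  # hypothetical reads
    print(extend_right("ACGTAC", toy_reads))
```

Each read is used at most once, so the loop terminates. A real assembler would also extend to the left, handle reverse complements and sequencing errors, and branch where several reads overlap equally well.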
This work is supported by the National Natural Science Foundation of China under grants No. 61672325, No. 61472222, No. 61732009, and No. 61761136017.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Zhao, J., Feng, H., Zhu, D., Zhang, C., Xu, Y. (2018). DTAST: A Novel Radical Framework for de Novo Transcriptome Assembly Based on Suffix Trees. In: Huang, DS., Bevilacqua, V., Premaratne, P., Gupta, P. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10954. Springer, Cham. https://doi.org/10.1007/978-3-319-95930-6_75
DOI: https://doi.org/10.1007/978-3-319-95930-6_75
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95929-0
Online ISBN: 978-3-319-95930-6
eBook Packages: Computer Science, Computer Science (R0)