Link to original content: http://pubmed.ncbi.nlm.nih.gov/33833738/
Review
Front Microbiol. 2021 Mar 23:12:613791.
doi: 10.3389/fmicb.2021.613791. eCollection 2021.

Metagenomic Data Assembly - The Way of Decoding Unknown Microorganisms


Alla L Lapidus et al. Front Microbiol. 2021.

Abstract

Metagenomics is a segment of conventional microbial genomics dedicated to the sequencing and analysis of the combined genomic DNA of entire environmental samples. The most critical step of metagenomic data analysis is the reconstruction of individual genes and genomes of the microorganisms in a community using metagenomic assemblers, computational programs that piece together the small fragments of sequenced DNA generated by sequencing instruments. Here, we describe the challenges of metagenomic assembly, survey a wide spectrum of applications in which metagenomic assemblies have been used to better understand the ecology and evolution of microbial ecosystems, and present SPAdes, one of the most efficient microbial assemblers, which was upgraded to become applicable to metagenomics.
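To make "putting together small fragments" concrete, the following is a minimal sketch of the de Bruijn graph idea on which SPAdes-family assemblers are built. It is a toy illustration only, not the metaSPAdes algorithm: the function names `build_de_bruijn` and `extend_unambiguous`, the three reads, and the choice of k = 4 are all assumptions made for this example, and real assemblers add error correction, coverage handling, and graph simplification on top of this.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read.

    Hypothetical helper for illustration; every k-mer in a read contributes
    one edge from its prefix to its suffix.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def extend_unambiguous(graph, start):
    """Walk forward from `start` while there is exactly one distinct successor.

    Simplification: only out-branching is checked, so this is not a full
    unitig construction. The walk also stops if it revisits a node.
    """
    contig = start
    node = start
    seen = {start}
    while True:
        successors = set(graph.get(node, ()))
        if len(successors) != 1:
            break  # dead end or branch: stop extending
        node = successors.pop()
        if node in seen:
            break  # cycle: stop extending
        seen.add(node)
        contig += node[-1]
    return contig


# Three made-up overlapping reads (assumed data, not from the paper).
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_de_bruijn(reads, k=4)
contig = extend_unambiguous(graph, "ATG")
print(contig)  # ATGGCGTGCA
```

In a metagenome, reads from closely related strains produce branches in such a graph, which is one reason community assembly is harder than assembling a single isolate.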

Keywords: SPAdes; algorithms; metagenomic assembly; metagenomics; microbiota.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
High-level overview of the metaSPAdes assembly pipeline, with the crucial steps and data flow outlined. “AG” denotes an ordinary (strain-level) assembly graph and “CAG” a consensus assembly graph.
Figure 2
A pre-genomic tree of life representing the three main domains: Bacteria, Archaea, and Eukarya. Stylized image reproduced from Woese and Fox (1977).
Figure 3
Extended tree of life based on next-generation sequencing, including 92 bacterial phyla, 26 archaeal phyla, and all 5 eukaryotic supergroups. Stylized image reproduced from Hug et al. (2016).
