Link to original content: http://www.ncbi.nlm.nih.gov/pubmed/26072508
ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes
Bioinformatics. 2015 Jun 15;31(12):i44-52.
doi: 10.1093/bioinformatics/btv234.

Siavash Mirarab et al. Bioinformatics. 2015.

Abstract

Motivation: The estimation of species phylogenies requires multiple loci, since different loci can have different trees due to incomplete lineage sorting, modeled by the multi-species coalescent model. We recently developed a coalescent-based method, ASTRAL, which is statistically consistent under the multi-species coalescent model and which is more accurate than other coalescent-based methods on the datasets we examined. ASTRAL runs in polynomial time, by constraining the search space using a set of allowed 'bipartitions'. Despite the limitation to allowed bipartitions, ASTRAL is statistically consistent.
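To make the idea of a set of allowed bipartitions concrete, the following is a minimal sketch, not the ASTRAL implementation. It assumes, for illustration only, that trees are written as nested Python tuples of taxon names and that the allowed set X is simply every bipartition observed in the input gene trees. The names leaf_set, bipartitions and allowed_bipartitions are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]
Bipartition = FrozenSet[FrozenSet[str]]


def leaf_set(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names at the leaves of a tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree: Tree) -> Set[Bipartition]:
    """Return the non-trivial bipartitions (both sides have at least 2 taxa) of a tree."""
    taxa = leaf_set(tree)
    found: Set[Bipartition] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        other = taxa - below
        # The edge above this node splits the taxa into `below` and `other`.
        if len(below) > 1 and len(other) > 1:
            found.add(frozenset([below, other]))
        return below

    visit(tree)
    return found


def allowed_bipartitions(gene_trees: Iterable[Tree]) -> Set[Bipartition]:
    """Assumed construction of X: the union of bipartitions over all input gene trees."""
    allowed: Set[Bipartition] = set()
    for gene_tree in gene_trees:
        allowed |= bipartitions(gene_tree)
    return allowed


gene_trees = [(("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))]
X = allowed_bipartitions(gene_trees)
print(len(X))  # number of distinct allowed bipartitions
```

Under this assumption, a search restricted to X can only return species trees whose internal edges all correspond to bipartitions in X, which is what keeps the search space small.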

Results: We present a new version of ASTRAL, which we call ASTRAL-II. We show that ASTRAL-II has substantial advantages over ASTRAL: it is faster, can analyze much larger datasets (up to 1000 species and 1000 genes) and has substantially better accuracy under some conditions. ASTRAL's running time is [Formula: see text], and ASTRAL-II's running time is [Formula: see text], where n is the number of species, k is the number of loci and X is the set of allowed bipartitions for the search space.

Availability and implementation: ASTRAL-II is available in open source at https://github.com/smirarab/ASTRAL and datasets used are available at http://www.cs.utexas.edu/~phylo/datasets/astral2/.

Contact: smirarab@gmail.com

Supplementary information: Supplementary data are available at Bioinformatics online.


Figures

Fig. 1.
Characteristics of the simulation. (a) RF distance between the true species tree and the true gene trees (50 replicates of 1000 genes) for Dataset I. Tree height directly affects the amount of true discordance; the speciation rate affects true gene tree discordance only at the 10 M tree length. The number of taxa has a modest effect on the discordance (see Supplementary Fig. S13). (b) RF distance between true gene trees and estimated gene trees for Dataset I. See also Supplementary Fig. S1 for inter- and intra-replicate gene tree error distributions.
Fig. 2.
Comparison of methods with respect to species tree topological accuracy. (Top) Two hundred taxa with varying tree shapes and numbers of genes. (Bottom) Varying numbers of taxa and genes, with tree shape fixed to 2 M/1e-6. ASTRAL-II is always at least as accurate as NJst and MP-EST.
Fig. 3.
Running time comparison with varying numbers of taxa and genes (Dataset II). Average running time is shown for NJst and ASTRAL-II. Note that ASTRAL-II is much faster on large datasets.
Fig. 4.
Comparison of ASTRAL-II run using estimated and true gene trees and CA-ML on Dataset I
Fig. 5.
Comparison of species tree accuracy with 200 taxa, divided into three categories of gene tree estimation error. Boxes show number of genes
Fig. 6.
Comparison of species trees computed on the angiosperm dataset of Xi et al. (2014). MP-EST and ASTRAL-II differ in the placement of Amborella; the concatenation tree agrees with ASTRAL-II
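The figure captions above report RF (Robinson-Foulds) distances between trees. As a rough illustration of that measure, and not the evaluation code used in the paper, the sketch below counts the bipartitions found in exactly one of two trees on the same taxa. Trees are assumed to be nested Python tuples of taxon names; the normalization (dividing by the total number of non-trivial bipartitions in both trees) is one common convention and is an assumption here. The names splits, rf_distance and normalized_rf are hypothetical.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]
Split = FrozenSet[FrozenSet[str]]


def splits(tree: Tree) -> Set[Split]:
    """Collect the non-trivial bipartitions (splits) induced by the edges of a tree."""
    found: Set[Split] = set()
    clades = []

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        clades.append(below)
        return below

    taxa = visit(tree)
    for below in clades:
        other = taxa - below
        # Keep only splits with at least two taxa on each side.
        if len(below) > 1 and len(other) > 1:
            found.add(frozenset([below, other]))
    return found


def rf_distance(tree_a: Tree, tree_b: Tree) -> int:
    """Number of splits present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


def normalized_rf(tree_a: Tree, tree_b: Tree) -> float:
    """RF distance divided by the total number of non-trivial splits in both trees."""
    total = len(splits(tree_a)) + len(splits(tree_b))
    return rf_distance(tree_a, tree_b) / total if total else 0.0


species_tree = (("A", "B"), ("C", ("D", "E")))
gene_tree = (("A", "C"), ("B", ("D", "E")))
print(rf_distance(species_tree, gene_tree), normalized_rf(species_tree, gene_tree))
```

In this toy example the two trees share the split DE|ABC and differ in one split each, so half of all splits are in disagreement.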

