Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments
- PMID: 17654362
- DOI: 10.1080/10635150701472164
Abstract
Alignment quality may have as much impact on phylogenetic reconstruction as the phylogenetic methods used. Not only the alignment algorithm, but also the method used to deal with the most problematic alignment regions, may have a critical effect on the final tree. Although some authors remove such problematic regions, either manually or using automatic methods, in order to improve phylogenetic performance, others prefer to keep such regions to avoid losing any information. Our aim in the present work was to examine whether phylogenetic reconstruction improves after alignment cleaning. Using simulated protein alignments with gaps, we tested the relative performance in diverse phylogenetic analyses of the whole alignments versus the alignments with problematic regions removed with our previously developed Gblocks program. We also tested the performance of more or less stringent conditions in the selection of blocks. Alignments constructed with different alignment methods (ClustalW, Mafft, and Probcons) were used to estimate phylogenetic trees by maximum likelihood, neighbor joining, and parsimony. We show that, in most alignment conditions, and for alignments that are not too short, removal of blocks leads to better trees. That is, despite losing some information, there is an increase in the actual phylogenetic signal. Overall, the best trees are obtained by maximum-likelihood reconstruction of alignments cleaned by Gblocks. In general, a relaxed selection of blocks is better for short alignments, whereas a stringent selection is more appropriate for longer ones. Finally, we show that cleaned alignments produce better topologies although, paradoxically, with lower bootstrap support. This indicates that divergent and problematic alignment regions may lead, when present, to apparently better supported although, in fact, more biased topologies.
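To make the idea of "cleaning" an alignment concrete, the following Python sketch filters columns of a toy protein alignment and then discards short surviving blocks. It is a deliberately simplified illustration and not the Gblocks algorithm: the function names, the `min_identity` and `min_block` parameters, and the toy sequences are all assumptions introduced here. Raising both thresholds mimics a more stringent selection of blocks; lowering them mimics a relaxed one.

```python
from collections import Counter


def conserved_mask(seqs, min_identity=0.5, gap_chars="-"):
    """Flag each column True if it has no gaps and its most common
    residue reaches the min_identity fraction (hypothetical criterion)."""
    ncol = len(seqs[0])
    if any(len(s) != ncol for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    mask = []
    for i in range(ncol):
        column = [s[i] for s in seqs]
        if any(c in gap_chars for c in column):
            mask.append(False)  # any gap makes the column unreliable here
            continue
        top_count = Counter(column).most_common(1)[0][1]
        mask.append(top_count / len(column) >= min_identity)
    return mask


def drop_short_blocks(mask, min_block=2):
    """Unflag runs of kept columns that are shorter than min_block."""
    kept = list(mask)
    start = None
    # A trailing False sentinel closes a run that reaches the last column.
    for i, flag in enumerate(list(mask) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start < min_block:
                for j in range(start, i):
                    kept[j] = False
            start = None
    return kept


def trim_alignment(seqs, min_identity=0.5, min_block=2):
    """Return the sequences restricted to the columns that survive both filters."""
    mask = drop_short_blocks(conserved_mask(seqs, min_identity), min_block)
    return ["".join(c for c, keep in zip(s, mask) if keep) for s in seqs]


# Toy alignment (invented for illustration): column 6 contains gaps,
# column 7 is gap-free but divergent.
toy = [
    "MKVLA-TGHW",
    "MKVLAQSGHW",
    "MKILA-PGHW",
    "MRVLAETGHW",
]

relaxed = trim_alignment(toy, min_identity=0.5, min_block=2)
stringent = trim_alignment(toy, min_identity=0.75, min_block=4)
print(relaxed)
print(stringent)
```

With the relaxed settings only the gapped column is removed. With the stringent settings the divergent column is also dropped, and the short block that remains after it is discarded too, so fewer sites reach the tree-building step. That is the trade-off between lost information and cleaner signal that the abstract describes.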