Phylogenomic inference of protein molecular function: advances and challenges
- PMID: 14734307
- DOI: 10.1093/bioinformatics/bth021
Abstract
Motivation: Protein families evolve a multiplicity of functions through gene duplication, speciation and other processes. As a number of studies have shown, standard methods of protein function prediction produce systematic errors on such families. Phylogenomic analysis, which combines phylogenetic tree construction, integration of experimental data and differentiation of orthologs and paralogs, has been proposed to address these errors and improve the accuracy of functional classification. The explicit integration of structure prediction and analysis in this framework, which we call structural phylogenomics, provides additional insights into protein superfamily evolution.
Results: Protein functional classification using phylogenomic analysis shows fewer expected false positives overall than pairwise methods of functional classification. We present an overview of the motivations and fundamental principles of phylogenomic analysis, new methods developed for the key tasks and benchmark datasets for these tasks (when available), and we suggest procedures to increase accuracy. We also discuss some of the methods used in the Celera Genomics high-throughput phylogenomic classification of the human genome.
Availability: Software tools from the Berkeley Phylogenomics Group are available at http://phylogenomics.berkeley.edu