Abstract
The complete genomes of living organisms have provided much information on their phylogenetic relationships. In the past few years, we proposed three alternative methods to model the noise background in the composition vector of protein sequences from a complete genome. The first method is based on the frequencies of the 20 kinds of amino acids appearing in the genome and the multiplicative model. The second method is based on the iterated function system model in fractal geometry. The last method is based on the relationship between a word and its two sub-words in the theory of symbolic dynamics. Here we introduce these methods. The complete genomes of prokaryotes and eukaryotes are selected to test these algorithms. Our distance-based phylogenetic tree of prokaryotes and eukaryotes agrees with the biologists’ “tree of life” based on the 16S-like rRNA genes in a majority of basic branches and most lower taxa.
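The first and third noise-background models named above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything beyond what the abstract states is an assumption: the function names are hypothetical, the relative-difference form of the denoised vector and the (1 - cosine)/2 distance follow the usual composition-vector convention, and the toy sequences are invented for illustration. The iterated function system model is omitted.

```python
from collections import Counter
from math import prod, sqrt


def kmer_freqs(proteins, k):
    """Relative frequencies of overlapping k-strings over all protein sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def multiplicative_background(proteins, words):
    """Method 1 (sketch): expected word frequency as the product of the
    frequencies of its amino acids, i.e. residues treated as independent."""
    aa = kmer_freqs(proteins, 1)
    return {w: prod(aa.get(ch, 0.0) for ch in w) for w in words}


def subword_background(proteins, k, words):
    """Method 3 (sketch): expected frequency of a k-word from its two
    (k-1)-sub-words, divided by the frequency of their shared (k-2)-core.
    Requires k >= 3."""
    p1 = kmer_freqs(proteins, k - 1)
    p2 = kmer_freqs(proteins, k - 2)
    out = {}
    for w in words:
        core = p2.get(w[1:-1], 0.0)
        out[w] = p1.get(w[:-1], 0.0) * p1.get(w[1:], 0.0) / core if core > 0 else 0.0
    return out


def denoised_vector(observed, background, words):
    """Relative difference (p - p0) / p0 for each word; 0 where p0 is 0.
    Assumed form, restricted to the supplied words rather than all 20**k."""
    vec = {}
    for w in words:
        p0 = background.get(w, 0.0)
        vec[w] = (observed.get(w, 0.0) - p0) / p0 if p0 > 0 else 0.0
    return vec


def correlation_distance(x, y):
    """(1 - cosine similarity) / 2 between two vectors sharing the same keys."""
    dot = sum(v * y.get(w, 0.0) for w, v in x.items())
    nx = sqrt(sum(v * v for v in x.values()))
    ny = sqrt(sum(v * v for v in y.values()))
    if nx == 0 or ny == 0:
        return 0.5  # no signal left after denoising: treat as uncorrelated
    return (1 - dot / (nx * ny)) / 2


# Invented toy "proteomes"; real input would be all proteins of a genome.
genome_a = ["MKVLAAGIVK", "MKKLLAAGV"]
genome_b = ["MSTNPKPQRK", "MKVLSAGIV"]
K = 3

obs_a, obs_b = kmer_freqs(genome_a, K), kmer_freqs(genome_b, K)
words = sorted(set(obs_a) | set(obs_b))  # simplification: observed words only

vec_a = denoised_vector(obs_a, multiplicative_background(genome_a, words), words)
vec_b = denoised_vector(obs_b, multiplicative_background(genome_b, words), words)
dist_ab = correlation_distance(vec_a, vec_b)
print(f"distance(a, b) = {dist_ab:.4f}")
```

A matrix of such pairwise distances is the kind of input a distance-based tree method such as neighbor-joining takes. Swapping `multiplicative_background` for `subword_background(genome, K, words)` changes only the noise model, which is the comparison the abstract describes.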
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yu, ZG., Anh, V., Zhou, LQ. (2005). Fractal and Dynamical Language Methods to Construct Phylogenetic Tree Based on Protein Sequences from Complete Genomes. In: Wang, L., Chen, K., Ong, Y.S. (eds) Advances in Natural Computation. ICNC 2005. Lecture Notes in Computer Science, vol 3612. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11539902_40
DOI: https://doi.org/10.1007/11539902_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28320-1
Online ISBN: 978-3-540-31863-7
eBook Packages: Computer Science (R0)