Summary
The post-genomic era has seen a significant increase in the use of computational prediction methods to gain insight into the structure and function of proteins. Prediction tools guide the design of experiments that test hypotheses about the structure and function of known proteins, but they are particularly useful for studying putative protein sequences with no known function. The genomic era produced a large number of sequences annotated either as hypothetical proteins or as proteins of unknown function. Current molecular biology techniques cannot study this vast reservoir of genetic information efficiently, whereas computer algorithms can process large amounts of sequence data to predict structure and function. These knowledge-based computational tools draw on available experimental data and are regularly updated to improve their predictive power. The simplest form of function prediction compares the query sequence against all available sequences using BLAST. If the query sequence is highly similar to previously characterized proteins, it is likely to have similar functions. If, however, the query sequence has no homolog of known function, more sophisticated computational tools are needed to gain insight into its structure and function. Various methods have been developed to search for known domains, motifs, patterns, or profiles. The quality of the predictions depends on the type of tools used and is limited by how closely the query sequence resembles known proteins.
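The similarity-based first step described above can be sketched in a few lines. The example below is only an illustration, not the procedure used in the chapter: it assumes BLAST results are available in the standard 12-column tabular format, and the function names, the thresholds (40% identity, 70% query coverage, E-value of 1e-5), the subject identifiers, and the annotation label are all hypothetical. A return value of None stands for the case where no characterized homolog is found and domain, motif, pattern, or profile searches would be needed instead.

```python
from typing import Dict, List, NamedTuple, Optional


class Hit(NamedTuple):
    subject: str    # identifier of the database sequence
    pident: float   # percent identity over the alignment
    qstart: int     # first aligned query position (1-based)
    qend: int       # last aligned query position (1-based)
    evalue: float   # expectation value reported by BLAST


def parse_blast_tabular(text: str) -> List[Hit]:
    """Parse 12-column BLAST tabular output (qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore)."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hits.append(Hit(f[1], float(f[2]), int(f[6]), int(f[7]), float(f[10])))
    return hits


def transfer_annotation(
    hits: List[Hit],
    annotations: Dict[str, str],
    query_len: int,
    min_identity: float = 40.0,   # illustrative threshold, an assumption
    min_coverage: float = 0.7,    # illustrative threshold, an assumption
    max_evalue: float = 1e-5,     # illustrative threshold, an assumption
) -> Optional[str]:
    """Return the annotation of the best characterized hit, or None if no hit
    is similar enough to justify transferring a function."""
    for hit in sorted(hits, key=lambda h: h.evalue):
        coverage = (hit.qend - hit.qstart + 1) / query_len
        if (
            hit.subject in annotations
            and hit.pident >= min_identity
            and coverage >= min_coverage
            and hit.evalue <= max_evalue
        ):
            return annotations[hit.subject]
    return None  # no characterized homolog: fall back to domain/motif tools


# Made-up hits for a hypothetical 130-residue query.
EXAMPLE_HITS = "\n".join([
    "query1\tsubjA\t62.5\t120\t45\t0\t3\t122\t1\t120\t1e-40\t150",
    "query1\tsubjB\t28.0\t50\t36\t0\t10\t59\t5\t54\t0.5\t30",
])

if __name__ == "__main__":
    hits = parse_blast_tabular(EXAMPLE_HITS)
    print(transfer_annotation(hits, {"subjA": "toy annotation"}, query_len=130))
```

In practice the cutoffs would have to be chosen for the protein family and database at hand; the values here only make the decision logic concrete.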
In this chapter, we describe and discuss the methods and tools we used to predict the structure and function of a putative protein sequence (Msa) with unknown function. We address the advantages and limitations of these approaches using the Msa protein from the human pathogen Staphylococcus aureus as a case study. Msa is a novel protein that is involved in the regulation of virulence. Since Msa has no known homolog, computational tools are being used to predict its structure and mechanism of action. These predictions are used to design experiments to study Msa and to explore its use as a therapeutic target to combat antibiotic-resistant infections.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Nagarajan, V., Elasri, M.O. (2008). Case Study: Structure and Function Prediction of a Protein with No Functionally Characterized Homolog. In: Smolinski, T.G., Milanova, M.G., Hassanien, AE. (eds) Computational Intelligence in Biomedicine and Bioinformatics. Studies in Computational Intelligence, vol 151. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70778-3_16
DOI: https://doi.org/10.1007/978-3-540-70778-3_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70776-9
Online ISBN: 978-3-540-70778-3
eBook Packages: Engineering, Engineering (R0)