Abstract
The protein interactome is an important research focus in the post-genomic era. Identifying interacting motif pairs is essential for understanding the mechanisms of protein interactions. We describe a stochastic AdaBoost approach for discovering motif pairs from known interactions and from pairs of proteins that putatively do not interact. The interacting motif pairs we identify are validated against multiple-chain PDB structures and are more significant than those selected by a traditional statistical method. Furthermore, in a cross-validated comparison, our model predicts interactions between proteins with higher sensitivity (66.42%) and specificity (87.38%) than the Naive Bayes model and the dominating model.
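To make the setting concrete, the following is a minimal sketch of standard discrete AdaBoost with one-feature decision stumps, followed by the sensitivity and specificity calculation. It is not the stochastic variant described in this paper, whose details the abstract does not give. The binary encoding (one feature per candidate motif pair, set to 1 when the pair occurs in a protein pair), the function names, and the toy data are all assumptions made for illustration.

```python
import math


def stump_output(x, feature, polarity):
    """Stump votes `polarity` if the motif-pair feature is present, else the opposite."""
    return polarity if x[feature] else -polarity


def train_adaboost(X, y, rounds):
    """Standard discrete AdaBoost over binary features (illustrative only).

    X: list of 0/1 feature vectors, one per protein pair (hypothetical encoding).
    y: labels, +1 for interacting, -1 for putatively non-interacting.
    Returns a list of (alpha, feature, polarity) weak learners.
    """
    n = len(X)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        # Choose the stump with the lowest weighted training error.
        best = None
        for feature in range(len(X[0])):
            for polarity in (1, -1):
                err = sum(w for w, x, label in zip(weights, X, y)
                          if stump_output(x, feature, polarity) != label)
                if best is None or err < best[0]:
                    best = (err, feature, polarity)
        err, feature, polarity = best
        if err >= 0.5:
            break  # no stump does better than chance
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, feature, polarity))
        # Increase the weight of misclassified examples, then renormalise.
        weights = [w * math.exp(-alpha * label * stump_output(x, feature, polarity))
                   for w, x, label in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_output(x, feature, polarity)
                for alpha, feature, polarity in model)
    return 1 if score >= 0 else -1


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (made up): three candidate motif pairs, six protein pairs.
X_toy = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 0], [1, 0, 0], [0, 1, 0]]
y_toy = [1, 1, -1, -1, 1, -1]
toy_model = train_adaboost(X_toy, y_toy, rounds=3)
toy_pred = [predict(toy_model, x) for x in X_toy]
print(sensitivity_specificity(y_toy, toy_pred))
```

In this sketch the features chosen with large weights play the role of candidate interacting motif pairs. The figures reported above come from cross-validation, so a faithful evaluation would compute sensitivity and specificity on held-out folds, not on the training data as this toy example does.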
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yu, H., Qian, M., Deng, M. (2006). Using a Stochastic AdaBoost Algorithm to Discover Interactome Motif Pairs from Sequences. In: Huang, DS., Li, K., Irwin, G.W. (eds) Computational Intelligence and Bioinformatics. ICIC 2006. Lecture Notes in Computer Science, vol 4115. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11816102_66
DOI: https://doi.org/10.1007/11816102_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37277-6
Online ISBN: 978-3-540-37282-0
eBook Packages: Computer Science (R0)