Analysis and prediction of RNA-binding residues using sequence, evolutionary conservation, and predicted secondary structure and solvent accessibility
- PMID: 20887256
- DOI: 10.2174/138920310794109193
Abstract
Identification and prediction of RNA-binding residues (RBRs) provide valuable insights into the mechanisms of protein-RNA interactions. We analyzed the contributions of a wide range of factors, including amino acid sequence, evolutionary conservation, secondary structure and solvent accessibility, to the prediction/characterization of RBRs. Five feature sets were designed and feature selection was performed to find and investigate relevant features. We demonstrate that (1) interactions with positively charged amino acids Arg and Lys are preferred by the negatively charged nucleotides; (2) Gly provides flexibility for the RNA-binding sites; (3) Glu, with its negatively charged side chain, and several hydrophobic residues such as Leu, Val, Ala and Phe are disfavored in the RNA-binding sites; (4) coil residues, especially in long segments, are more flexible (than other secondary structures) and more likely to interact with RNA; (5) helical residues are more rigid and consequently they are less likely to bind RNA; and (6) residues partially exposed to the solvent are more likely to form RNA-binding sites. We introduce a novel sequence-based predictor of RBRs, RBRpred, which utilizes the selected features. RBRpred is comprehensively tested on three datasets with varied atom distance cutoffs by performing both five-fold cross validation and jackknife tests, and achieves a Matthews correlation coefficient (MCC) of 0.51, 0.48 and 0.42, respectively. The quality is comparable to or better than that of state-of-the-art predictors that apply the distance-based cutoff definition. We show that the most important factor for RBR prediction is evolutionary conservation, followed by the amino acid sequence, predicted secondary structure and predicted solvent accessibility. We also investigate the impact of using native vs. predicted secondary structure and solvent accessibility. The predicted values are sufficient for RBR prediction, while knowledge of the actual solvent accessibility helps in predictions for lower distance cutoffs.
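The abstract reports predictive quality as MCC. For readers unfamiliar with the measure, the following is a minimal sketch of how MCC is computed from per-residue binary labels (1 = RNA-binding, 0 = non-binding). The function names and the toy labels are illustrative assumptions; this is not the RBRpred evaluation code, and the numbers are not from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = binding, 0 = non-binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here), e.g. when every residue gets the same predicted label.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example (hypothetical labels for a 4-residue chain).
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
score = mcc(*confusion_counts(y_true, y_pred))
```

MCC ranges from -1 (every residue mislabeled) through 0 (no better than chance) to 1 (perfect agreement), and it remains informative when binding residues are a small minority of all residues, which is typical for this kind of dataset.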