Link to original content: http://www.ncbi.nlm.nih.gov/pubmed/29036382
Bioinformatics. 2017 Dec 15;33(24):3909-3916.
doi: 10.1093/bioinformatics/btx496.

MusiteDeep: a deep-learning framework for general and kinase-specific phosphorylation site prediction

Duolin Wang et al. Bioinformatics.

Abstract

Motivation: Computational methods for phosphorylation site prediction play important roles in protein function studies and experimental design. Most existing methods are based on feature extraction, which may result in incomplete or biased features. Deep learning as the cutting-edge machine learning method has the ability to automatically discover complex representations of phosphorylation patterns from the raw sequences, and hence it provides a powerful tool for improvement of phosphorylation site prediction.

Results: We present MusiteDeep, the first deep-learning framework for predicting general and kinase-specific phosphorylation sites. MusiteDeep takes raw sequence data as input and uses convolutional neural networks with a novel two-dimensional attention mechanism. It achieves over a 50% relative improvement in the area under the precision-recall curve in general phosphorylation site prediction and obtains competitive results in kinase-specific prediction compared to other well-known tools on the benchmark data.
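As a reading aid for the "50% relative improvement" figure, the short sketch below shows how a relative improvement between two area-under-curve scores is usually computed. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_score, new_score):
    """Return (new - baseline) / baseline for two metric values.

    Both arguments are assumed to be areas under a precision-recall
    curve; the baseline must be positive for the ratio to make sense.
    """
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return (new_score - baseline_score) / baseline_score


# Hypothetical values for illustration only, not results from the paper.
example = relative_improvement(0.2, 0.3)
print(f"relative improvement: {example:.0%}")
```

A relative improvement is measured against the baseline score, so a 50% relative gain is not the same as a gain of 50 percentage points.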

Availability and implementation: MusiteDeep is provided as an open-source tool available at https://github.com/duolinwang/MusiteDeep.

Contact: xudong@missouri.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Deep-learning architecture of MusiteDeep. The input layer is the one-of-K coding of a 33-residue protein fragment centered at the prediction site. A multi-layer CNN is used as the feature extractor, with no pooling layers. The last hidden state of the multi-layer CNN is copied twice: one copy is fed directly into an attention mechanism (attention-1), and the other is first transposed and then fed into a second attention mechanism (attention-2). The outputs of the two attention mechanisms are combined and fed into the fully connected neural network layers. The final layer is a single neural network layer with softmax output.
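The input encoding described in the Fig. 1 caption can be sketched as follows. This is a minimal illustration, not the tool's own code: the 20-letter alphabet, the `-` padding symbol for positions beyond the sequence ends, the mapping of unknown residues to the padding column, and the function names are all assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20 standard residues
PAD = "-"                             # assumed padding symbol
ALPHABET = AMINO_ACIDS + PAD          # K = 21 columns in this sketch
WINDOW = 33                           # fragment length from the caption
HALF = WINDOW // 2                    # 16 residues on each side


def extract_fragment(sequence, site_index):
    """Return the 33-residue fragment centered at site_index (0-based).

    Positions that fall outside the sequence are filled with PAD.
    """
    return "".join(
        sequence[i] if 0 <= i < len(sequence) else PAD
        for i in range(site_index - HALF, site_index + HALF + 1)
    )


def one_of_k(fragment):
    """Encode a fragment as a list of one-hot rows, one row per residue.

    Residues outside ALPHABET are mapped to the PAD column (an assumption).
    """
    column = {residue: i for i, residue in enumerate(ALPHABET)}
    rows = []
    for residue in fragment:
        row = [0] * len(ALPHABET)
        row[column.get(residue, column[PAD])] = 1
        rows.append(row)
    return rows


# Usage with a made-up sequence; index 12 is a serine in this string.
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
fragment = extract_fragment(sequence, 12)
encoded = one_of_k(fragment)
print(fragment)
print(len(encoded), len(encoded[0]))
```

Each row contains exactly one 1, so the encoded fragment is a 33 x K matrix that a convolutional layer can read directly.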
Fig. 2. Graphical illustration of the attention-based decoder on the feature map dimension. It decodes the feature maps (h1, h2, …, hT) from the last hidden state of the multi-layer CNN into a single target representation (H'). All the parameters within each layer are scaled between 0 and 1. The grey scale reflects the parameter values.
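To make the idea of decoding a set of vectors into a single representation concrete, here is a generic softmax attention pooling sketch applied to a matrix and to its transpose. It is not the paper's formulation: the dot-product scoring, the `query` vectors standing in for learned parameters, the concatenation used to combine the two outputs, and all function names are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Pool a list of equal-length vectors into one vector.

    Each vector gets a dot-product score against `query` (a stand-in for
    learned parameters); the softmax of the scores weights the sum.
    """
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in vectors]
    weights = softmax(scores)
    width = len(vectors[0])
    pooled = [sum(w * vec[j] for w, vec in zip(weights, vectors))
              for j in range(width)]
    return pooled, weights


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def two_way_attention(hidden, query_rows, query_cols):
    """Attend over the rows of `hidden` and over its columns, then
    concatenate the two pooled vectors (concatenation is an assumption)."""
    pooled_rows, _ = attention_pool(hidden, query_rows)
    pooled_cols, _ = attention_pool(transpose(hidden), query_cols)
    return pooled_rows + pooled_cols


# Toy hidden state: 3 sequence positions x 2 feature maps.
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
merged = two_way_attention(hidden, [0.1, -0.2], [0.3, 0.0, -0.3])
print(merged)
```

The attention weights always sum to 1, so each pooled vector is a weighted average of its inputs; with an all-zero query the weights are uniform and the pooling reduces to a plain mean.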
Fig. 3. ROC and precision-recall curves comparing MusiteDeep with Musite and other deep-learning architectures by five-fold cross-validation.
Fig. 4. ROC and precision-recall curves comparing MusiteDeep with other well-known general phosphorylation site prediction tools on the testing set.
Fig. 5. ROC and precision-recall curves comparing MusiteDeep with other well-known kinase-specific phosphorylation site prediction tools by five-fold cross-validation of CDK (left) and PKA (right).
Fig. 6. t-SNE plot of the merged representation and the original one-of-K representation.
