Link to original content: https://github.com/BinXia/PETs
PETs

The Predictor of Protein-Protein interaction sites based on Extremely-randomized Trees (PETs)

Table of Contents

  • Introduction
  • Installation
  • Quick Start
  • Design Description
  • Additional Information

Introduction

PETs is a predictor of protein-protein interaction sites based on extremely randomized trees. PETs introduces a sampling strategy intended to improve the stability of the predictor: features are used to divide the training dataset into sub-datasets, each sub-dataset is clustered with k-means, and samples are selected based on distance. With this kind of sampling, samples that have different types of significant features are selected uniformly. The reported performance shows that PETs achieves better results while remaining stable.
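The sampling idea above can be illustrated with a minimal scikit-learn sketch. It is not the code from the notebooks in this repository. The helper name `cluster_sample`, the feature groups, the cluster count, and the synthetic data are all hypothetical, and "selected based on distance" is interpreted here as keeping the sample nearest to each k-means centroid, which is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import ExtraTreesClassifier


def cluster_sample(X, feature_groups, n_clusters, seed=0):
    """Return sorted row indices of X picked by per-group k-means.

    For each feature group (a list of column indices), the rows are
    clustered on those columns only, and the row closest to each
    centroid is kept. This selection rule is an assumption.
    """
    selected = set()
    for columns in feature_groups:
        sub = X[:, columns]
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(sub)
        # Distance from every row to every centroid: shape (n_rows, n_clusters).
        dist = km.transform(sub)
        # Keep the row nearest to each centroid.
        selected.update(int(i) for i in dist.argmin(axis=0))
    return sorted(selected)


# Synthetic, imbalanced toy data: 200 non-interface rows, 20 interface rows.
rng = np.random.default_rng(0)
X_neg = rng.normal(0.0, 1.0, size=(200, 6))
X_pos = rng.normal(1.5, 1.0, size=(20, 6))

# Hypothetical feature groups: columns 0-2 and columns 3-5.
groups = [[0, 1, 2], [3, 4, 5]]
keep = cluster_sample(X_neg, groups, n_clusters=10)

# Train extremely randomized trees on the sampled negatives plus all positives.
X_train = np.vstack([X_neg[keep], X_pos])
y_train = np.array([0] * len(keep) + [1] * len(X_pos))
model = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
```

Clustering each feature group separately means that a sample that is typical with respect to any one group of features can be picked, which is one way to read the goal of covering different types of significant features.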

Installation

(Note: If the GitHub version is not accessible, a Dropbox copy is available at https://www.dropbox.com/sh/44d321ghh0e9l3y/AABpRHlLQzajaFxlnshn3sKqa?dl=0)

PETs requires:

Python is a programming language suited to rapid development. It is available at https://www.python.org/downloads/.

Scikit-learn is a set of simple and efficient tools for data mining and data analysis. It is available at http://scikit-learn.org/stable/install.html. The website fully describes the installation of Scikit-learn, NumPy, and SciPy.

IPython Notebook is a browser-based notebook with support for code, text, mathematical expressions, inline plots, and other rich media. The complete installation manual is available at http://ipython.org/install.html. IPython Notebook is well suited to storing the results of previous runs and to being embedded in websites, among other features.
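Before opening the notebooks, it may help to confirm that the required packages can be found. The following is a small sketch using only the Python standard library. The helper name `missing_packages` is hypothetical, and the list uses the usual import names (`sklearn` for Scikit-learn, `IPython` for IPython Notebook), which are assumed to match your installation.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the import names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names assumed for the requirements listed above.
REQUIRED = ["numpy", "scipy", "sklearn", "IPython"]

missing = missing_packages(REQUIRED)
print("Python", sys.version.split()[0])
print("Missing packages:", ", ".join(missing) if missing else "none")
```

This only checks that each package can be located; it does not check versions.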

Quick Start

Step 1: Open 'PETs_ez2use.ipynb' in IPython Notebook and find the following block near the end of the code:

#######################################
def main():
    definition = 'ASA_Change'
#######################################

Change the value of 'definition' to the PPI definition you require.

Step 2: Go to the folder 'YourDatasetHere' in the root of our project, which holds the datasets and features used by 'PETs_ez2use'.

Step 3: Run 'PETs_ez2use'. A complete report for each protein is written to the folder 'result'.

Tip: The folders to work with are the ones inside 'YourDatasetHere'. DO NOT MODIFY THE FOLDERS IN THE ROOT OF THE PROJECT!

Design Description

  • PETs_final.ipynb

This notebook runs the full pipeline from start to finish: loading, sampling, training, and testing. It contains our complete sampling strategy.

  • PETs_ourDatasets.ipynb

This notebook loads the trained model directly and tests it on Dtestset72 and PDBtestset164.

  • PETs_ez2use.ipynb

This is the 'easy to use' version of the code, described in Quick Start.

  • dataset/PRSA/PSA/PSS/PSSM/subDataset

The datasets and features used in our experiments.

  • ETsModel

The model files for the different definitions of interface.

  • YourDatasetHere

The datasets and features for PETs_ez2use.ipynb.

Additional Information

Feel free to send us an email if you have any questions or concerns.

E-mail: ben.binxia@gmail.com
