iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences

doi:10.1093/bioinformatics/bty140

Bioinformatics. 2018 Jul 15;34(14):2499-2502.

Zhen Chen¹, Pei Zhao², Fuyi Li³, André Leier⁴,⁵, Tatiana T Marquez-Lago⁴,⁵, Yanan Wang⁶, Geoffrey I Webb⁷, A Ian Smith³, Roger J Daly³, Kuo-Chen Chou⁸,⁹, Jiangning Song³,⁷

Affiliations

¹ School of Basic Medical Science, Qingdao University, 38 Dengzhou Road, Qingdao, China.
² State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang, China.
³ Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC, Australia.
⁴ Department of Genetics, School of Medicine, University of Alabama at Birmingham, AL, USA.
⁵ Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA.
⁶ Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China.
⁷ Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia.
⁸ Gordon Life Science Institute, Boston, MA, USA.
⁹ Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China.

PMID: 29528364
PMCID: PMC6658705
DOI: 10.1093/bioinformatics/bty140

Zhen Chen et al. Bioinformatics. 2018.

Abstract

Summary: Structural and physicochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding schemes that encompass 53 different types of feature descriptors. It also allows users to extract specific amino acid properties from the AAindex database. Furthermore, iFeature integrates 12 different types of commonly used feature clustering, selection and dimensionality reduction algorithms, greatly facilitating training, analysis and benchmarking of machine-learning models. The functionality of iFeature is made freely available via an online web server and a stand-alone toolkit.

Availability and implementation: http://iFeature.erc.monash.edu/; https://github.com/Superzchen/iFeature/.

Supplementary information: Supplementary data are available at Bioinformatics online.
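
Illustrative sketch: the Summary describes turning protein and peptide sequences into numerical feature vectors. The plain-Python example below shows one simple descriptor of that kind, amino acid composition (the fraction of each of the 20 standard residues in a sequence). It is a minimal sketch under stated assumptions and does not use or reproduce the iFeature code or its command-line interface; the function names `parse_fasta`, `amino_acid_composition` and `encode` are hypothetical and chosen only for this illustration.

```python
from collections import Counter

# The 20 standard amino acids, ordered by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def amino_acid_composition(sequence):
    """Return 20 residue frequencies in AMINO_ACIDS order.

    Characters outside the 20 standard residues are ignored. An empty
    sequence (or one with no standard residues) yields all zeros.
    """
    counts = Counter(residue for residue in sequence if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def encode(fasta_text):
    """Map each sequence name to its 20-dimensional composition vector."""
    return {
        name: amino_acid_composition(seq)
        for name, seq in parse_fasta(fasta_text)
    }


# Hypothetical input, invented for this example.
EXAMPLE_FASTA = """>pep1
ACDA
>pep2
KKKK
"""

if __name__ == "__main__":
    for name, vector in encode(EXAMPLE_FASTA).items():
        print(name, [round(value, 2) for value in vector])
```

Each sequence becomes a fixed-length vector regardless of its length, so the rows can be stacked into a feature matrix and passed to the kind of clustering, selection or dimensionality reduction step the Summary mentions.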

Grants and funding

R01 AI111965/AI/NIAID NIH HHS/United States
