Abstract
Predicting and recognizing promoters in the human genome plays an important role in DNA sequence analysis. Nucleotide content is a widely used measure in bioinformatics analysis, and single nucleotide statistics based on it can help extract features with higher separability and support classification decisions. In this paper, we propose MD-MSVMs, a human promoter recognition method based on multiple gene features and multilayer decision. We first perform single nucleotide analysis and divide the gene set into two parts. We then extract multiple gene features from each part, including CpG islands, n-mers and rigidity. Finally, based on these features, multiple support vector machines and a multilayer decision model are combined to construct a human promoter recognition framework, which is flexible and can readily integrate new feature extraction methods or new classification models. Experimental results show that our method achieves better performance and helps in understanding how multiple features can be integrated.
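The sequence-derived features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the GC-content threshold used to split sequences into two parts, and the choice of 2-mers are all assumptions made for illustration. The abstract does not state the splitting criterion, and rigidity is omitted because it needs an external parameter table.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def nucleotide_content(seq):
    """Fraction of each base (A, C, G, T) in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    n = max(len(seq), 1)  # avoid division by zero on empty input
    return {b: counts.get(b, 0) / n for b in BASES}

def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: count(CG) * N / (count(C) * count(G))."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)

def nmer_frequencies(seq, n=2):
    """Relative frequency of every overlapping n-mer, in a fixed order."""
    seq = seq.upper()
    total = len(seq) - n + 1
    if total <= 0:
        return [0.0] * (len(BASES) ** n)
    counts = Counter(seq[i:i + n] for i in range(total))
    return [counts.get("".join(p), 0) / total
            for p in product(BASES, repeat=n)]

def feature_vector(seq, n=2):
    """Concatenate base content, CpG ratio and n-mer frequencies."""
    content = nucleotide_content(seq)
    return [content[b] for b in BASES] + [cpg_obs_exp(seq)] + nmer_frequencies(seq, n)

def split_by_gc(seqs, threshold=0.5):
    """Hypothetical split: route sequences by G+C fraction (assumed criterion)."""
    high, low = [], []
    for s in seqs:
        content = nucleotide_content(s)
        (high if content["G"] + content["C"] >= threshold else low).append(s)
    return high, low
```

In a pipeline of this kind, each part returned by `split_by_gc` would have its feature vectors passed to a separate classifier, such as a support vector machine, before a final decision layer combines the outputs.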
Acknowledgment
This work was supported by grants from the National Science Foundation of China, Nos. 31571364, U1611265, 61532008, 61672203, 61402334, 61472282, 61520106006, 61472280, 61472173, 61572447, 61373098 and 61672382, and by China Postdoctoral Science Foundation Grant No. 2016M601646.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Xu, W., Bao, W., Yuan, L., Jiang, Z. (2017). MD-MSVMs: A Human Promoter Recognition Method Based on Single Nucleotide Statistics and Multilayer Decision. In: Huang, DS., Bevilacqua, V., Premaratne, P., Gupta, P. (eds) Intelligent Computing Theories and Application. ICIC 2017. Lecture Notes in Computer Science, vol 10361. Springer, Cham. https://doi.org/10.1007/978-3-319-63309-1_47
DOI: https://doi.org/10.1007/978-3-319-63309-1_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63308-4
Online ISBN: 978-3-319-63309-1
eBook Packages: Computer Science (R0)