Abstract
Understanding transcriptional regulation through the discovery of transcription factor binding sites (TFBS) is a fundamental problem in molecular biology research. Here we propose a new computational method for motif discovery that combines a genetic algorithm structure with several statistical coefficients. The algorithm was tested with 56 data sets from four different species. The motifs obtained were compared to the known motifs for each data set, and the accuracy of the prediction was compared to that of 14 other methods at both the nucleotide and the site level. Although the results did not stand out in the detection of false positives, in most cases they showed remarkable performance in sensitivity and in overall performance at the site level, generally outperforming the other methods in these statistics. This suggests that the algorithm can be a useful tool for predicting motifs in different kinds of sets of DNA sequences.
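To make the two evaluation levels concrete, the following is a minimal sketch of how nucleotide-level and site-level sensitivity could be computed from known and predicted binding sites. It is not the authors' evaluation code: the function names, the (sequence index, start, end) site representation, and the `min_overlap` threshold of one quarter of the known site length are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical site representation: (sequence index, start, end), end exclusive.
Site = Tuple[int, int, int]


def covered_positions(sites: Iterable[Site]) -> Set[Tuple[int, int]]:
    """Expand sites into the set of (sequence, position) pairs they cover."""
    return {(seq, pos) for seq, start, end in sites for pos in range(start, end)}


def nucleotide_stats(known: List[Site], predicted: List[Site]) -> Dict[str, float]:
    """Nucleotide-level sensitivity and positive predictive value."""
    known_pos = covered_positions(known)
    pred_pos = covered_positions(predicted)
    tp = len(known_pos & pred_pos)   # known positions that were predicted
    fn = len(known_pos - pred_pos)   # known positions that were missed
    fp = len(pred_pos - known_pos)   # predicted positions that are not known
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "ppv": tp / (tp + fp) if tp + fp else 0.0,
    }


def overlap_length(a: Site, b: Site) -> int:
    """Number of shared positions between two sites (0 if on different sequences)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def site_sensitivity(known: List[Site], predicted: List[Site],
                     min_overlap: float = 0.25) -> float:
    """Fraction of known sites hit by at least one predicted site.

    A known site counts as hit when some prediction overlaps it by at least
    `min_overlap` of its length (the threshold is an assumption of this sketch).
    """
    if not known:
        return 0.0
    hits = sum(
        1 for k in known
        if any(overlap_length(k, p) >= min_overlap * (k[2] - k[1]) for p in predicted)
    )
    return hits / len(known)


if __name__ == "__main__":
    known_sites = [(0, 10, 20), (1, 0, 8)]
    predicted_sites = [(0, 15, 25)]
    print(nucleotide_stats(known_sites, predicted_sites))
    print(site_sensitivity(known_sites, predicted_sites))
```

In this toy input one of the two known sites is partially recovered, so the site-level sensitivity is higher than the nucleotide-level sensitivity; a method can therefore rank differently depending on the level at which it is scored.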
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Basha Gutierrez, J., Frith, M., Nakai, K. (2015). A Genetic Algorithm for Motif Finding Based on Statistical Significance. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9043. Springer, Cham. https://doi.org/10.1007/978-3-319-16483-0_43
DOI: https://doi.org/10.1007/978-3-319-16483-0_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16482-3
Online ISBN: 978-3-319-16483-0
eBook Packages: Computer Science (R0)