Abstract
A structured motif is a sequence of simple motifs with distance constraints between them. We present SimpLiSMS, a simple, lightweight and fast algorithm for searching structured motifs. SimpLiSMS does not use any sophisticated data structures, which keeps it simple and lightweight. In our experiments, SimpLiSMS shows excellent performance. Furthermore, we introduce a parallel version of SimpLiSMS which runs even faster.
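To make the definition concrete, the following is a minimal sketch of a naive structured motif search. It is not the SimpLiSMS algorithm, which the abstract does not describe; it is only a brute-force baseline. The function name `find_structured_motif`, its parameters, and the convention that a gap is measured from the end of one simple motif to the start of the next are all assumptions made for illustration.

```python
from typing import List, Tuple


def find_structured_motif(
    text: str,
    motifs: List[str],
    gaps: List[Tuple[int, int]],
) -> List[Tuple[int, ...]]:
    """Naive illustration (hypothetical helper, not SimpLiSMS).

    Returns every tuple of start positions at which the simple motifs
    occur in order, where the gap between the end of motif i and the
    start of motif i+1 lies in the inclusive range gaps[i] = (min, max).
    """
    if len(gaps) != len(motifs) - 1:
        raise ValueError("need exactly one gap range between consecutive motifs")

    results: List[Tuple[int, ...]] = []

    def extend(index: int, prev_end: int, starts: List[int]) -> None:
        # All simple motifs placed: record one full occurrence.
        if index == len(motifs):
            results.append(tuple(starts))
            return
        motif = motifs[index]
        last_start = len(text) - len(motif)
        if index == 0:
            # The first motif may start anywhere in the text.
            candidates = range(0, last_start + 1)
        else:
            # Later motifs must start within the allowed gap after prev_end.
            lo, hi = gaps[index - 1]
            candidates = range(prev_end + lo, min(prev_end + hi, last_start) + 1)
        for start in candidates:
            if text.startswith(motif, start):
                extend(index + 1, start + len(motif), starts + [start])

    extend(0, 0, [])
    return results


# "ACG" followed by "GCA" with 2 to 5 characters in between.
print(find_structured_motif("AAACGTTTTGCA", ["ACG", "GCA"], [(2, 5)]))  # [(2, 9)]
```

This sketch checks every candidate position directly, so it uses no auxiliary data structure but makes no claim about efficiency.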
Part of this research has been supported by an INSPIRE Strategic Partnership Award, administered by the British Council, Bangladesh for the project titled “Advances in Algorithms for Next Generation Biological Sequences”.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Alatabbi, A., Azmin, S., Habib, M.K., Iliopoulos, C.S., Rahman, M.S. (2015). SimpLiSMS: A Simple, Lightweight and Fast Approach for Structured Motifs Searching. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9044. Springer, Cham. https://doi.org/10.1007/978-3-319-16480-9_22
DOI: https://doi.org/10.1007/978-3-319-16480-9_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16479-3
Online ISBN: 978-3-319-16480-9
eBook Packages: Computer Science, Computer Science (R0)