Abstract
In this paper, we study deterministic regular expressions with interleaving (IDREs). Regular expressions extended with interleaving are good at describing sequential and parallel data patterns. The interleaving makes these expressions exponentially more succinct than standard regular expressions. And when they obey the determinism constraint, they are more efficient for matching and parsing processes than general ones. However, the interleaving also makes the structure of expressions more complex, and the semantic definition of determinism is hard to follow, which prevent users from writing and developing IDREs. We address this problem by proposing a determinism checking method and a syntactic description for IDREs. Our checking method can locate the source of nondeterminism more efficiently and accurately than the existing method when the nondeterminism is caused by unary operators. We prove that the grammars of IDREs are context-free and suggest effective optimization rules to simplify the grammars.
Work supported by the National Natural Science Foundation of China under Grant Nos. 61872339 and 61472405.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Ahonen, H.: Disambiguation of SGML content models. In: International Workshop on Principles of Document Processing, pp. 27–37 (1996)
Bárcenas, E., Genevès, P., Layaïda, N., Schmitt, A.: Query reasoning on trees with types, interleaving, and counting. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence, pp. 718–723 (2011)
Bex, G.J., Gelade, W., Martens, W., Neven, F.: Simplifying XML schema: effortless handling of nondeterministic regular expressions. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 731–744 (2009)
Bex, G.J., Neven, F., Schwentick, T., Vansummeren, S.: Inference of concise regular expressions and DTDs. ACM Trans. Database Syst. 35(2), 1–47 (2010)
Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensible markup language (XML) 1.0, 5th edn. W3C Recommendation (2008). http://www.w3.org/TR/REC-xml/
Broda, S., Machiavelo, A., Moreira, N., Reis, R.: Automata for regular expressions with shuffle. Inf. Comput. 259(2), 162–173 (2017)
Brüggemann-Klein, A.: Regular expressions into finite automata. Theoret. Comput. Sci. 120(2), 197–213 (1993)
Brüggemann-Klein, A.: Unambiguity of extended regular expressions in SGML document grammars. In: European Symposium on Algorithms, pp. 73–84 (1993)
Brüggemann-Klein, A., Wood, D.: One-unambiguous regular languages. Inf. Comput. 140(2), 229–253 (1998)
Buil-Aranda, C., Arenas, M., Corcho, O., Polleres, A.: Federating queries in SPARQL 11: Syntax, semantics and evaluation. Web Semant. Sci. Serv. Agents World Wide Web 18(1), 1–17 (2013)
Che, D., Aberer, K., Özsu, T.: Query optimization in XML structured-document databases. J. Int. Conf. Very Large Databases 15(3), 263–289 (2006)
Chen, H., Xu, Z., Lu, P.: Towards an effective syntax for deterministic regular expressions. Technical report, State Key Laboratory of Computer Science (2018). http://lcs.ios.ac.cn/~chm/papers.htm
Chen, H., Lu, P.: Checking determinism of regular expressions with counting. Inf. Comput. 241, 302–320 (2015)
Ciucanu, R., Staworko, S.: Learning schemas for unordered XML. In: Proceedings of the 14th International Symposium on Database Programming Languages (2013)
Clark, J., Murata, M.: Relax NG specification. Organization for the Advancement of Structured Information Standards (OASIS) (2001)
Gao, S., Sperberg-McQueen, C.M., Thompson, H.S.: XML Schema Definition Language (XSD) 1.1 Part 1: Structures. W3C Recommendation (2012). http://www.w3.org/TR/2012/REC-xmlschema11-1-20120405/
Garg, V.K., Ragunath, M.T.: Concurrent regular expressions and their relationship to petri nets. Theoret. Comput. Sci. 96(2), 285–304 (1992)
Gelade, W.: Succinctness of regular expressions with interleaving, intersection and counting. Theoret. Comput. Sci. 411(31–33), 2987–2998 (2010)
Gischer, J.: Shuffle languages, petri nets, and context-sensitive grammars. Commun. ACM 24(9), 597–605 (1981)
Groz, B., Maneth, S.: Efficient testing and matching of deterministic regular expressions. J. Comput. Syst. Sci. 89, 372–399 (2017)
Hartmann, J., Vieira, M., Foster, H., Ruder, A.: A uml-based approach to system testing. ISSE 1(1), 12–24 (2005)
Hopcroft, J.E., Ullman, J.D.: Introduction To Automata Theory, Languages And Computation. Addison-Wesley, Boston (2001)
Huang, X., Bao, Z., Davidson, S.B., Milo, T., Yuan, X.: Answering regular path queries on workflow provenance. In: International Conference on Data Engineering, pp. 375–386 (2015)
ISO, 8879: Information processing text and office systems Standard Generalized Markup Language (SGML) (1986). https://www.iso.org/standard/16387.html
Kilpeläinen, P.: Checking determinism of XML schema content models in optimal time. Inf. Syst. 36(3), 596–617 (2011)
Kilpeläinen, P., Tuhkanen, R.: One-unambiguity of regular expressions with numeric occurrence indicators. Inf. Comput. 205(6), 890–916 (2007)
Koch, C., Scherzinger, S., Schweikardt, N., Stegmaier, B.: Schema-based scheduling of event processors and buffer minimization for queries on structured data streams. In: Proceedings of the International Conference on Very Large Databases, pp. 228–239 (2004)
Li, Z., Ge, T.: PIE: approximate interleaving event matching over sequences. In: 31st IEEE International Conference on Data Engineering, ICDE 2015, Seoul, South Korea, 13–17 April 2015, pp. 747–758 (2015)
Losemann, K., Martens, W.: The complexity of evaluating path expressions in SPARQL. In: Proceedings of the 31st ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 101–112 (2012)
Losemann, K., Martens, W., Niewerth, M.: Closure properties and descriptional complexity of deterministic regular expressions. Theoret. Comput. Sci. 627, 54–70 (2016)
Martens, W., Neven, F., Schwentick, T.: Complexity of decision problems for XML schemas and chain regular expressions. SIAM J. Comput. 39(4), 1486–1530 (2009)
Peng, F., Chen, H.: Discovering restricted regular expressions with interleaving. In: Asia-Pacific Web Conference, pp. 104–115 (2015)
Peng, F., Chen, H., Mou, X.: Deterministic regular expressions with interleaving. In: Leucker, M., Rueda, C., Valencia, F.D. (eds.) ICTAC 2015. LNCS, vol. 9399, pp. 203–220. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25150-9_13
Pierce, B.C.: Types and Programming Languages. MIT Press, Cambridge (2002)
Staworko, S., Boneva, I., Gayo, J.E.L., Hym, S., Prud’hommeaux, E.G., Solbrig, H.R.: Complexity and expressiveness of ShEx for RDF. In: Proceedings of International Conference on Database Theory, pp. 195–211 (2015)
Ter Beek, M.H., Kleijn, J.: Infinite unfair shuffles and associativity. Theoret. Comput. Sci. 380(3), 401–410 (2007)
Wall, L., Christiansen, T., Schwartz, R.L.: Programming Perl. O’Reilly & Associates Inc, Sebastopol (1999)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Mou, X., Chen, H., Li, Y. (2019). Context-Free Grammars for Deterministic Regular Expressions with Interleaving. In: Hierons, R., Mosbah, M. (eds) Theoretical Aspects of Computing – ICTAC 2019. ICTAC 2019. Lecture Notes in Computer Science(), vol 11884. Springer, Cham. https://doi.org/10.1007/978-3-030-32505-3_14
Download citation
DOI: https://doi.org/10.1007/978-3-030-32505-3_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32504-6
Online ISBN: 978-3-030-32505-3
eBook Packages: Computer ScienceComputer Science (R0)