Abstract
Many documents such as Web documents or XML files have tree structures. A term tree is an unordered tree pattern consisting of internal variables and tree structures. In order to extract meaningful and hidden knowledge from such tree structured documents, we consider a minimal language (MINL) problem for term trees. The MINL problem for term trees is to find a term tree t such that the language generated by t is minimal among languages, generated by term trees, which contain all given tree structured data. Firstly, we show that the MINL problem for regular term trees is computable in polynomial time if the number of edge labels is infinite. Next, we show that the MINL problems with optimizing the size of an output term tree are NP-complete. Finally, in order to show that our polynomial time algorithm for the MINL problem can be applied to data mining from real-world Web documents, we show that regular term tree languages are polynomial time inductively inferable from positive data if the number of edge labels is infinite.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
T. R. Amoth, P. Cull, and P. Tadepalli. Exact learning of unordered tree patterns from queries. Proc. COLT-99, ACM Press, pages 323–332, 1999.
D. Angluin. Finding patterns common to a set of strings. Journal of Computer and System Science, 21:46–62, 1980.
H. Arimura, T. Shinohara, and S. Otsuki. Polynomial time algorithm for finding finite unions of tree pattern languages. Proc. NIL-91, Springer-Verlag, LNAI 659, pages 118–131, 1993.
S. Goldman and S. Kwek. On learning unions of pattern languages and tree patterns. Proc. ALT-99, Springer-Verlag, LNAI 1720, 1720:347–363, 1999.
S. Matsumoto, Y. Hayashi, and T. Shoudai. Polynomial time inductive inference of regular term tree languages from positive data. Proc. ALT-97, Springer-Verlag, LNAI 1316, pages 212–227, 1997.
T. Miyahara, T. Shoudai, T. Uchida, T. Kuboyama, K. Takahashi, and H. Ueda. Discovering new knowledge from graph data using inductive logic programming. Proc. ILP-99, Springer-Verlag, LNAI 1634, pages 222–233, 1999.
T. Miyahara, T. Shoudai, T. Uchida, K. Takahashi, and H. Ueda. Polynomial time matching algorithms for tree-like structured patterns in knowledge discovery. Proc. PAKDD-2000, Springer-Verlag, LNAI 1805, pages 5–16, 2000.
T. Miyahara, T. Shoudai, T. Uchida, K. Takahashi, and H. Ueda. Discovery of frequent tree structured patterns in semistructured web documents. Proc. PAKDD-2001, Springer-Verlag, LNAI 2035, pages 47–52, 2001.
S. Nestorov, S. Abiteboul, and R. Motwani. Extracting schema from semistructured data. Proceedings of ACM SIGMOD International Conference on Management of Data, pages 295–306, 1998.
T. Shinohara. Polynomial time inference of extended regular pattern languages. In Springer-Verlag, LNCS 147, pages 115–127, 1982.
T. Shoudai, T. Miyahara, T. Uchida, and S. Matsumoto. Inductive inference of regular term tree languages and its application to knowledge discovery. Information Modelling and Knowledge Bases XI, IOS Press, pages 85–102, 2000.
K. Wang and H. Liu. Discovering structural association of semistructured data. IEEE Trans. Knowledge and Data Engineering, 12:353–371, 2000.
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shoudai, T., Uchida, T., Miyahara, T. (2001). Polynomial Time Algorithms for Finding Unordered Tree Patterns with Internal Variables. In: Freivalds, R. (eds) Fundamentals of Computation Theory. FCT 2001. Lecture Notes in Computer Science, vol 2138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44669-9_32
Download citation
DOI: https://doi.org/10.1007/3-540-44669-9_32
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42487-1
Online ISBN: 978-3-540-44669-9
eBook Packages: Springer Book Archive