Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses

Ng, Yen Kaow; Shinohara, Takeshi

doi:10.1007/11872436_25

Yen Kaow Ng²³ &
Takeshi Shinohara²⁴

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 4201))

Included in the following conference series:

International Colloquium on Grammatical Inference

538 Accesses

Abstract

A tree pattern p is a first-order term in formal logic, and the language of p is the set of all the tree patterns obtainable by replacing each variable in p with a tree pattern containing no variables. We consider the inductive inference of the unions of these languages from positive examples using strategies that guarantee some forms of minimality during the learning process. By a result in our earlier work, the existence of a characteristic set for each language in a class ${\mathcal L }$ (within ${\mathcal L }$) implies that ${\mathcal L }$ can be identified in the limit by a learner that simply conjectures a hypothesis containing the examples, that is minimal in the number of elements of up to an appropriate size. Furthermore, if there is a size ℓ such that each candidate hypothesis has a characteristic set (within the languages in ${\mathcal L }$ that intersects non-emptily with the examples) that consists only of elements of up to size ℓ, then the hypotheses containing the least number of elements of up to size ℓ are at the same time minimal with respect to inclusion. In this paper we show how to determine such a size ℓ for the unions of the tree pattern languages, and hence allowing us to learn the class using hypotheses that fulfill the two mentioned notions of minimality.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

The Teaching Complexity of Erasing Pattern Languages with Bounded Variable Frequency

Learning Unions of k-Testable Languages

Tree-Like Grammars and Separation Logic

References

Angluin, D.: Inductive inference of formal languages from positive data. Information and Control 45, 117–135 (1980)
Article MATH MathSciNet Google Scholar
Angluin, D.: Inference of reversible languages. J. of the ACM 29, 741–765 (1982)
Article MATH MathSciNet Google Scholar
Arimura, H., Ishizaka, H., Shinohara, T.: Learning unions of tree patterns using queries. Theoretical Computer Science 185(1), 47–62 (1997)
Article MATH MathSciNet Google Scholar
Arimura, H., Shinohara, T., Otsuki, S.: A polynomial time algorithm for finding finite unions of tree pattern languages. In: Brewka, G., Jantke, K.P., Schmitt, P.H. (eds.) NIL 1991. LNCS (LNAI), vol. 659, pp. 118–131. Springer, Heidelberg (1993)
Chapter Google Scholar
Arimura, H., Shinohara, T., Otsuki, S.: Polynomial time inference of unions of two tree pattern languages. IEICE Transactions on Information and Systems E75-D(7), 426–434 (1992)
Google Scholar
Arimura, H., Shinohara, T., Otsuki, S.: Finding minimal generalizations for unions of pattern languages and its application to inductive inference from positive data. In: Enjalbert, P., Mayr, E.W., Wagner, K.W. (eds.) STACS 1994. LNCS, vol. 775, pp. 649–660. Springer, Heidelberg (1994)
Google Scholar
Arimura, H., Shinohara, T., Otsuki, S., Ishizaka, H.: A generalization of the least general generalization. Machine Intelligence 13 13, 59–85 (1994)
Google Scholar
Chan, C., Garofalakis, M., Rastogi, R.: RE-tree: an efficient index structure for regular expressions. The VLDB Journal 12(2), 102–119 (2003)
Article Google Scholar
Gold, E.M.: Language identification in the limit. Information and Control 10, 447–474 (1967)
Article MATH Google Scholar
W3C XML Core Working Group. Extensible Markup Language (XML) 1.0, 3rd edn. W3C Recommendation (2004)
Google Scholar
Kobayashi, S., Yokomori, T.: Identifiability of subspaces and homomorphic images of zero-reversible languages. In: Proc. of Algorithmic Learning Theory, 8th Int. Conf., ALT 1997. LNCS (LNAI), vol. 1316, pp. 48–61. Springer, Heidelberg (1997)
Google Scholar
Ng, Y.K., Ono, H., Shinohara, T.: Measuring over-generalization in the minimal multiple generalizations of biosequences. In: Hoffmann, A., Motoda, H., Scheffer, T. (eds.) DS 2005. LNCS (LNAI), vol. 3735, pp. 176–188. Springer, Heidelberg (2005)
Chapter Google Scholar
Ng, Y.K., Shinohara, T.: Inferring unions of the pattern languages by the most fitting covers. In: Jain, S., Simon, H.U., Tomita, E. (eds.) ALT 2005. LNCS (LNAI), vol. 3734, pp. 269–282. Springer, Heidelberg (2005)
Chapter Google Scholar
Ng, Y.K., Shinohara, T.: Finding consensus patterns in very scarce biosequence samples from their minimal multiple generalizations. In: Ng, W.-K., Kitsuregawa, M., Li, J., Chang, K. (eds.) PAKDD 2006. LNCS (LNAI), vol. 3918, pp. 540–545. Springer, Heidelberg (2006)
Chapter Google Scholar
Plotkin, G.D.: A note on inductive generalization. Machine Intelligence 5, 153–163 (1970)
MathSciNet Google Scholar
Sato, M.: Inductive inference of formal languages. Bulletin of Informatics and Cybernetics 27(1), 85–106 (1995)
MATH MathSciNet Google Scholar
Shinohara, T.: Polynomial time inference of extended regular pattern languages. In: Goto, E., Nakajima, R., Yonezawa, A., Nakata, I., Furukawa, K. (eds.) RIMS 1982. LNCS, vol. 147, pp. 115–127. Springer, Heidelberg (1983)
Google Scholar
Wright, K.: Identification of unions of languages drawn from an identifiable class. In: Proc. of the Second Annual Workshop on Computational Learning Theory, pp. 328–333. Morgan Kaufmann, San Francisco (1989)
Google Scholar

Download references

Author information

Authors and Affiliations

Kyushu Institute of Technology, Graduate School of Computer Science and Systems, Iizuka, 820, Japan
Yen Kaow Ng
Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka, 820, Japan
Takeshi Shinohara

Authors

Yen Kaow Ng
View author publications
You can also search for this author in PubMed Google Scholar
Takeshi Shinohara
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, 223-8522, Yokohama, Japan
Yasubumi Sakakibara
Dept. of Computer Science, Kyoto Sangyo University, Kamigamo Motoyama, Kita-ku, Kyoto, Japan
Satoshi Kobayashi
Japan Biological Informatics Consortium, 10F TIME24 Building, 2-45 Aomi, Koto-ku, 135-8073, Tokyo, Japan
Kengo Sato
Department of Information and Communication Engineering, Graduate School of Electro-Communications, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, 182-8585, Tokyo, Japan
Tetsuro Nishino
Department of Information and Communication Engineering, Faculty of Electro-Communications, The University of Electro-Communications, Chofugaoka 1–5–1, Chofu, 182-8585, Tokyo, Japan
Etsuji Tomita

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Ng, Y.K., Shinohara, T. (2006). Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2006. Lecture Notes in Computer Science(), vol 4201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11872436_25

Download citation

DOI: https://doi.org/10.1007/11872436_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45264-5
Online ISBN: 978-3-540-45265-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

The Teaching Complexity of Erasing Pattern Languages with Bounded Variable Frequency

Learning Unions of k-Testable Languages

Tree-Like Grammars and Separation Logic

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

The Teaching Complexity of Erasing Pattern Languages with Bounded Variable Frequency

Learning Unions of k-Testable Languages

Tree-Like Grammars and Separation Logic

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation