Abstract
This paper describes a complex system for processing raw Czech text. Several modules were implemented, each performing a different level of processing. The modules can easily be incorporated into other linguistic applications, and some of them are already used in this way. The first level of processing is reliable morphological analysis: we survey the effective implementation of ajka, a robust morphological analyser for Czech. Texts tagged by ajka can be further processed by the partial parser Dis and by its extension VaDis, which is based on verb valencies. The output of these systems serves for automatic partial disambiguation of the input texts. The tools described in this paper are widely used for parsing large corpora and can be employed in the initial phase of semantic analysis.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mráková, E., Sedláček, R. (2003). From Czech Morphology through Partial Parsing to Disambiguation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2003. Lecture Notes in Computer Science, vol 2588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36456-0_13
DOI: https://doi.org/10.1007/3-540-36456-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00532-2
Online ISBN: 978-3-540-36456-6
eBook Packages: Springer Book Archive