Abstract
This chapter presents ongoing efforts at the Joint Research Centre of the European Commission to automate event extraction from news articles collected from the Internet by the Europe Media Monitor system. Event extraction builds on techniques developed over several years in the field of information extraction, whose basic goal is to derive quantitative data from unstructured text. The motivation for automated event tracking is to provide objective incident data with broad coverage of terrorist incidents and violent conflicts around the world. This quantitative data then forms the basis for populating incident databases and systems for trend analysis and risk assessment.
We discuss the technical requirements for information extraction and the approach we have adopted. In particular, we deploy lightweight methods for entity extraction and a machine-learning technique for pattern-based event extraction. A preliminary evaluation of the results shows that the accuracy is already acceptable. Future directions for improving the approach are also discussed.
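To make the idea of pattern-based event extraction concrete, the following is a minimal sketch in Python. It is not the authors' system: the patterns are hand-written stand-ins for the patterns that a machine-learning step would acquire, and all names (`PATTERNS`, `extract_event`, the slot names `dead` and `injured`) are hypothetical and chosen only for illustration.

```python
import re

# Assumption: a tiny number-word lexicon; a real system would use a fuller one.
WORD_TO_INT = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
               "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

# A count is either digits or one of the number words above.
NUMBER = r"\b(?P<count>\d+|" + "|".join(WORD_TO_INT) + r")"

# Hypothetical surface patterns, one per event slot. In a learned setting
# these would be induced from examples rather than written by hand.
PATTERNS = {
    "dead": re.compile(
        NUMBER + r"\s+(?P<who>\w+)\s+(?:was|were)\s+killed", re.I),
    "injured": re.compile(
        NUMBER + r"\s+(?P<who>\w+)\s+(?:was|were)\s+(?:injured|wounded)", re.I),
}


def to_int(token):
    """Convert a digit string or a known number word to an integer."""
    token = token.lower()
    return int(token) if token.isdigit() else WORD_TO_INT[token]


def extract_event(sentence):
    """Return a dict of filled slots for the first match of each pattern."""
    event = {}
    for slot, pattern in PATTERNS.items():
        match = pattern.search(sentence)
        if match:
            event[slot] = {
                "count": to_int(match.group("count")),
                "who": match.group("who").lower(),
            }
    return event


if __name__ == "__main__":
    text = ("Five soldiers were killed and 12 others were wounded "
            "in a bomb attack in Baghdad.")
    print(extract_event(text))
```

Each pattern turns one phrasing in a news sentence into a structured slot with a count, which is the kind of quantitative record an incident database could store. Surface patterns like these are brittle, which is one reason to acquire them automatically rather than maintain them by hand.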
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Best, C., Piskorski, J., Pouliquen, B., Steinberger, R., Tanev, H. (2008). Automating Event Extraction for the Security Domain. In: Chen, H., Yang, C.C. (eds) Intelligence and Security Informatics. Studies in Computational Intelligence, vol 135. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69209-6_2
DOI: https://doi.org/10.1007/978-3-540-69209-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69207-2
Online ISBN: 978-3-540-69209-6
eBook Packages: Engineering, Engineering (R0)