THOTH: Neural Translation and Enrichment of Knowledge Graphs

Moussallem, Diego; Soru, Tommaso; Ngonga Ngomo, Axel-Cyrille

doi:10.1007/978-3-030-30793-6_29

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11778)

Included in the following conference series:

International Semantic Web Conference

Abstract

Knowledge graphs are used in an increasing number of applications. Although considerable human effort has been invested in making knowledge graphs available in multiple languages, most knowledge graphs are in English. Additionally, regional facts are often only available in the language of the corresponding region. This lack of multilingual knowledge limits the porting of machine learning models to different languages. In this paper, we aim to alleviate this drawback by proposing THOTH, an approach for translating and enriching knowledge graphs. THOTH extracts bilingual alignments between a source and a target knowledge graph and learns how to translate from one to the other by relying on two different recurrent neural network models along with knowledge graph embeddings. We evaluated THOTH extrinsically by comparing the German DBpedia with the German translation of the English DBpedia on two tasks: fact checking and entity linking. In addition, we ran a manual intrinsic evaluation of the translation. Our results show that THOTH is a promising approach that achieves a translation accuracy of 88.56%. Moreover, its enrichment improves the quality of the German DBpedia significantly, as we report +18.4% accuracy for fact validation and +19% F$_1$ for entity linking.
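The abstract does not spell out how the bilingual alignments are extracted, so the following is only a minimal sketch of one plausible reading: it assumes that cross-lingual links between the two graphs (for example, sameAs-style mappings) are already available as a dictionary, and it pairs each source triple with its counterpart in the target graph. The function name `align_triples`, the `links` dictionary, and the `en:`/`de:` identifiers are hypothetical placeholders, not names taken from THOTH or from DBpedia.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def align_triples(
    source: Iterable[Triple],
    target: Iterable[Triple],
    links: Dict[str, str],
) -> List[Tuple[Triple, Triple]]:
    """Pair source triples with their counterparts in the target graph.

    `links` (an assumption of this sketch) maps source-graph identifiers,
    for entities and predicates alike, to target-graph identifiers.
    A source triple is kept only if all three of its terms are linked
    and the mapped triple is actually present in the target graph.
    """
    target_set = set(target)
    pairs: List[Tuple[Triple, Triple]] = []
    for s, p, o in source:
        if s in links and p in links and o in links:
            mapped = (links[s], links[p], links[o])
            if mapped in target_set:
                pairs.append(((s, p, o), mapped))
    return pairs


# Toy data with made-up identifiers (not real DBpedia URIs).
source = [
    ("en:Edmund_Hillary", "en:nationality", "en:New_Zealand"),
    ("en:Edmund_Hillary", "en:knownFor", "en:Mount_Everest"),
]
target = [
    ("de:Edmund_Hillary", "de:nationalitaet", "de:Neuseeland"),
]
links = {
    "en:Edmund_Hillary": "de:Edmund_Hillary",
    "en:nationality": "de:nationalitaet",
    "en:New_Zealand": "de:Neuseeland",
    "en:knownFor": "de:bekanntFuer",
}

pairs = align_triples(source, target, links)
for src, tgt in pairs:
    print(src, "->", tgt)
```

In this toy run only the nationality triple is aligned; the second source triple is skipped because its object has no link. Pairs of this kind could serve as parallel training examples for a sequence-to-sequence translation model, while unaligned source triples are the candidates for translation and enrichment.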

Notes

1. https://tinyurl.com/statswebdata.
2. A string or a value with a unit.
3. http://dbpedia.org/page/Edmund_Hillary.
4. http://dbpedia.org/resource/Kiwi_(people).
5. http://dbpedia.org/resource/Kiwifruit.
6. http://dbpedia.org/resource/Kiwi.
7. http://dbpedia.org/resource/KiwiIRC.
8. https://www.w3.org/TR/cooluris.
9. https://github.com/dice-group/THOTH.
10. We could not use RDF2Vec in our work as its code was incomplete.
11. https://www.w3.org/TR/webont-req/#section-requirements.
12. The black squares represent how the model splits the frequent tokens in a sequence for a better translation process.
13. More than one surface form can be assigned to an entity.
14. http://www.statmt.org/wmt18/translation-task.html.
15. We selected the subsets of mapping-based objects and labels to evaluate the quality of our approach since they are the most used ones for training Linked-Data NLP approaches.
16. https://tools.ietf.org/html/rfc3986#section-3.1.
17. We reduced our test set to the first subset of provided abstracts due to evaluation platform limits.
18. http://mappings.dbpedia.org/server/statistics/de/.

Acknowledgments

This work has been supported by the German Federal Ministry of Transport and Digital Infrastructure (BMVI) in the projects LIMBO (no. 19F2029I) and OPAL (no. 19F2028A) as well as by the Brazilian National Council for Scientific and Technological Development (CNPq) (no. 206971/2014-1).

Author information

Authors and Affiliations

Data Science Group, University of Paderborn, Paderborn, Germany
Diego Moussallem & Axel-Cyrille Ngonga Ngomo
AKSW Research Group, University of Leipzig, Leipzig, Germany
Tommaso Soru

Corresponding author

Correspondence to Diego Moussallem.

Editor information

Editors and Affiliations

Fondazione Bruno Kessler, Trento, Italy
Chiara Ghidini
Linköping University, Linköping, Sweden
Olaf Hartig
University of Bonn, Bonn, Germany
Maria Maleshkova
University of Economics Prague, Prague, Czech Republic
Vojtěch Svátek
University of Illinois at Chicago, Chicago, IL, USA
Isabel Cruz
University of Chile, Santiago, Chile
Aidan Hogan
Memect Technology, Beijing, China
Jie Song
Mines Saint-Etienne, Saint-Etienne, France
Maxime Lefrançois
Inria Sophia Antipolis - Méditerranée, Sophia Antipolis, France
Fabien Gandon

Cite this paper

Moussallem, D., Soru, T., Ngonga Ngomo, AC. (2019). THOTH: Neural Translation and Enrichment of Knowledge Graphs. In: Ghidini, C., et al. The Semantic Web – ISWC 2019. ISWC 2019. Lecture Notes in Computer Science, vol 11778. Springer, Cham. https://doi.org/10.1007/978-3-030-30793-6_29

DOI: https://doi.org/10.1007/978-3-030-30793-6_29
Published: 17 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30792-9
Online ISBN: 978-3-030-30793-6
eBook Packages: Computer Science, Computer Science (R0)
