Abstract
Knowledge graphs comprise structural and textual information to represent knowledge. To predict new structural knowledge, current approaches learn representations using both types of information through knowledge graph embeddings and language models, but they commit to a single pre-trained language model. We hypothesize that heterogeneous language models may provide complementary information not exploited by current approaches. To investigate this hypothesis, we propose a unified framework that integrates multiple representations of structural knowledge and textual information. Our approach leverages hypercomplex algebra to model the interactions between (i) graph structural information and (ii) multiple text representations. Specifically, we utilize Dihedron models with 4*D-dimensional hypercomplex numbers to integrate four different representations: structural knowledge graph embeddings, word-level representations (e.g., Word2vec and FastText), sentence-level representations (using a sentence transformer), and document-level representations (using FastText or Doc2vec). Our unified framework scores the plausibility of labeled edges via Dihedron products, thus modeling pairwise interactions between the four representations. Extensive experimental evaluations on standard benchmark datasets confirm our hypothesis, showing the superiority of our two new frameworks on link prediction tasks.
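To make the scoring idea concrete, the following is a minimal sketch of a Dihedron (split-quaternion) product used to score a triple. It assumes the standard split-quaternion multiplication rules (i² = −1, j² = k² = +1) and a simple "transform head by relation, then inner product with tail" scoring shape; the exact score function, normalization, and component assignment used in the paper may differ, and the function names here are illustrative only.

```python
import numpy as np

def dihedron_mul(p, q):
    # Product of two Dihedron numbers a + b*i + c*j + d*k,
    # with i^2 = -1, j^2 = +1, k^2 = +1 (split-quaternion algebra).
    # p and q are tuples (a, b, c, d) of scalars or equal-shape arrays.
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 + c1*c2 + d1*d2,   # real part
        a1*b2 + b1*a2 - c1*d2 + d1*c2,   # i part
        a1*c2 - b1*d2 + c1*a2 + d1*b2,   # j part
        a1*d2 + b1*c2 - c1*b2 + d1*a2,   # k part
    )

def triple_score(head, rel, tail):
    # Illustrative plausibility score for (head, rel, tail):
    # transform the head by the relation via the Dihedron product,
    # then take the inner product with the tail. Each argument packs
    # four D-dimensional component vectors, one per representation
    # (e.g. structural, word-level, sentence-level, document-level).
    hr = dihedron_mul(head, rel)
    return sum(float(x @ y) for x, y in zip(hr, tail))

# Basis sanity checks for the algebra (scalar components, D = 1):
i = (0.0, 1.0, 0.0, 0.0)
j = (0.0, 0.0, 1.0, 0.0)
print(dihedron_mul(i, i)[0])   # i^2 = -1
print(dihedron_mul(j, j)[0])   # j^2 = +1
print(dihedron_mul(i, j))      # i*j = k

# A toy triple with D = 2 and an identity-like relation (real part 1):
h = tuple(np.ones(2) for _ in range(4))
r = (np.ones(2), np.zeros(2), np.zeros(2), np.zeros(2))
print(triple_score(h, r, h))
```

Because the product mixes all four components of the head with all four of the relation, a single score couples the structural embedding with each textual representation pairwise, which is the interaction the abstract refers to.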
M. Nayyeri, Z. Wang, and Mst. M. Akter contributed equally to this work.
Acknowledgement
The authors thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting Zihao Wang. Zihao Wang and Mojtaba Nayyeri have been funded by the German Federal Ministry for Economic Affairs and Climate Action under Grant Agreement Number 01MK20008F (Service-Meister) and by the ATLAS project funded by the Bundesministerium für Bildung und Forschung (BMBF).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Nayyeri, M. et al. (2023). Integrating Knowledge Graph Embeddings and Pre-trained Language Models in Hypercomplex Spaces. In: Payne, T.R., et al. The Semantic Web – ISWC 2023. ISWC 2023. Lecture Notes in Computer Science, vol 14265. Springer, Cham. https://doi.org/10.1007/978-3-031-47240-4_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47239-8
Online ISBN: 978-3-031-47240-4