Abstract
Semantics in natural language processing depends largely on the contextual relationships between words and entities in a document collection. The context of a word may evolve; for example, the word “apple” currently carries two senses, a fruit and a technology company. Tracking such contextual changes in text data, such as scientific publications and news articles, can help us understand the evolution of innovations and events of interest. In this work, we present a new diffusion-based temporal word embedding model that captures short- and long-term changes in the semantics of entities across domains, modeling how the context of each entity shifts over time. Existing temporal word embeddings capture semantic evolution at a discrete, coarse-grained level, aiming to study how a language developed over a long period. Unlike these methods, our approach produces temporally smooth embeddings, which better support prediction and trend analysis. Extensive evaluations demonstrate that our model outperforms existing models in sense-making and in predicting future relationships between words and entities.
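The temporal smoothness emphasized above can be illustrated with a minimal sketch. This is not the authors' diffusion-based model; the exponential smoother and all function names here are illustrative assumptions. Given embedding matrices trained independently on each time slice, the smoother ties consecutive slices together, and cosine drift between consecutive slices quantifies how sharply a word's context shifts:

```python
import numpy as np

def smooth_embeddings(slices, alpha=0.5):
    """Exponentially smooth a list of per-time-slice embedding matrices.

    slices: list of (vocab_size, dim) arrays, one per time slice.
    alpha:  weight on the current slice; smaller alpha means smoother
            trajectories. alpha=1.0 reproduces the raw slices.
    """
    smoothed = [np.asarray(slices[0], dtype=float)]
    for E in slices[1:]:
        smoothed.append(alpha * np.asarray(E, dtype=float)
                        + (1.0 - alpha) * smoothed[-1])
    return smoothed

def drift(slices, word_idx):
    """Cosine distance of one word's vector between consecutive slices."""
    out = []
    for a, b in zip(slices, slices[1:]):
        u, v = a[word_idx], b[word_idx]
        cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
        out.append(1.0 - cos)
    return out
```

A large drift value for a word such as “apple” between two slices would flag a period where its dominant context changed; the smoothed trajectory makes such trends easier to extrapolate than the raw, independently trained slices.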
Ethics declarations
Financial and non-financial interests
The authors have no relevant financial or non-financial interests to disclose, and certify that they have no affiliations with or involvement in any organization or entity with a financial or non-financial interest in the subject matter or materials discussed in this manuscript.
Conflicts of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Cite this article
Farhan, A., Camacho Barranco, R., Akbar, M. et al. Temporal word embedding with predictive capability. Knowl Inf Syst 65, 5159–5194 (2023). https://doi.org/10.1007/s10115-023-01920-8