Abstract
Pre-trained word embeddings play a significant role in constructing representations for sentences, paragraphs, and documents. However, existing word embedding methods typically learn vectors in Euclidean space, where distributed embeddings suffer from inaccurate semantic similarity and high computational cost. In this study, we propose a global-locality preserving projection that refines word representations by re-embedding word vectors from the original embedding space into a manifold semantic space. Our method extracts the local features of word vectors while also preserving their global features; it can thus discover the local geometric structure, which also reflects the latent semantic structure, and obtain a compact word embedding subspace. We assess the method on several lexical-level intrinsic tasks of semantic similarity and semantic relatedness, and the experimental results demonstrate its advantages over other word embedding-based methods.
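To make the idea concrete, the sketch below re-embeds a matrix of word vectors with a projection that trades a locality-preserving term (a k-nearest-neighbour graph Laplacian, in the spirit of locality preserving projections) against a global total-scatter term. The abstract does not give the paper's exact objective, so the heat-kernel graph, the trace-difference criterion, and the mixing weight alpha used here are illustrative assumptions, not the authors' formulation.

import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def global_locality_projection(X, n_components=50, k=10, alpha=0.5, t=1.0):
    # X: (n_words, dim) matrix of pre-trained word vectors.
    # Returns re-embedded vectors of shape (n_words, n_components).
    # Note: the dense n x n graph below suits small vocabularies only.
    n = X.shape[0]
    # Local structure: k-nearest-neighbour graph with heat-kernel weights.
    sq_dists = cdist(X, X, "sqeuclidean")
    W = np.zeros((n, n))
    for i in range(n):
        neighbours = np.argsort(sq_dists[i])[1:k + 1]  # skip the point itself
        W[i, neighbours] = np.exp(-sq_dists[i, neighbours] / t)
    W = np.maximum(W, W.T)                             # symmetrize the graph
    L = np.diag(W.sum(axis=1)) - W                     # graph Laplacian
    # Global structure: total scatter via the centering matrix H.
    H = np.eye(n) - np.ones((n, n)) / n
    local_term = X.T @ L @ X    # small when neighbours stay close after projection
    global_term = X.T @ H @ X   # large when overall variance is preserved
    # Trace-difference criterion (an assumption): minimize local scatter while
    # rewarding global scatter; smallest eigenvalues give the projection axes.
    M = (1 - alpha) * local_term - alpha * global_term
    _, vecs = eigh(M)           # symmetric matrix, eigenvalues ascending
    return X @ vecs[:, :n_components]

For instance, calling global_locality_projection(E, n_components=100) on a (vocab, 300) GloVe matrix E would return 100-dimensional re-embedded vectors; alpha controls how strongly local neighbourhood geometry dominates global variance.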
Acknowledgements
This work was supported by the National Key Research and Development Program of China (No. 2018YFC0830603).
Cite this article
Wang, B., Sun, Y., Chu, Y. et al. Global-locality preserving projection for word embedding. Int. J. Mach. Learn. & Cyber. 13, 2943–2956 (2022). https://doi.org/10.1007/s13042-022-01574-y