Gemsec: Graph embedding with self clustering
B Rozemberczki, R Davies, R Sarkar… - Proceedings of the 2019 …, 2019 - dl.acm.org
Proceedings of the 2019 IEEE/ACM international conference on advances in …, 2019•dl.acm.org
Modern graph embedding procedures can efficiently process graphs with millions of nodes. In this paper, we propose GEMSEC, a graph embedding algorithm that learns a clustering of the nodes while computing their embedding. GEMSEC is a general extension of earlier work on sequence-based graph embedding. It places nodes in an abstract feature space where the vertex features minimize the negative log-likelihood of preserving sampled vertex neighborhoods, and it incorporates known social network properties through a machine learning regularization. We present two new social network datasets and show that, by considering the embedding and clustering problems jointly with respect to social properties, GEMSEC extracts high-quality clusters that are competitive with or superior to those of other community detection algorithms. In experiments, the method is found to be computationally efficient and robust to the choice of hyperparameters.
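The joint objective described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The abstract does not give the exact terms, so everything here is an assumption made for illustration: the function name `joint_loss`, the weight `gamma`, the full-softmax form of the neighborhood likelihood, the use of a single embedding matrix for both source and context nodes, and the distance-to-nearest-center clustering cost. The social regularization term is omitted.

```python
import numpy as np

def joint_loss(emb, centers, pairs, gamma=0.1):
    """Illustrative joint objective: neighborhood NLL plus a clustering cost.

    emb:     (n_nodes, dim) node embeddings
    centers: (n_clusters, dim) cluster centers
    pairs:   (n_pairs, 2) sampled (source, context) node indices
    gamma:   hypothetical weight on the clustering cost
    """
    # scores[v, u] = emb[v] . emb[u]; softmax is taken over all nodes u
    scores = emb @ emb.T
    # Numerically stable log-sum-exp for each source node
    row_max = scores.max(axis=1, keepdims=True)
    log_norm = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    src, ctx = pairs[:, 0], pairs[:, 1]
    # Negative log-likelihood of the sampled (source, context) pairs
    nll = np.sum(log_norm[src] - scores[src, ctx])
    # Euclidean distance from every node to every cluster center
    dists = np.linalg.norm(emb[:, None, :] - centers[None, :, :], axis=2)
    # Each node pays the distance to its nearest center
    cluster_cost = dists.min(axis=1).sum()
    return nll + gamma * cluster_cost, dists.argmin(axis=1)

# Toy usage on random data (values are arbitrary, not from the paper)
rng = np.random.default_rng(0)
emb = rng.normal(size=(6, 4))
centers = rng.normal(size=(2, 4))
pairs = np.array([[0, 1], [1, 2], [3, 4], [4, 5]])
loss, labels = joint_loss(emb, centers, pairs)
embedding_only, _ = joint_loss(emb, centers, pairs, gamma=0.0)
```

In a training loop, both `emb` and `centers` would be updated by gradient steps on this loss, which is what lets the clustering shape the embedding instead of being computed afterwards.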