Evaluating Self-Supervised Learning for Molecular Graph Embeddings

Wang, Hanchen; Kaddour, Jean; Liu, Shengchao; Tang, Jian; Lasenby, Joan; Liu, Qi

arXiv:2206.08005 (cs)

[Submitted on 16 Jun 2022 (v1), last revised 18 Oct 2023 (this version, v3)]

Abstract: Graph Self-Supervised Learning (GSSL) provides a robust pathway for acquiring embeddings without expert labelling, a capability that carries profound implications for molecular graphs due to the staggering number of potential molecules and the high cost of obtaining labels. However, GSSL methods are designed not for optimisation within a specific domain but rather for transferability across a variety of downstream tasks. This broad applicability complicates their evaluation. Addressing this challenge, we present "Molecular Graph Representation Evaluation" (MOLGRAPHEVAL), generating detailed profiles of molecular graph embeddings with interpretable and diversified attributes. MOLGRAPHEVAL offers a suite of probing tasks grouped into three categories: (i) generic graph, (ii) molecular substructure, and (iii) embedding space properties. By leveraging MOLGRAPHEVAL to benchmark existing GSSL methods against both current downstream datasets and our suite of tasks, we uncover significant inconsistencies between inferences drawn solely from existing datasets and those derived from more nuanced probing. These findings suggest that current evaluation methodologies fail to capture the entirety of the landscape.

Comments:	Camera ready, NeurIPS Benchmark 2023
Subjects:	Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
Cite as:	arXiv:2206.08005 [cs.LG]
	(or arXiv:2206.08005v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.08005

Submission history

From: Hanchen Wang
[v1] Thu, 16 Jun 2022 09:01:53 UTC (5,222 KB)
[v2] Thu, 8 Jun 2023 15:52:21 UTC (6,036 KB)
[v3] Wed, 18 Oct 2023 08:51:16 UTC (6,038 KB)
