Abstract

Motivation

Expanding our knowledge of small molecules beyond what is known in nature or designed in wet laboratories promises to significantly advance cheminformatics, drug discovery, biotechnology and material science. In silico molecular design remains challenging, primarily due to the complexity of the chemical space and the non-trivial relationship between chemical structures and biological properties. Deep generative models that learn directly from data are intriguing, but they have yet to demonstrate interpretability in the learned representation, which would allow us to learn more about the relationship between the chemical and the biological space. In this article, we advance research on disentangled representation learning for small molecule generation. We build on recent work by us and others on deep graph generative frameworks, which capture atomic interactions via a graph-based representation of a small molecule. The methodological novelty is how we leverage the concept of disentanglement in the graph variational autoencoder framework both to generate biologically relevant small molecules and to enhance model interpretability.

Results

Extensive qualitative and quantitative experimental evaluation in comparison with state-of-the-art models demonstrates the superiority of our disentanglement framework. We believe this work is an important step toward addressing key challenges in small molecule generation with deep generative frameworks.

Availability and implementation

Training and generated data are made available at https://ieee-dataport.org/documents/dataset-disentangled-representation-learning-interpretable-molecule-generation. All code is made available at https://anonymous.4open.science/r/D-MolVAE-2799/.

Supplementary information

Supplementary data are available at Bioinformatics online.

1 Introduction

Expanding our knowledge of small molecules beyond what is known in nature or designed in wet laboratories promises to significantly advance drug discovery, biotechnology and material science (Whitesides, 2015). In-silico molecule design is central to cheminformatics research but remains challenging (Schneider and Schneider, 2016). Studies estimate that 10^60 drug-like molecules are synthetically accessible (Reymond et al., 2012). A chemical space of this size is beyond the scope of even high-throughput wet-laboratory technologies.

A multi-decade journey in cheminformatics research informs us of several challenges for small molecule generation. The first concerns the poorly understood and complex relationship between chemical and biological space. Not all molecules in the vast chemical space meet desired biological/functional properties of interest, such as water solubility, drug-likeness and more (Ramakrishnan et al., 2014). Moreover, changes to the chemical structure made to optimize along one biological criterion may worsen other criteria; the search space that links chemical and biological space may be rich in barriers separating neighboring local optima.

Until a decade ago, molecule generation, widely referred to as computational screening, was dominated by similarity search methods (Stumpfe and Bajorath, 2011). While conceptually straightforward, these methods were limited in their ability to generate novel small molecules. Advances in machine learning expedited progress. Shallow models were not very effective (Ellman, 1996; Renz et al., 2020; Xue et al., 2019; Yoshikawa et al., 2018), as they relied heavily on domain insight to formulate and construct meaningful representations of small molecules. Due to their inherent ability to learn directly from data, deep generative models then made their debut. Initial efforts utilized a linear representation of molecules, known as SMILES (Weininger, 1988), which stands for ‘simplified molecular-input line-entry system’. SMILES is a formal grammar that describes molecules with an alphabet of characters; aromatic and aliphatic carbon atoms are denoted by ‘c’ and ‘C’, respectively, oxygen atoms by ‘O’, single bonds by ‘–’, double bonds by ‘=’, and so on. The SMILES representation allows addressing molecule generation as a string generation problem. Deep learning methods based on the recurrent neural network (RNN) framework thus became readily applicable (Gómez-Bombarelli et al., 2018; Kusner et al., 2017; Segler et al., 2018). However, SMILES-based deep models could generate only few valid molecules. In response, later works (Dai et al., 2018; Kusner et al., 2017) added syntactic and semantic constraints. In other works, models were guided to generate valid SMILES through active learning, reinforcement learning and additional training signals (Guimaraes et al., 2017; Janz et al., 2017). While some improvements were observed, generating valid molecules remained challenging.

Graph-generative deep models leverage a more expressive representation of a molecule via the concept of a molecular graph. The atoms are represented as vertices and the bonds as edges connecting the vertices. In the deep learning literature, graph-generative models are based on the variational autoencoder (VAE) (Blaschke et al., 2018; Dai et al., 2018; De Samanta et al., 2018; Jin et al., 2018; Simonovsky and Komodakis, 2018) or on generative adversarial networks (Bojchevski et al., 2018; Guo et al., 2018). For instance, GraphRNN (You et al., 2018) builds an autoregressive generative model based on a generative RNN that generates the graph one vertex at a time. In contrast, GraphVAE (Simonovsky and Komodakis, 2018) represents each graph in terms of its adjacency matrix and the feature vectors of its vertices. A VAE model is then utilized to learn the distribution of the graphs conditioned on a latent representation at the graph level. Other works (Grover et al., 2019; Kipf and Welling, 2016) encode the vertices into vertex-level embeddings and predict the edges between each pair of vertices to generate a graph.

The adoption of graph-generative models for small molecule generation has been rapid. Current graph-generative models for molecule generation leverage the VAE framework to address two subtasks: (i) encoding: learning a low-dimensional, latent code/representation of a molecular graph; (ii) decoding: learning to map the latent representation back into a (reconstructed) molecular graph. For instance, work in Simonovsky and Komodakis (2018) generates molecular graphs by predicting their adjacency matrices. Work in Liu et al. (2018) generates molecules through a constrained graph-generative model that enforces validity by generating a molecule one atom at a time. These works generate more valid molecules than SMILES-based models and additionally subject generated molecules to the sanitization checks in RDKit (RDKit: Open-source cheminformatics; http://www.rdkit.org).
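As a concrete illustration of such a sanitization check (a minimal sketch of our own, not code taken from any of the cited models), RDKit can be used to test whether a generated SMILES string corresponds to a chemically valid molecule:

```python
# A minimal sketch (not code from the cited models): check whether a SMILES
# string corresponds to a molecule that passes RDKit's sanitization.
from rdkit import Chem

def is_valid_smiles(smiles: str) -> bool:
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return False
    try:
        Chem.SanitizeMol(mol)  # valence, aromaticity and kekulization checks
        return True
    except Exception:
        return False

print(is_valid_smiles("c1ccccc1O"))   # phenol -> True
print(is_valid_smiles("C1CC"))        # unclosed ring -> False
```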

Graph-generative VAEs represent a promising platform that we leverage in this article, but current graph-generative VAEs for small molecule generation fall short. The learned latent representation has all the latent factors entangled, which limits model transparency and interpretability. Specifically, these models do not facilitate linking the chemical space to the biological space and so do not advance our understanding of the complex relationship between chemical and biological space for small molecules. Facilitating this linking is central not only for molecule generation but also for molecule optimization (Alemi et al., 2017), an important and related task that is beyond the scope of this article.

In this article, we advance research on small molecule representation learning for molecule generation via disentanglement enhancement. Disentangled representation learning is an active research area, particularly in image representation learning (Alemi et al., 2017; Chen et al., 2018; Guo et al., 2021; Higgins et al., 2017; Kim and Mnih, 2018), and has been shown to be key to improving model generalizability and robustness against adversarial attacks, and even to facilitate debugging and auditing (Alemi et al., 2017; Doshi-Velez and Kim, 2017). While a comprehensive review is beyond the scope of this article, we point to recent approaches that modify the VAE objective by adding, removing or altering the weight of individual terms in the loss function to improve disentanglement (Alemi et al., 2017; Chen et al., 2018; Du et al., 2021a; Esmaeili et al., 2019; Guo et al., 2020; Kim and Mnih, 2018; Kumar et al., 2018; Lopez et al., 2018; Zhao et al., 2019). Currently, however, we do not know the best approach to learn disentangled representations of graph data. This includes the small molecule generation domain. In a recent workshop paper (Du et al., 2020), we demonstrated that learning disentangled representations results in better molecule generation than methods that do not leverage disentanglement. However, as our goal was a proof-of-concept demonstration that VAEs for disentangled representation learning perform well for small molecule generation, the study was limited to classic disentanglement and focused on a few datasets of known small molecules of the same size.

Here we propose a graph-generative VAE framework that learns a disentangled code/representation, so that we may additionally elucidate how the factors that encode chemical structure control biological properties. Specifically, we design and evaluate the D-MolVAE framework, which stands for Disentangled Molecule VAE. The framework permits various mechanisms for disentanglement, resulting in several novel deep graph-generative models, which we compare to one another and many other state-of-the-art methods on benchmark datasets across several metrics.

Our experiments show that the D-MolVAE framework is effective and superior to other methods at generating valid, novel and unique small molecules. The framework also accommodates variable-size molecules, which improves its scope and applicability. Our experiments additionally show that disentangled representation learning is valuable for better interpretation and understanding of the relationship between the chemical space and the biological space; the proposed D-MolVAE models are better able to capture the underlying graph statistics and distributions of various biological properties.

The D-MolVAE models effectively implement a trade-off between the disentanglement enhancement and the reconstruction. Our experiments show that explicit disentanglement enforcement does not hurt performance. In fact, the models are superior over many methods. Taken altogether, our findings suggest that the disentangled factors provide an advantage with respect to the quality of generated molecules, as well as the linking of the chemical and biological space. Our experiments suggest several models as promising platforms for further exploring disentangled representations for improving small molecule generation.

2 Materials and methods

We first define and formalize the problem. Then we describe the graph-generative models based on the VAE framework, namely D-MolVAE, focusing the description on the variants of disentanglement terms proposed to obtain different disentangled graph-generative VAE models.

2.1 Problem formulation

Let us represent a molecule as a graph G = (V, E, E, F). The N atoms of the molecule constitute the N vertices V of graph G. The M bonds connecting pairs of atoms in the molecule constitute the edges E ⊆ V × V, where e_{i,j} ∈ E is an edge connecting vertices v_i ∈ V and v_j ∈ V. G = (V, E, E, F) also contains E and F. E ∈ R^{N×N×K} is the edge-type tensor that records the K bond types. Specifically, E_{i,j} ∈ R^{1×K} is a one-hot vector encoding the type of edge e_{i,j}. F ∈ R^{N×K} is the vertex-type feature matrix that records the K atom types. Specifically, F_i ∈ R^{1×K} is the one-hot vector denoting the type of atom v_i.

The objective in graph-generative disentangled representation learning is to learn the joint distribution of G and a set of generative, disentangled latent factors/variables Z ∈ R^{N×L}, such that the observed graph G can be generated as p(G|Z). Note that L is the dimensionality of the latent factors. Disentanglement denotes the additional constraint that the individual variables in Z be independent of one another.
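For illustration, the following minimal sketch (our own; the atom and bond vocabularies are assumptions in the spirit of QM9, not the exact ones used by D-MolVAE) shows how the one-hot tensors E and F of this formulation can be built from an RDKit molecule:

```python
# A minimal sketch (assumed QM9-style vocabularies, not the authors' code) that
# builds the one-hot edge-type tensor E and vertex-type matrix F of Section 2.1.
import numpy as np
from rdkit import Chem

ATOM_TYPES = ["C", "N", "O", "F"]                                   # assumption
BOND_TYPES = [Chem.BondType.SINGLE, Chem.BondType.DOUBLE, Chem.BondType.TRIPLE]

def mol_to_graph(smiles: str):
    mol = Chem.MolFromSmiles(smiles)
    n = mol.GetNumAtoms()
    F = np.zeros((n, len(ATOM_TYPES)))        # vertex-type feature matrix
    E = np.zeros((n, n, len(BOND_TYPES)))     # edge-type tensor
    for atom in mol.GetAtoms():
        F[atom.GetIdx(), ATOM_TYPES.index(atom.GetSymbol())] = 1.0
    for bond in mol.GetBonds():
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        t = BOND_TYPES.index(bond.GetBondType())
        E[i, j, t] = E[j, i, t] = 1.0         # molecular graphs are undirected
    return E, F

E, F = mol_to_graph("N#CC=O")                 # a small QM9-like molecule
print(E.shape, F.shape)                       # (4, 4, 3) (4, 4)
```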

2.2 D-MolVAE framework

Two challenges present themselves with the above formulation: (i) how to integrate the disentanglement constraint and the reconstruction quality constraint in the loss function that guides learning; (ii) how to efficiently encode and decode molecules/graphs of different sizes. We first show how the first challenge is addressed in the D-MolVAE framework via a generative objective function. We show in this context that different approaches here can result in different models. Then we show how the second challenge is addressed via variable-size edge-to-edge and edge-to-vertex convolution operators in D-MolVAE.

We are inspired by disentangled representation learning in the image domain (Higgins et al., 2017), where a suitable objective in learning p(G|Z) is to maximize the marginal (log-)likelihood of the observed graph G in expectation over the whole distribution of the latent variable set Z ∈ R^{N×L}, as max_θ E_{pθ(Z)}[pθ(G|Z)], where θ explicitly denotes the parameters characterizing this distribution.

Learning pθ(G|Z) requires the inference of its posterior pθ(Z|G), which is intractable. So, one defines instead an approximated posterior qϕ(Z|G) that is computationally tractable. In disentangled representation learning, one needs to additionally ensure that the inferred latent variables Z from qϕ(Z|G) capture all the generative factors in a disentangled manner. This is achieved by introducing a constraint to match qϕ(Z|G) to a well-disentangled prior p(Z) that controls the capacity of the latent information bottleneck and embodies the statistical independence mentioned above. An isotropic unit Gaussian suffices; that is, p(Z) = N(0, I), where I is the identity matrix. This leads to the following constrained optimization problem:

max_{θ,ϕ} E_{G∼D}[E_{qϕ(Z|G)}[log pθ(G|Z)]]   subject to   DKL(qϕ(Z|G) ∥ p(Z)) < ϵ   (1)

In the above equation, D refers to the observed set of graphs (corresponding to molecules in the training dataset), DKL(·) is the Kullback–Leibler divergence (KLD) that allows comparing two probability distributions and ϵ is a parameter that specifies the strength of the applied constraint; that is, ϵ allows weighting how much we want the disentanglement constraint to be enforced.

Unfortunately, the above constraint formulated to achieve disentanglement is intractable. So, an aggregate objective (loss) function is formulated instead, where the above constraint and the reconstruction error in a VAE are combined together as in:

L(θ, ϕ; G, Z, β) = E_{qϕ(Z|G)}[log pθ(G|Z)] − β DKL(qϕ(Z|G) ∥ p(Z))   (2)

This aggregation is similar to the beta-VAE (Higgins et al., 2017) that first introduced the notion of disentanglement (though not for graph data). Note that β weighs how important it is to enforce the disentanglement constraint. Specifically, when β = 1, one obtains a vanilla VAE (Kingma and Welling, 2013). We direct the interested reader to work in Higgins et al. (2017) to understand the effects of β.
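To make Equation (2) concrete, the following is a minimal PyTorch-style sketch (ours, not the released D-MolVAE implementation) of the β-weighted objective for a diagonal-Gaussian approximate posterior:

```python
# A minimal PyTorch sketch (ours, not the released D-MolVAE code) of the
# beta-weighted objective in Equation (2) for a diagonal-Gaussian posterior.
import torch

def beta_vae_loss(recon_log_likelihood, mu, logvar, beta=1.0):
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
    # Negative ELBO: maximize the reconstruction likelihood, weight the KL by beta.
    return (-recon_log_likelihood + beta * kld).mean()

# Toy usage with random tensors standing in for encoder/decoder outputs.
mu, logvar = torch.randn(16, 100), torch.randn(16, 100)
print(beta_vae_loss(torch.randn(16), mu, logvar, beta=4.0).item())
```

Setting beta=1.0 recovers the vanilla VAE objective, as noted above.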

2.3 Disentanglement-enhanced models

By considering different approaches to enforce disentanglement, we obtain different instantiations of our D-MolVAE framework, namely, D-MolVAE-V, D-MolVAE-β, D-MolVAE-DIP-I, D-MolVAE-DIP-II and D-MolVAE-VIB.

D-MolVAE-V: We extend the previous work on disentangled variational autoencoders (Esmaeili et al., 2019; Kingma and Welling, 2013) to graph-structured data, as follows:
(3)

In the above, terms ③ and ④ enforce consistency between the marginal distributions over G and Z. Specifically, minimizing the KLD in term ③ maximizes the marginal likelihood Eq(G)[log pθ(G)]; maximizing the disentangled inferred-priors term ④ penalizes the distance between qϕ(Z) and p(Z). Terms ① and ② enforce consistency between the conditional distributions. Specifically, term ① maximizes the correlation between each Z and the graph Gn it generates; when Z∼qϕ(Z|Gn) is sampled, the likelihood pθ(Gn|Z) should be higher than the marginal likelihood pθ(Gn). Meanwhile, term ② regularizes term ① by minimizing the mutual information I(Z, G) in the inference model.

The D-MolVAE-V objective is defined as:
(4)
D-MolVAE-β: The penalty coefficient β>1 has proven useful to enforce the disentanglement of the latent variables without worsening reconstruction performance (Higgins et al., 2017). We emphasize that β allows balancing between the reconstruction loss and the KLD loss. So, our first model that introduces disentanglement for graph-based representation learning for small molecule generation is D-MolVAE-β. Its objective is similar to that of D-MolVAE-V. The only difference is that the KLD terms are weighted by β, i.e. ① + ② + β(③ + ④), as follows:
(5)
D-MolVAE-DIP-I: It is important to note that the β-weighted KLD term may lead to poor reconstruction when the disentanglement is heavily enforced by setting high values for the β parameter. To address this, the ‘Disentangled Inferred Prior Variational Autoencoder’ (DIPVAE) model introduces a hyperparameter λ in term ④ (Kumar et al., 2018). This term is also referred to as the ‘inferred priors’ term and penalizes the distance between qϕ(Z) and p(Z). The hyperparameter allows controlling the trade-off between the reconstruction loss and the KLD term. We incorporate this idea to obtain D-MolVAE-DIP-I, whose objective function is now ① + ② + ③ + λ④, as in:
(6)
D-MolVAE-DIP-II: Note that term ② corresponds to the mutual information I(Z, G) between the latent representation Z and the molecule G; minimizing this mutual information may lead to poor reconstruction. An alternative approach to balance disentanglement and reconstruction is to discard term ②, thus obtaining D-MolVAE-DIP-II, whose objective function now is:
(7)
D-MolVAE-VIB: The Variational Information Bottleneck (VIB) approach interprets the capacity of the KLD as the information bottleneck of the network (Alemi et al., 2017). It proposes to add a controllable value C and a hyperparameter γ over the KLD term to control the information flowing through it. Later work demonstrates that by slowly increasing the value of C, the latent representation is able to gradually capture the semantic factors (Locatello et al., 2018). Inspired by these works, we obtain our final model D-MolVAE-VIB, whose objective function is:
(8)
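The following sketch illustrates one plausible reading of this capacity-controlled penalty (our own simplification in the spirit of Alemi et al. (2017); not necessarily the exact form of Equation (8)):

```python
# A sketch (our reading, not necessarily the exact Equation (8)) of a
# capacity-controlled KL penalty in the spirit of D-MolVAE-VIB: the KL term
# is pushed toward a target capacity C that is slowly increased during training.
import torch

def vib_style_loss(recon_log_likelihood, kld, gamma=10.0, capacity=0.0):
    # kld: KL(q_phi(Z|G) || p(Z)) per graph; capacity C is annealed upward.
    return (-recon_log_likelihood + gamma * (kld - capacity).abs()).mean()

def capacity_schedule(step, max_capacity=25.0, anneal_steps=10_000):
    # Linear annealing of C; slowly raising C lets the latent code gradually
    # absorb more information, as discussed above.
    return min(max_capacity, max_capacity * step / anneal_steps)
```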

2.4 Implementation details

The variants are summarized in terms of their objectives in Table 1. The encoder and decoder architecture are summarized in Table 2. Finally, the hyperparameters used for training are related in Table 3. The rows refer to the different benchmark datasets, which we describe in Section 3. We observe that increasing β leads to a better disentangled representation, as later shown in Table 7.

Table 1.

Summary of D-MolVAE variants in terms of their disentanglement objectives

Model | Objective
D-MolVAE-V | ① + ② + ③ + ④
D-MolVAE-β | ① + ② + β(③ + ④)
D-MolVAE-DIP-I | ① + ② + ③ + λ④
D-MolVAE-DIP-II | ① + ③ + ④
D-MolVAE-VIB | ① + ② + γ|③ + ④ − C|
Table 2.

Encoder and decoder architectures

Encoder | Decoder
Input: G = (V, E, E, F) | Input: z ∈ R^100
FC.100 ReLU | FC.100 ReLU
GGNN.100 ReLU | GGNN.100 ReLU
GGNN.100 ReLU | GGNN.100 ReLU
FC.100 | FC.bv (batch node size), FC.3 (edge)

Note: Each layer is expressed in the format <layer_type>.<num_channels> <activation_function>. FC refers to the fully connected layers.


Table 3.

Hyperparameters used for training

Dataset | Learning_rate | Batch_size | λ | Num_iteration
QM9 | 5e−4 | 64 | 1 | 10
ZINC | 5e−4 | 8 | 1 | 5
MOSES | 5e−4 | 4 | 1 | 5
CHEMBL | 5e−4 | 4 | 1 | 5

3 Results

3.1 Datasets and experimental setup

We employ four benchmark datasets: QM9, ZINC, MOSES and ChEMBL (Du et al., 2021b). QM9 (Ramakrishnan et al., 2014; Ruddigkeit et al., 2012) contains around 134k stable small organic molecules with up to nine heavy atoms [e.g. carbon (C), oxygen (O), nitrogen (N) and fluorine (F)]. ZINC (Irwin et al., 2012) contains approximately 250k drug-like chemical compounds with an average of 23 heavy atoms. The molecules in this dataset are more complex than those in QM9. MOSES (Polykovskiy et al., 2020) contains about 1.9M larger molecules with up to 30 heavy atoms. ChEMBL (Gaulton et al., 2017) contains about 1.8M manually curated bioactive molecules with drug-like properties. For QM9, we use the entire dataset, while for ZINC, MOSES and ChEMBL, which have larger molecules, we randomly sample 70k molecules from each dataset and split them 6:1 into training and validation sets. During testing, we generate 30k molecules for our experiments.
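For concreteness, the subsampling and split described above can be reproduced along the following lines (a sketch; the file name is a placeholder, not part of our released data):

```python
# A minimal sketch (the file name is a placeholder) of the 70k subsample and
# 6:1 train/validation split used for ZINC, MOSES and ChEMBL.
import random

with open("zinc_smiles.txt") as f:            # placeholder input file
    all_smiles = [line.strip() for line in f]

random.seed(0)
subset = random.sample(all_smiles, 70_000)    # 70k molecules
n_train = int(len(subset) * 6 / 7)            # 6:1 split
train, valid = subset[:n_train], subset[n_train:]
print(len(train), len(valid))                 # 60000 10000
```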

We utilize qualitative and quantitative experiments that evaluate the proposed D-MolVAE-V, D-MolVAE-β, D-MolVAE-DIP-I, D-MolVAE-DIP-II and D-MolVAE-VIB. The models are pitched against nine state-of-the-art deep generative models for molecule generation: ChemVAE (Gómez-Bombarelli et al., 2018), GrammarVAE (Kusner et al., 2017), GraphVAE (Simonovsky and Komodakis, 2018), GraphGMG (Li et al., 2018), SMILES-LSTM (Sundermeyer et al., 2012), GraphNVP (Madhawa et al., 2019), GRF (Honda et al., 2019), GraphAF (Shi et al., 2019) and CGVAE (Liu et al., 2018). In the interest of brevity, summaries of the main computational ingredients in each of these models are related in the Supplementary Material. All experiments are conducted on a 64-bit machine with a 6 core Intel CPU i9-9820X, 32 GB RAM and an NVIDIA GPU (GeForce RTX 2080ti, 1545 MHz, 11 GB GDDR6).

3.2 Evaluating the quality of generated molecules

Table 4 relates the comparative analysis. Each trained model is used to generate 30k molecules. For GraphGMG, we obtain 20k generated molecules from the GraphGMG authors. Results for ChemVAE, GrammarVAE, GraphVAE and SMILES-LSTM are obtained from Liu et al. (2018). The quality of the generated dataset is evaluated via the three common metrics of Novelty, Uniqueness and Validity. Novelty measures the fraction of generated molecules that are not in the training dataset. Uniqueness measures the ratio of the number of generated molecules after removing duplicates to the number before. Validity measures the fraction of generated molecules that are chemically valid.
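For reference, the three metrics can be computed along the following lines (a minimal sketch of our own, using RDKit canonical SMILES for duplicate detection; exact implementation details may differ, e.g. novelty is computed here over the unique set):

```python
# A minimal sketch (ours) of the three metrics: validity via RDKit parsing,
# uniqueness via canonical SMILES, and novelty against the training set.
from rdkit import Chem

def evaluate(generated_smiles, training_smiles):
    canonical = []
    for s in generated_smiles:
        mol = Chem.MolFromSmiles(s)
        if mol is not None:                            # chemically valid
            canonical.append(Chem.MolToSmiles(mol))    # canonical form
    validity = len(canonical) / len(generated_smiles)
    unique = set(canonical)
    uniqueness = len(unique) / max(len(canonical), 1)
    train = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in training_smiles}
    novelty = sum(s not in train for s in unique) / max(len(unique), 1)
    return validity, uniqueness, novelty

print(evaluate(["CCO", "CCO", "c1ccccc1", "C1CC"], ["CCO"]))
```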

Table 4.

Novelty, uniqueness and validity, shown in %, are measured on a generated dataset

Model | QM9 Validity | QM9 Novelty | QM9 Unique | ZINC Validity | ZINC Novelty | ZINC Unique
ChemVAE | 10.00 | 90.00 | 67.50 | 17.00 | 98.00 | 30.98
GrammarVAE | 30.00 | 95.44 | 9.30 | 31.00 | 100.00 | 10.76
GraphVAE | 61.00 | 85.00 | 40.90 | 14.00 | 100.00 | 31.60
GraphGMG | 89.20 | 89.10 | 99.41 | — | — | —
SMILES-LSTM | 94.78 | 82.98 | 96.94 | 96.80 | 100.00 | 99.97
GraphNVP | 90.10 | 54.00 | 97.30 | 74.40 | 100.00 | 94.80
GRF | 84.50 | 58.60 | 66.00 | 73.40 | 100.00 | 53.70
GraphAF | 100.00 | 88.83 | 94.51 | 100.00 | 100.00 | 99.10
CGVAE | 100.00 | 96.33 | 98.03 | 100.00 | 100.00 | 99.95
D-MolVAE-V | 100.00 | 96.10 | 99.15 | 100.00 | 100.00 | 99.95
D-MolVAE-β | 100.00 | 95.35 | 96.62 | 100.00 | 100.00 | 99.72
D-MolVAE-DIP-I | 100.00 | 97.36 | 97.80 | 100.00 | 99.99 | 99.88
D-MolVAE-DIP-II | 100.00 | 98.31 | 72.36 | 100.00 | 100.00 | 51.42
D-MolVAE-VIB | 100.00 | 95.85 | 98.66 | 100.00 | 100.00 | 99.18

Note: The highest value achieved on a metric is highlighted in boldface.


Table 4 allows making several observations. ChemVAE, GrammarVAE and GraphVAE have the lowest performance. The D-MolVAE models achieve superior performance over the other models. In particular, all D-MolVAE models achieve 100% validity on all datasets. Similar performance is observed on uniqueness as well. Varied performance is observed on novelty, though all D-MolVAE models consistently outperform or match the performance of the other models; CGVAE is the only other model with consistently good performance across all metrics on all datasets. This is not surprising, as our proposed models build over the CGVAE architecture but additionally enforce disentanglement. The explicit disentanglement enforcement seems to provide some benefit in higher novelty over CGVAE, in particular on the QM9 dataset. Taken altogether, these results suggest that the disentanglement enforcement does not reduce and actually improves performance; adding the disentanglement regularization does not influence the reconstruction error and so does not sacrifice the quality of generated molecules. It is worth noting that some of the proposed models, such as D-MolVAE-DIP-I and D-MolVAE-DIP-II, generate more novel molecules. Between the two, D-MolVAE-DIP-II generates more novel (nearly 100%) yet less unique (50–70%) molecules due to the stronger constraint exerted by the KL divergence term. In Table 5, we further evaluate the performance of our proposed methods and the strongest baseline, CGVAE, on two additional datasets, MOSES and ChEMBL. On the MOSES dataset, all models achieve 100% validity and novelty, while D-MolVAE-VIB and D-MolVAE-DIP-I also achieve 100% uniqueness. On the ChEMBL dataset, all models achieve comparable results, with the exception of D-MolVAE-V on uniqueness.

Table 5.

Novelty, uniqueness and validity, shown in %, are measured on a generated dataset

Model | MOSES Validity | MOSES Novelty | MOSES Unique | CHEMBL Validity | CHEMBL Novelty | CHEMBL Unique
CGVAE | 99.97 | 99.97 | 95.33 | 100.00 | 99.97 | 99.85
D-MolVAE-V | 100.00 | 100.00 | 99.70 | 100.00 | 100.00 | 14.85
D-MolVAE-β | 100.00 | 100.00 | 99.73 | 100.00 | 100.00 | 99.35
D-MolVAE-DIP-I | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 99.96
D-MolVAE-DIP-II | 100.00 | 100.00 | 56.53 | 100.00 | 100.00 | 99.93
D-MolVAE-VIB | 100.00 | 100.00 | 100.00 | 100.00 | 99.97 | 99.88

Note: The highest value achieved on a metric is highlighted in boldface.


3.3 Comparing the learned distribution to the training distribution

Given the above results, we now focus the comparison of our models against CGVAE. We measure the distance between the generated and the training datasets in terms of molecular properties and graph statistics, as shown in Table 6, utilizing two popular metrics, the maximum mean discrepancy (MMD) (You et al., 2018) and the KL divergence (KLD) (You et al., 2018). MMD is used when comparing distributions of graph statistics, and KLD is used when comparing distributions of molecular properties; the molecular properties of interest are selected due to their low correlation, which is ideal for the disentanglement experiment setting that requires independent semantic factors. The correlation heatmap between commonly used molecular properties evaluated on the QM9 dataset is shown in Supplementary Figure S1. All these statistics are described in detail in the Supplementary Material, where we also draw randomly selected QM9 molecules over the generated dataset for each of the models.
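As an illustration of the property-distribution comparison, the KLD between a property computed on the training and the generated sets can be estimated from histograms as in the following sketch (ours; PSA is computed here via RDKit's TPSA descriptor, and the direction of the KL is an assumption):

```python
# A minimal sketch (ours) of the property-distribution comparison: a histogram
# estimate of KLD between a property on the training and the generated sets.
import numpy as np
from rdkit import Chem
from rdkit.Chem import Descriptors

def property_kld(train_smiles, gen_smiles, prop=Descriptors.TPSA, bins=50):
    t = [prop(Chem.MolFromSmiles(s)) for s in train_smiles]
    g = [prop(Chem.MolFromSmiles(s)) for s in gen_smiles]
    lo, hi = min(t + g), max(t + g)
    p, _ = np.histogram(t, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(g, bins=bins, range=(lo, hi), density=True)
    p, q = p + 1e-10, q + 1e-10               # smooth empty bins
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))   # KL(train || generated)
```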

Table 6.

Comparing the difference between the training and generated distributions of graph properties via MMD and KLD

Dataset | Metric | CGVAE | Mol-V | Mol-β | Mol-DI | Mol-DII | Mol-VIB
QM9 | MMD(Deg) | 0.0167 | 0.0258 | 0.0541 | 0.0838 | 0.0238 | 0.0232
QM9 | MMD(CC) | 0.0097 | 0.0051 | 0.0259 | 0.0175 | 0.0095 | 0.0045
QM9 | MMD(Orbit) | 0.0018 | 0.0210 | 0.0021 | 0.0079 | 0.0031 | 0.0017
QM9 | KLD(cLogP) | 0.08 | 0.41 | 0.44 | 0.35 | 0.46 | 0.01
QM9 | KLD(cLogS) | 0.06 | 0.27 | 0.26 | 0.18 | 1.23 | 0.13
QM9 | KLD(Drug) | 0.07 | 0.15 | 0.08 | 0.18 | 0.22 | 0.04
QM9 | KLD(RPSA) | 0.04 | 0.29 | 0.11 | 0.18 | 0.51 | 0.04
QM9 | KLD(PSA) | 0.03 | 0.07 | 0.07 | 0.30 | 0.09 | 0.03
QM9 | KLD(SA) | 0.44 | 0.21 | 0.50 | 0.89 | 0.16 | 0.20
ZINC | MMD(Deg) | 0.0023 | 0.0005 | 0.0043 | 0.0034 | 0.7962 | 0.0111
ZINC | MMD(CC) | 0.0013 | 0.0002 | 0.0013 | 0.0005 | 0.0316 | 0.0363
ZINC | MMD(Orbit) | 0.0005 | 0.0731 | 0.0001 | 0.0001 | 0.0001 | 0.0006
ZINC | KLD(cLogP) | 0.67 | 0.59 | 0.09 | 0.67 | 0.30 | 0.23
ZINC | KLD(cLogS) | 0.74 | 0.04 | 0.09 | 0.74 | 0.58 | 0.10
ZINC | KLD(Drug) | 1.29 | 1.63 | 0.97 | 1.29 | 1.52 | 0.01
ZINC | KLD(RPSA) | 0.78 | 0.47 | 0.31 | 0.79 | 1.17 | 0.08
ZINC | KLD(PSA) | 0.56 | 0.06 | 0.14 | 0.59 | 0.01 | 0.12
ZINC | KLD(SA) | 0.56 | 0.75 | 0.79 | 0.76 | 2.29 | 0.82
MOSES | MMD(Deg) | 0.0052 | 0.0032 | 0.0031 | 0.0220 | 0.4520 | 0.0024
MOSES | MMD(CC) | 0.0003 | 0.0027 | 0.0004 | 0.0005 | 0.0000 | 0.0002
MOSES | MMD(Orbit) | 0.0009 | 0.0013 | 0.0002 | 0.0006 | 0.0217 | 0.0005
MOSES | KLD(cLogP) | 0.47 | 0.01 | 0.96 | 0.12 | 0.37 | 0.25
MOSES | KLD(cLogS) | 0.22 | 0.21 | 0.17 | 1.01 | 0.50 | 0.16
MOSES | KLD(Drug) | 0.35 | 0.56 | 0.84 | 1.41 | 0.33 | 0.48
MOSES | KLD(RPSA) | 0.04 | 0.01 | 0.18 | 0.93 | 0.97 | 0.05
MOSES | KLD(PSA) | 0.07 | 0.22 | 0.36 | 0.71 | 0.58 | 0.07
MOSES | KLD(SA) | 1.57 | 1.76 | 1.85 | 1.25 | 3.57 | 1.09
CHEMBL | MMD(Deg) | 0.0028 | 0.6634 | 0.0022 | 0.0015 | 0.0013 | 0.0025
CHEMBL | MMD(CC) | 0.0002 | 0.0010 | 0.0004 | 0.0001 | 0.0002 | 0.0001
CHEMBL | MMD(Orbit) | 0.0004 | 0.0424 | 0.0010 | 0.0002 | 0.0002 | 0.0004
CHEMBL | KLD(cLogP) | 0.03 | 0.05 | 0.31 | 0.04 | 0.04 | 0.03
CHEMBL | KLD(cLogS) | 0.04 | 0.04 | 0.05 | 0.04 | 0.04 | 0.04
CHEMBL | KLD(Drug) | 0.01 | 0.01 | 0.02 | 0.02 | 0.02 | 0.01
CHEMBL | KLD(RPSA) | 0.01 | 0.02 | 0.01 | 0.01 | 0.01 | 0.01
CHEMBL | KLD(PSA) | 0.23 | 0.24 | 0.25 | 0.24 | 0.25 | 0.23
CHEMBL | KLD(SA) | 0.07 | 0.08 | 0.09 | 0.08 | 0.08 | 0.08

Note: We abbreviate D-MolVAE by Mol, DIP by D, degree by Deg, clustering coefficient by CC, drug-likeness by Drug and relative PSA by RPSA. The best value per row is in boldface.


Table 7.

Evaluation of disentanglement across all top models on each of the datasets

Dataset | Model | β-M (%)↑ | F-M (%)↑ | DCI↑ | Mod↑
QM9 | CGVAE | 100 | 57.0 | 0.055 | 0.239
QM9 | Mol-V | 100 | 50.0 | 0.019 | 0.233
QM9 | Mol-β | 100 | 56.0 | 0.0466 | 0.223
QM9 | Mol-DI | 100 | 61.2 | 0.023 | 0.261
QM9 | Mol-DII | 100 | 62.0 | 0.0972 | 0.241
QM9 | Mol-VIB | 100 | 72.0 | 0.1282 | 0.243
ZINC | CGVAE | 100 | 48.0 | 0.011 | 0.195
ZINC | Mol-V | 100 | 44.0 | 0.016 | 0.163
ZINC | Mol-β | 100 | 52.0 | 0.016 | 0.151
ZINC | Mol-DI | 100 | 52.4 | 0.010 | 0.197
ZINC | Mol-DII | 100 | 50.0 | 0.019 | 0.188
ZINC | Mol-VIB | 100 | 58.0 | 0.036 | 0.189
MOSES | CGVAE | 100 | 38.0 | 0.059 | 0.184
MOSES | Mol-V | 100 | 44.0 | 0.060 | 0.189
MOSES | Mol-β | 100 | 46.0 | 0.061 | 0.186
MOSES | Mol-DI | 100 | 58.0 | 0.062 | 0.209
MOSES | Mol-DII | 100 | 50.0 | 0.071 | 0.212
MOSES | Mol-VIB | 100 | 54.0 | 0.078 | 0.253
CHEMBL | CGVAE | 82.0 | 61.3 | 0.181 | 0.500
CHEMBL | Mol-V | 80.0 | 62.0 | 0.202 | 0.499
CHEMBL | Mol-β | 82.6 | 62.3 | 0.219 | 0.491
CHEMBL | Mol-DI | 84.0 | 62.0 | 0.209 | 0.481
CHEMBL | Mol-DII | 80.0 | 64.0 | 0.213 | 0.456
CHEMBL | Mol-VIB | 85.3 | 64.6 | 0.183 | 0.504

Note: ↑ indicates that a higher value on a metric is better. Best performances are bolded.


In Table 6, the smaller the value, the more similar the generated set is to the training set on the property under comparison. Table 6 shows that all models reasonably preserve the distributions of properties in the training set. In comparison with CGVAE, our D-MolVAE models preserve these distributions better on the ZINC and MOSES datasets and less well on the QM9 dataset. However, our models consistently perform well on all four datasets. The only dataset where CGVAE performs better than any of our models on about half of the properties (4/9) is QM9. CGVAE also performs comparably on KLD to at least one of our models on the ChEMBL dataset, but it is outperformed on MMD. On both the ZINC and the MOSES datasets, our models outperform CGVAE. In particular, D-MolVAE-VIB performs consistently well across all four datasets. The KLD between the training and the generated datasets is small, and this is further confirmed visually by plotting the distributions of the molecular properties cLogP, cLogS, PSA, rPSA and drug-likeness for each model in Supplementary Figures S2–S7. These results make clear that our D-MolVAE models capture well the distributions of the molecular properties in the training dataset.

Altogether, these results suggest that the proposed models capture the underlying property distribution of the training dataset. Overall, all models balance well between information preservation and novelty in the generated molecules. Among all our D-MolVAE models, D-MolVAE-VIB outperforms all the others along most metrics. Interestingly, even though the disentanglement-enhanced models do not outperform the baselines in terms of capturing the synthesis accessibility (SA) score distribution, they generate novel molecules with higher SA scores, e.g. D-MolVAE-VIB. This observation demonstrates the exploration power of the disentangled models and the better trade-off they allow us to achieve between exploration and exploitation. It is worth noting that one can choose between the disentangled models and the base models according to a preference for exploration or exploitation.

3.3.1 Quantitative evaluation of disentanglement learning

Table 7 relates the evaluation of our models’ disentanglement scores via β-M, F-M, MOD and DCI, which are four popular metrics to evaluate disentanglement. Briefly, β-M (Higgins et al., 2017) measures disentanglement by examining the accuracy of a linear classifier that predicts the index of a fixed factor of variation. F-M (Kim and Mnih, 2018) addresses several issues of β-M by using a majority-vote classifier on a different feature vector, which handles a corner case of β-M. The β-M and F-M metrics are formulated as follows:
(9)
 
(10)
 
(11)
 
(12)
 
(13)
 
(14)
MOD (Ridgeway and Mozer, 2018) measures whether each latent variable depends on at most one factor describing the maximum variation, using their mutual information. We first calculate the mutual information between the latent representations and the values of the factors of variation in a matrix m. Then, we compute a vector t_i for each dimension i of the representation. Finally, we average over the dimensions of the representation with N factors, as follows:
(15)
 
(16)
DCI (Eastwood and Williams, 2018) computes the entropy of the distribution obtained by normalizing the importance of each dimension of the learned representation for predicting the value of a factor of variation. For DCI, we first take the importance weights for each factor by fitting gradient boosted trees and form an importance matrix R. We then compute the relative importance of each dimension ρi and disentanglement score DCI as follows:
(17)
 
(18)

All implementation details are as in Locatello et al. (2018).
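As an illustration of one of these metrics, the following sketch (our own reading of Ridgeway and Mozer (2018), not the evaluation code itself) computes the MOD score from a mutual-information matrix m between latent dimensions and factors:

```python
# A minimal sketch (our reading of Ridgeway and Mozer, 2018) of the MOD score
# from a mutual-information matrix m of shape (num_latent_dims, num_factors).
import numpy as np

def modularity(m, eps=1e-12):
    theta = m.max(axis=1)                              # best factor per latent dim
    template = np.zeros_like(m)                        # the "one factor" template t_i
    template[np.arange(m.shape[0]), m.argmax(axis=1)] = theta
    # Normalized squared deviation of m from the ideal modular template.
    delta = ((m - template) ** 2).sum(axis=1) / (theta ** 2 * (m.shape[1] - 1) + eps)
    return float(np.mean(1.0 - delta))

print(modularity(np.eye(3)))        # a perfectly modular code scores 1.0
print(modularity(np.ones((3, 3))))  # a fully mixed code scores 0.0
```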

Table 7 shows that our models achieve better overall disentanglement scores than CGVAE. Specifically, on the QM9 dataset with smaller molecules, D-MolVAE-DIP-I, D-MolVAE-DIP-II and D-MolVAE-VIB achieve F-M scores of 61.2%, 62.0% and 72.0%, respectively, whereas CGVAE achieves only 57.0%. All models achieve comparable MOD scores, with D-MolVAE-DIP-I achieving the highest. All models achieve a β-M of 100%. D-MolVAE-VIB outperforms all others on the DCI score, and this observation holds on the ZINC and MOSES datasets as well. Interestingly, all models perform worse on the ZINC dataset, which contains larger molecules than the QM9 dataset. Similarly, on the MOSES dataset, all models perform worse than on QM9 but better than on ZINC. Specifically, D-MolVAE-DIP-I and D-MolVAE-VIB rank as the top two on the F-M metric, and D-MolVAE-VIB achieves the best performance on the DCI and Mod metrics, with an up to 16% improvement over the second-best model, D-MolVAE-DIP-II. On the CHEMBL dataset, D-MolVAE-VIB performs the best across the β-M, F-M and MOD metrics. D-MolVAE-DIP-I achieves the second-best β-M (84.0%), while CGVAE achieves only 82.0%. Nevertheless, D-MolVAE-β performs slightly better than D-MolVAE-DIP-II on the DCI metric and achieves the best DCI performance on this dataset. Altogether, these results show that the proposed disentanglement-enhanced models improve a model’s ability at disentanglement learning, with D-MolVAE-VIB standing out.

3.3.2 Relating disentangled factors to molecular properties

In Figure 1, we show how the learned disentangled factors relate to the biological properties computed on each generated molecule. The mutual information is calculated between each of the disentangled factors learned by CGVAE and the D-MolVAE models and the molecular properties computed on generated molecules. We focus the comparison here on the MOSES-trained CGVAE and D-MolVAE-VIB models but show all models on all datasets in the Supplementary Material.

Fig. 1. The mutual information is calculated between each of the disentangled factors and the molecular properties computed on generated molecules.

Figure 1 clearly shows that the factors learned by CGVAE relate only weakly to the molecular properties. The relationship is stronger for the disentangled factors learned by our D-MolVAE models, even though all models are unsupervised. Moreover, thanks to the disentanglement enhancement, different disentangled factors from D-MolVAE-VIB tend to correlate more clearly with different properties than the factors learned by CGVAE.

Figure 2 allows digging deeper into the impact on a property of interest by visualizing the change in the property over generated molecules when a particular latent factor is varied in a range and all others are kept fixed. We focus on one of our top models, D-MolVAE-VIB, and on PSA, which is a crucial consideration when generating molecules, as it directly relates to our ability to actually synthesize them in wet laboratories. We can clearly see that essentially only one factor is strongly related to PSA, thanks to our disentanglement enhancement, which strengthens the independence among different factors and hence minimizes the number of different factors correlated to a property (e.g. PSA). Figure 2 shows that one of the latent factors impacts PSA, and this is more clearly visible on the QM9 and MOSES datasets.

Fig. 2. Change in PSA is tracked as a latent factor is varied in a range while keeping all others fixed. Focus here is on the latent factors learned by D-MolVAE-VIB.
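The traversal behind Figure 2 can be sketched as follows (our own illustration; decode_to_smiles stands in for a trained D-MolVAE decoder and is a hypothetical callable):

```python
# A minimal sketch (ours) of the latent traversal behind Figure 2: vary one
# latent dimension over a range, keep the others fixed, decode, and track the
# PSA of the decoded molecules. decode_to_smiles is a hypothetical stand-in
# for a trained D-MolVAE decoder.
import numpy as np
from rdkit import Chem
from rdkit.Chem import Descriptors

def traverse_factor(decode_to_smiles, z, dim, values):
    psa_trace = []
    for v in values:
        z_mod = z.copy()
        z_mod[dim] = v                         # vary only the chosen factor
        mol = Chem.MolFromSmiles(decode_to_smiles(z_mod))
        psa_trace.append(Descriptors.TPSA(mol) if mol is not None else None)
    return psa_trace

# Example call, assuming `model.decode` maps a 100-dim latent vector to SMILES:
# trace = traverse_factor(model.decode, np.zeros(100), dim=7, values=np.linspace(-3, 3, 13))
```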

4 Conclusion

The evaluation presented in this article suggests that the proposed disentanglement framework D-MolVAE is effective at generating valid, novel and unique small molecules and outperforms several state-of-the-art generative models. This performance is due to the sequence decoding process and, specifically, valence checking and the stop-checking mechanism. Other graph-based generative models that lack this process (for instance, GraphVAE) suffer in this respect and generate invalid molecules. The variational inference in D-MolVAE also allows better capturing the distribution of the input dataset and so sampling novel and unique molecules from the learned distribution.

It is important to note that the loss functions in the models we propose here effectively implement a trade-off between the disentanglement enhancement and the reconstruction. The distributions of specific properties (for instance, synthesis accessibility) show the exploration-exploitation trade-off in the various disentangled models. Our analysis shows that explicit disentanglement enforcement does not hurt the proposed models; indeed, like CGVAE, the proposed models generate novel and unique molecules and even surpass CGVAE on some of the datasets; the disentangled factors provide an advantage. Moreover, the proposed D-MolVAE models better capture the underlying graph statistics and distributions of various biological properties. Our evaluation also reveals that different types of disentangled models have different abilities. In particular, the experiments suggest that D-MolVAE-VIB is a promising model for exploring disentangled representations.

We consider the proposed work to be a first step toward addressing remaining challenges in small molecule generation. Beyond interpreting the generation process, it is important to precisely control the properties of generated molecules. The disentangled representation learning in this article falls under the umbrella of unsupervised learning. Therefore, specific control and correspondence of latent factors to molecular properties of interest is not expected to be strong. Our analysis shows that, in principle, one can build over the models proposed here for such precise control. Ideally, given specific target values for several properties of interest, one could decode the latent variables back into a molecule that achieves the target property values. Our future work will address such models.

We also note that current models, including those proposed and evaluated in this article, are only concerned with global properties of molecules (or their graph representations), such as cLogP, drug-likeness and others. Preserving local properties of an atom or a cluster of atoms (e.g. an aromatic hydrocarbon) has not been explored so far. Doing both can be helpful in designing novel molecules while improving our understanding of the contribution of each element to the overall molecular properties of interest. We caution, however, that supervised representation learning, while useful in many specific applications, may also bias toward a known, target set of molecular properties and miss possibly interesting new discoveries. In our future work, we hope to advance both unsupervised and supervised representation learning in small molecule generation.

Funding

This work was supported in part by the National Science Foundation [grant numbers 1942594, 1755850, 1907805]. This material was additionally based upon work by A.S. supported by (while serving at) the National Science Foundation. Any opinion, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Conflict of Interest: none declared.

References

Alemi A.A. et al. (2017) Deep variational information bottleneck. In: 5th International Conference on Learning Representations, ICLR, Toulon, France.
Blaschke T. et al. (2018) Application of generative autoencoder in de novo molecular design. Mol. Inf., 37, 1700123.
Bojchevski A. et al. (2018) NetGAN: generating graphs via random walks. In: International Conference on Machine Learning, Stockholm, Sweden, pp. 609–618.
Chen T.Q. et al. (2018) Isolating sources of disentanglement in variational autoencoders. In: Advances in Neural Information Processing Systems, Montréal, Canada, pp. 2610–2620.
Dai H. et al. (2018) Syntax-directed variational autoencoder for structured data. In: International Conference on Learning Representations, Vancouver, Canada.
De Samanta B.A. et al. (2018) Designing random graph models using variational autoencoders with applications to chemical design. arXiv preprint arXiv:1802.05283.
Doshi-Velez F., Kim B. (2017) Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.
Du Y. et al. (2020) Interpretable molecule generation via disentanglement learning. In: 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics Workshops: Computational Structural Biology Workshop (CSBW), Baltimore-Washington, DC Area, USA, pp. 1–8.
Du Y. et al. (2021a) Deep latent-variable models for controllable molecule generation. In: 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Virtual, pp. 372–375. IEEE.
Du Y. et al. (2021b) GraphGT: machine learning datasets for graph generation and transformation. In: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), Virtual.
Eastwood C., Williams C.K. (2018) A framework for the quantitative evaluation of disentangled representations. In: 6th International Conference on Learning Representations, ICLR, Vancouver, Canada.
Ellman J.A. (1996) Design, synthesis, and evaluation of small-molecule libraries. Acc. Chem. Res., 29, 132–143.
Esmaeili B. et al. (2019) Structured disentangled representations. Proc. Mach. Learn. Res., 89, 2525–2534.
Gaulton A. et al. (2017) The ChEMBL database in 2017. Nucleic Acids Res., 45, D945–D954.
Gómez-Bombarelli R. et al. (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci., 4, 268–276.
Grover A. et al. (2019) Graphite: iterative generative modeling of graphs. In: Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, Vol. 97, pp. 2434–2444.
Guimaraes G.L. et al. (2017) Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models. arXiv preprint arXiv:1705.10843.
Guo X. et al. (2018) Deep graph translation. arXiv preprint arXiv:1805.09980.
Guo X. et al. (2020) Property controllable variational autoencoder via invertible mutual dependence. In: International Conference on Learning Representations, Virtual.
Guo X. et al. (2021) Deep generative model for spatial networks. In: 27th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Singapore.
Higgins I. et al. (2017) beta-VAE: learning basic visual concepts with a constrained variational framework. In: 5th International Conference on Learning Representations, ICLR, Toulon, France.
Honda S. et al. (2019) Graph residual flow for molecular graph generation. arXiv preprint arXiv:1909.13521.
Irwin J.J. et al. (2012) ZINC: a free tool to discover chemistry for biology. J. Chem. Inf. Model., 52, 1757–1768.
Janz D. et al. (2017) Actively learning what makes a discrete sequence valid. arXiv preprint arXiv:1708.04465.
Jin H. et al. (2018) Discriminative graph autoencoder. In: International Conference on Big Knowledge (ICBK), Singapore. IEEE.
Kim H., Mnih A. (2018) Disentangling by factorising. arXiv preprint arXiv:1802.05983.
Kingma D.P., Welling M. (2013) Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
Kipf T.N., Welling M. (2016) Variational graph auto-encoders. arXiv preprint arXiv:1611.07308.
Kumar A. et al. (2018) Variational inference of disentangled latent concepts from unlabeled observations. In: 6th International Conference on Learning Representations, ICLR, Vancouver, Canada.
Kusner M.J. et al. (2017) Grammar variational autoencoder. In: Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, Vol. 70, pp. 1945–1954.
Li Y. et al. (2018) Learning deep generative models of graphs. arXiv preprint arXiv:1803.03324.
Liu Q. et al. (2018) Constrained graph variational autoencoders for molecule design. In: Bengio S., Wallach H., Larochelle H., Grauman K., Cesa-Bianchi N., Garnett R. (eds.) Advances in Neural Information Processing Systems, Vol. 31, pp. 7795–7804. Curran Associates, Inc., Red Hook, NY.
Locatello F. et al. (2018) Challenging common assumptions in the unsupervised learning of disentangled representations. arXiv preprint arXiv:1811.12359.
Lopez R. et al. (2018) Information constraints on auto-encoding variational Bayes. In: Thirty-second Conference on Neural Information Processing Systems, Montréal, Canada.
Madhawa K. et al. (2019) GraphNVP: an invertible flow model for generating molecular graphs. arXiv preprint arXiv:1905.11600.
Polykovskiy D. et al. (2020) Molecular sets (MOSES): a benchmarking platform for molecular generation models. Front. Pharmacol., 11.
Ramakrishnan R. et al. (2014) Quantum chemistry structures and properties of 134 kilo molecules. Sci. Data, 1, 140022.
Renz P. et al. (2020) On failure modes in molecule generation and optimization. Drug Discov. Today Technol., 32–33, 55–63.
Reymond J. et al. (2012) The enumeration of chemical space. WIREs Comput. Mol. Sci., 2, 717–733.
Ridgeway K., Mozer M.C. (2018) Learning deep disentangled embeddings with the F-statistic loss. In: Advances in Neural Information Processing Systems, Montréal, Canada, pp. 185–194.
Ruddigkeit L. et al. (2012) Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17. J. Chem. Inf. Model., 52, 2864–2875.
Schneider P., Schneider G. (2016) De novo design at the edge of chaos. J. Med. Chem., 59, 4077–4086.
Segler M.H. et al. (2018) Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci., 4, 120–131.
Shi C. et al. (2019) GraphAF: a flow-based autoregressive model for molecular graph generation. In: International Conference on Learning Representations, New Orleans, LA, USA.
Simonovsky M., Komodakis N. (2018) GraphVAE: towards generation of small graphs using variational autoencoders. In: International Conference on Artificial Neural Networks, Rhodes, Greece, pp. 412–422. Springer.
Stumpfe D., Bajorath B. (2011) Similarity searching. WIREs Comput. Mol. Sci., 1, 260–282.
Sundermeyer M. et al. (2012) LSTM neural networks for language modeling. In: Thirteenth Annual Conference of the International Speech Communication Association, Portland, OR, USA.
Weininger D. (1988) SMILES, a chemical language and information system. J. Chem. Inf. Model., 28, 31–36.
Whitesides G.M. (2015) Reinventing chemistry. Angew. Chem. Int. Ed. Engl., 54, 3196–3209.
Xue D. et al. (2019) Advances and challenges in deep generative models for de novo molecule generation. Wiley Interdiscip. Rev. Comput. Mol. Sci., 9, e1395.
Yoshikawa N. et al. (2018) Population-based de novo molecule generation, using grammatical evolution. Chem. Lett., 47, 1431–1434.
You J. et al. (2018) GraphRNN: generating realistic graphs with deep auto-regressive models. arXiv preprint arXiv:1802.08773.
Zhao S. et al. (2019) InfoVAE: information maximizing variational autoencoders. In: The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), Hawaii, USA.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Associate Editor: Jinbo Xu