Abstract
Targeted drugs have been widely applied to the treatment of cancer, and some patients obtain clear therapeutic benefit. However, detecting drug–target interactions (DTIs) through biochemical experiments is time-consuming. At present, machine learning (ML) is widely applied to large-scale drug screening, but few methods fuse multiple sources of information. We propose a multiple kernel-based triple collaborative matrix factorization (MK-TCMF) method to predict DTIs. Multiple kernel matrices (containing chemical, biological and clinical information) are integrated via a multi-kernel learning (MKL) algorithm, and the original adjacency matrix of DTIs is decomposed into three matrices: the latent feature matrix of the drug space, the latent feature matrix of the target space and a bi-projection matrix that joins the two feature spaces. To obtain better prediction performance, the MKL algorithm regulates the weight of each kernel matrix according to the prediction error; the weights of the drug side-effect and target sequence kernels turn out to be the highest. Compared with other computational methods, our model has better performance on four test data sets.
1 Introduction
A target is the site on a biomolecule to which a drug binds. Targets include receptors, enzymes, ion channels, transporters, the immune system, genes, etc. Among existing drugs, more than 50% act on receptors, which have become the most important class of targets; more than 20% act on enzymes, especially as enzyme inhibitors. About 6% of drugs use ion channels as their targets, 3% use nucleic acids, and the targets of roughly 20% of drugs remain to be further studied [1–4].
In the past 10 years, machine learning (ML) methods have been applied to many biological problems [5–8]. Drug–target interactions (DTIs) can also be predicted by ML [9–13]. Kernel regression, support vector machines (SVM) and neural networks have been widely used in the prediction of DTIs. The network-based Laplacian regularized least squares (NetLapRLS) [14], Kronecker regularized least squares (KronRLS) [15–18] and weighted nearest neighbor-based Gaussian interaction profile (WNN-GIP) [19] were all regression models. Among them, Nascimento [17] proposed a multi-kernel-based KronRLS, which employed multi-kernel learning (MKL) algorithms to efficiently fuse multiple features. An SVM-based bipartite local model (BLM) [20] was also built to predict DTIs. On this basis, a variety of extended versions have been derived, such as a BLM based on a fuzzy model [21] and BLM with neighbor interaction-profile inferring (BLM-NII) [22]. In addition, deep learning has appeared in DTIs prediction. The convolutional neural network (CNN) [23, 24] and graph convolutional network (GCN) [25, 26] were common deep learning models that obtained good DTI prediction performance.
The matrix factorization (MF) method is currently mainly used in the field of recommendation systems, where it implements latent factor models. Matrix decomposition alleviates the sparsity of the incidence matrix caused by the huge numbers of users and items. In the field of DTIs, MF [27–33] used two latent feature matrices to approximate the association matrix (DTIs). Among them, the heterogeneous network-based DTIs predictor (DTINet) [33] and variational Bayesian multiple kernel logistic matrix factorization (VB-MK-LMF) [32] utilized multiple information fusion methods to improve the prediction performance of the model. A graph regularized generalized matrix factorization (GRGMF) [31] was designed for link prediction in biomedical bipartite networks; its purpose was to find the latent patterns of known links.
Based on previous MF [27–34], we develop an MF model called multiple kernel-based triple collaborative matrix factorization (MK-TCMF). Different from other MF, we decompose the original adjacency matrix into three matrices, including the feature matrix of the drug space, the bi-projection matrix (used to join the two spaces) and the feature matrix of the target space. In the process of solving the model, multiple drug kernel matrices and target kernel matrices are all linearly weighted and fused by the MKL algorithm.
The contributions of our work are as follows: (1) Different from the previous MF [27, 30], we decompose the original adjacency matrix into the bi-projection matrix and the feature matrices of the drug and target spaces; (2) The weights of the multiple drug and target kernel matrices are optimized jointly with the MF algorithm; (3) To solve the parameters of the MK-TCMF model, an efficient iterative optimization algorithm is proposed; (4) Our method achieves better results on most data sets.
Our study is organized as follows: In the Materials and methods section, we propose an MK-TCMF model to predict DTIs. In the Results section, we test MK-TCMF model on benchmark data sets. In the Discussion section, we discuss the performance of the method and the experimental results. In the Conclusion section, the future work is given.
2 Materials and methods
2.1 Problem description
The network of DTIs can be considered as a bipartite network (in Figure 1). Our goal is to use the known DTIs (links) to estimate new associations between drugs and targets. The DTIs network has |$n$| drugs and |$m$| targets in two sets (|$D = \left \{d_{1},d_{2},...,d_{n}\right \}$| and |$T = \left \{t_{1},t_{2},...,t_{m} \right \}$|), respectively. In our study, the similarity between drugs (or between targets) can be described as a kernel matrix. These kernel matrices reflect the topology of drug–drug (|$n \times n$|) and target–target (|$m \times m$|) relations, and the values of the kernel matrix elements are between |$0$| and |$1$|. The known links (associations) between the drug and target sets can be represented as an adjacency matrix |$\mathbf{Y}_{train} \in \mathbf{R}^{n \times m}$|. If |$Y_{train}(i,j)=1$|, drug |$d_{i}$| and target |$t_{j}$| interact directly; otherwise |$Y_{train}(i,j)=0$|. The goal of our method is to calculate a new matrix |$\mathbf{Y}^{*}\in \mathbf{R}^{n \times m}$| and make it approximately equal to |$\mathbf{Y}_{train}$|. If some new non-zero values appear in |$\mathbf{Y}^{*}$| (e.g. |$Y_{i,j}^{*}>0$|), drug |$d_{i}$| and target |$t_{j}$| may interact in the DTIs network. In Figure 1, the black solid lines represent the links between drug and target nodes.
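To make the setup concrete, the Gaussian interaction profile (GIP) kernel used later in this work can be computed directly from the adjacency matrix. The following is a minimal numpy sketch; the bandwidth convention (inverse mean squared row norm) follows [15, 19], and the tiny `Y_train` here is a hypothetical toy example, not one of the benchmark networks:

```python
import numpy as np

def gip_kernel(Y):
    """Gaussian interaction profile kernel over the rows of Y.

    K[i, j] = exp(-gamma * ||y_i - y_j||^2), where gamma is the inverse
    of the mean squared row norm (the convention of [15, 19]).
    """
    sq_norms = (Y ** 2).sum(axis=1)
    gamma = 1.0 / max(sq_norms.mean(), 1e-12)
    # squared Euclidean distances between all pairs of rows
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * Y @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

# hypothetical 4-drug x 3-target adjacency matrix
Y_train = np.array([[1, 0, 0],
                    [1, 0, 0],
                    [0, 1, 0],
                    [0, 0, 1]], dtype=float)
K_d = gip_kernel(Y_train)        # 4 x 4 drug kernel
K_t = gip_kernel(Y_train.T)      # 3 x 3 target kernel
```

As required of the kernels above, all entries lie between 0 and 1, the diagonal is 1, and drugs with identical interaction profiles (rows 0 and 1) get similarity 1.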
2.2 Related work
CMF [27] was an MF-based method that minimized the objective function
|$\min _{\mathbf{A},\mathbf{B}} \left \| \mathbf{W}\odot \left (\mathbf{Y}_{train}-\mathbf{A}\mathbf{B}^{T}\right ) \right \|_{F}^{2}+\textrm{regularization terms},$|
where |$\mathbf{W} \in \mathbf{R}^{n \times m}$| was a weight matrix, and |$\mathbf{A} \in \mathbf{R}^{n \times k}$| and |$\mathbf{B} \in \mathbf{R}^{m \times k}$| were two latent feature matrices. The predicted matrix for the DTIs network was obtained by multiplying |$\mathbf{A}$| and |$\mathbf{B}^{T}$|.
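The weighted low-rank factorization at the heart of CMF can be illustrated with plain gradient descent. This is an illustrative re-implementation, not the authors' code; the rank `k`, regularization strength `lam` and learning rate `lr` are hypothetical choices for the toy example:

```python
import numpy as np

def weighted_mf(Y, W, k=2, lam=0.1, lr=0.05, iters=500, seed=0):
    """Minimize ||W * (Y - A B^T)||_F^2 + lam * (||A||_F^2 + ||B||_F^2)."""
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    A = 0.1 * rng.standard_normal((n, k))
    B = 0.1 * rng.standard_normal((m, k))
    for _ in range(iters):
        R = W * (A @ B.T - Y)            # weighted residual
        # simultaneous gradient steps for both latent factor matrices
        A, B = (A - lr * (R @ B + lam * A),
                B - lr * (R.T @ A + lam * B))
    return A, B

Y = np.array([[1., 0., 1.],
              [0., 1., 0.]])             # toy interaction matrix
W = np.ones_like(Y)                      # observe every entry
A, B = weighted_mf(Y, W)
Y_hat = A @ B.T                          # predicted interaction scores
```

Because of the L2 regularization, `Y_hat` approaches `Y` with the singular values slightly shrunk rather than reproducing it exactly.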
NRLMF [29] estimated the probability of DTIs by logistic matrix factorization. The objective function of NRLMF was formulated as
|$\min _{\mathbf{U},\mathbf{V}} \sum _{i,j}\left (1+cY_{ij}-Y_{ij}\right )\ln \left (1+e^{\mathbf{u}_{i}\mathbf{v}_{j}^{T}}\right )-cY_{ij}\mathbf{u}_{i}\mathbf{v}_{j}^{T} +\frac{1}{2}\textrm{tr}\left [\mathbf{U}^{T}\left (\lambda _{d}\mathbf{I}_{d}+\alpha \mathbf{L}_{d}\right )\mathbf{U}\right ] +\frac{1}{2}\textrm{tr}\left [\mathbf{V}^{T}\left (\lambda _{t}\mathbf{I}_{t}+\beta \mathbf{L}_{t}\right )\mathbf{V}\right ],$|
where |$k$| denoted the dimension of the low-rank matrices; |$\mathbf{I}_{d} \in \mathbf{R}^{n \times n}$| and |$\mathbf{I}_{t} \in \mathbf{R}^{m \times m}$| denoted identity matrices; |$\mathbf{U} \in \mathbf{R}^{n \times k}$| and |$\mathbf{V} \in \mathbf{R}^{m \times k}$| denoted two low-rank matrices; |$\mathbf{L}_{d}$| and |$\mathbf{L}_{t}$| denoted the neighborhood graph Laplacians; and |$\beta $|, |$\alpha $| and |$c$| denoted constant parameters. The prediction of probability |$\mathbf{P}^{*}\in \mathbf{R}^{n \times m}$| was calculated by the logistic function |$P_{ij}^{*}=e^{\mathbf{u}_{i}\mathbf{v}_{j}^{T}}/\left (1+e^{\mathbf{u}_{i}\mathbf{v}_{j}^{T}}\right )$|.
The GRMF [30] also decomposed |$\mathbf{Y}_{train}$| into |$\mathbf{A} \in \mathbf{R}^{n \times k}$| and |$\mathbf{B} \in \mathbf{R}^{m \times k}$| by minimizing
|$\min _{\mathbf{A},\mathbf{B}} \left \| \mathbf{Y}_{train}-\mathbf{A}\mathbf{B}^{T} \right \|_{F}^{2} +\lambda _{l}\left (\left \| \mathbf{A} \right \|_{F}^{2}+\left \| \mathbf{B} \right \|_{F}^{2}\right ) +\lambda _{d}\textrm{tr}\left (\mathbf{A}^{T}\mathbf{L}_{d}\mathbf{A}\right ) +\lambda _{t}\textrm{tr}\left (\mathbf{B}^{T}\mathbf{L}_{t}\mathbf{B}\right ),$|
where |$\mathbf{L}_{d}$| and |$\mathbf{L}_{t}$| were the graph Laplacians of the drug and target similarity matrices, and |$\lambda _{l}$|, |$\lambda _{d}$| and |$\lambda _{t}$| were positive parameters of the regularization terms. The prediction |$\mathbf{F}^{*}$| was calculated by |$\mathbf{F}^{*}=\mathbf{A}\mathbf{B}^{T}$|.
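The graph-regularization terms above penalize latent-feature differences between similar drugs (or targets). A small numpy sketch, with a hypothetical toy similarity matrix, shows that |$\textrm{tr}(\mathbf{A}^{T}\mathbf{L}_{d}\mathbf{A})$| equals the familiar pairwise weighted sum of squared feature differences:

```python
import numpy as np

def graph_reg(A, S):
    """Graph regularizer tr(A^T L A) with Laplacian L = D - S.

    Equals 0.5 * sum_ij S[i, j] * ||a_i - a_j||^2, so it is small when
    items that are similar (large S[i, j]) have close latent vectors.
    """
    L = np.diag(S.sum(axis=1)) - S
    return np.trace(A.T @ L @ A)

S = np.array([[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.2],
              [0.1, 0.2, 0.0]])          # toy drug-drug similarities
A = np.array([[1.0, 0.0],
              [1.0, 0.1],
              [0.0, 1.0]])               # toy latent features

reg = graph_reg(A, S)
# identical to the explicit pairwise form
pairwise = 0.5 * sum(S[i, j] * np.sum((A[i] - A[j]) ** 2)
                     for i in range(3) for j in range(3))
```

Drugs 0 and 1 are very similar (0.9) and have nearly identical latent vectors, so they contribute little to the penalty; most of it comes from the dissimilar pairs.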
2.3 Benchmark data set
Four real DTIs networks are used to test MK-TCMF: Enzymes (Es), G Protein-Coupled Receptors (GPCRs), Ion Channels (IC) and Nuclear Receptors (NRs) [20], drawn from the DrugBank [4], BRENDA [2], SuperTarget [1] and KEGG BRITE [3] databases. The information of the four DTIs networks is listed in Table 1.
Table 1 The information of four DTIs networks.

| Data set | Interaction | Target | Drug | Sparsity |
| --- | --- | --- | --- | --- |
| NRs | 90 | 26 | 54 | 93.59% |
| GPCRs | 635 | 95 | 223 | 97.00% |
| IC | 1476 | 204 | 210 | 96.55% |
| Es | 2926 | 664 | 445 | 99.01% |
2.4 Drug and target kernels
Several kinds of DTIs information are introduced into our model. In the target space, the protein sequence [20, 35], Protein–Protein Interactions (PPIs) network [36], functional annotation (with Gene Ontology) [37, 38] and known target interaction profile [15, 19] are utilized to build target kernel matrices. In the drug space, side-effects [17, 39], chemical structure (with SIMCOMP) [40], drug substructure (with fingerprints) [41] and the known drug interaction profile [15, 19] are employed to build drug kernel matrices. The details of the kernels are listed in Table 2.
Table 2 The kernels in two feature spaces.

| Feature space | Kernel | Description |
| --- | --- | --- |
| Drug | |$\mathbf{K}_{GIP,d}$| | Gaussian interaction profile for drugs [15, 19] |
| Drug | |$\mathbf{K}_{SE,d}$| | network of drug–side-effect associations [17, 39] |
| Drug | |$\mathbf{K}_{MACCS,d}$| | drug substructure fingerprint [41] |
| Drug | |$\mathbf{K}_{SIMCOMP,d}$| | chemical structure [35, 40] |
| Target | |$\mathbf{K}_{GIP,t}$| | Gaussian interaction profile for targets [15, 19] |
| Target | |$\mathbf{K}_{PPI,t}$| | PPIs network of targets [17, 36] |
| Target | |$\mathbf{K}_{GO,t}$| | functional information of targets [17, 37] |
| Target | |$\mathbf{K}_{SW,t}$| | sequence information of targets [35] |
2.5 The proposed model of multiple kernel-based triple collaborative matrix factorization
The multiple drug (|$k_{d}$|) and target (|$k_{t}$|) kernels can be expressed as |$\left \{\mathbf{K}_{d}^{1}, \mathbf{K}_{d}^{2},...,\mathbf{K}_{d}^{k_{d}}\right \}$| and |$\left \{\mathbf{K}_{t}^{1}, \mathbf{K}_{t}^{2},...,\mathbf{K}_{t}^{k_{t}}\right \}$|. They are fused into the integrated kernels
|$\mathbf{K}_{d}^{*}=\sum _{i=1}^{k_{d}}\beta _{d}^{i}\mathbf{K}_{d}^{i}, \quad \mathbf{K}_{t}^{*}=\sum _{i=1}^{k_{t}}\beta _{t}^{i}\mathbf{K}_{t}^{i},$|
where |$\boldsymbol{\beta }_{d}=\left \{\beta _{d}^{1},\beta _{d}^{2},...,\beta _{d}^{k_{d}} \right \}$| and |$\boldsymbol{\beta }_{t}=\left \{\beta _{t}^{1},\beta _{t}^{2},...,\beta _{t}^{k_{t}} \right \}$| are the kernel weights for the drug and target spaces, respectively. The MKL algorithm computes these weights, so |$\mathbf{K}_{d}^{*}$| and |$\mathbf{K}_{t}^{*}$| are optimally integrated kernels.
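The linear kernel combination can be sketched in a few lines of numpy. The base kernels and weights below are hypothetical placeholders; in MK-TCMF the weights are learned, not fixed by hand:

```python
import numpy as np

def combine_kernels(kernels, beta):
    """Linearly combine base kernels: K* = sum_i beta[i] * K_i."""
    beta = np.asarray(beta, dtype=float)
    return np.tensordot(beta, np.stack(kernels), axes=1)

# two toy 3x3 drug kernels (hypothetical values)
K1 = np.eye(3)
K2 = np.full((3, 3), 0.5)
beta_d = [0.8, 0.2]                  # hypothetical learned weights
K_star = combine_kernels([K1, K2], beta_d)
```

Each entry of `K_star` is the weighted sum of the corresponding entries of the base kernels, so a kernel with a larger weight dominates the integrated similarity.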
Inspired by MF [27, 29, 30], the integrated kernels can be approximated by low-rank representations of the drug (or target) features:
|$\mathbf{K}_{d}^{*}\approx \mathbf{A}\mathbf{A}^{T}, \quad \mathbf{K}_{t}^{*}\approx \mathbf{B}\mathbf{B}^{T},$|
where |$\mathbf{A}$| and |$\mathbf{B}$| denote the matrices of low-rank approximation, and |$r_{d}$| and |$r_{t}$| are the dimensions of the latent feature spaces for drugs and targets. The objective function is defined as follows:
|$J=\left \| \mathbf{Y}_{train}-\mathbf{A}\boldsymbol{\Theta }\mathbf{B}^{T} \right \|_{F}^{2} +\lambda _{\Theta }\left \| \boldsymbol{\Theta } \right \|_{F}^{2} +\lambda _{l}\left (\left \| \mathbf{A} \right \|_{F}^{2}+\left \| \mathbf{B} \right \|_{F}^{2}\right ) +\lambda _{d}\left \| \mathbf{K}_{d}^{*}-\mathbf{A}\mathbf{A}^{T} \right \|_{F}^{2} +\lambda _{t}\left \| \mathbf{K}_{t}^{*}-\mathbf{B}\mathbf{B}^{T} \right \|_{F}^{2} +\lambda _{\beta }\left (\left \| \boldsymbol{\beta }_{d} \right \|^{2}+\left \| \boldsymbol{\beta }_{t} \right \|^{2}\right ),$|
where |$\boldsymbol{\Theta } \in \mathbf{R}^{r_{d} \times r_{t}}$| is the bi-projection matrix, and |$\lambda _{\Theta }$|, |$\lambda _{l}$|, |$\lambda _{d}$|, |$\lambda _{t}$| and |$\lambda _{\beta }$| are the regularization coefficients of the five regularization terms. The range of these coefficients is 0–1 with a 0.1 step. In addition to using the |$\mathbf{A}$|, |$\mathbf{B}$| and |$\boldsymbol{\Theta }$| matrices to approximate |$\mathbf{Y}_{train}$|, we also use |$\mathbf{A}$| and |$\mathbf{B}$| to approximate the kernel matrices |$\mathbf{K}_{d}^{*}$| and |$\mathbf{K}_{t}^{*}$|, respectively.
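Assembling the terms described above, the objective value can be evaluated directly. The sketch below reflects our reading of the five regularization terms named in the text; the tiny matrices are hypothetical and chosen so that every error term vanishes:

```python
import numpy as np

def objective(Y, A, B, Theta, Kd, Kt, beta_d, beta_t,
              lam_T=1.0, lam_l=1.0, lam_d=1.0, lam_t=1.0, lam_b=1.0):
    """Evaluate the DTI fit term plus the five regularization terms."""
    fro = lambda M: float(np.sum(np.asarray(M) ** 2))  # squared Frobenius norm
    return (fro(Y - A @ Theta @ B.T)             # DTI reconstruction error
            + lam_T * fro(Theta)                 # bi-projection regularizer
            + lam_l * (fro(A) + fro(B))          # latent-feature regularizers
            + lam_d * fro(Kd - A @ A.T)          # drug-kernel approximation
            + lam_t * fro(Kt - B @ B.T)          # target-kernel approximation
            + lam_b * (fro(beta_d) + fro(beta_t)))  # kernel-weight regularizer

# tiny hypothetical instance where every approximation is exact
A = np.array([[1.0], [0.0]])
B = np.array([[1.0], [0.0]])
Theta = np.array([[1.0]])
Y = A @ Theta @ B.T
Kd = A @ A.T
Kt = B @ B.T
J = objective(Y, A, B, Theta, Kd, Kt, np.array([1.0]), np.array([1.0]))
```

With perfect reconstructions, only the pure regularizers remain, which makes it easy to sanity-check each term in isolation.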
2.6 Optimization
To optimize Eq. 8, we use the alternating least squares algorithm (ALSA). First, we fix |$\mathbf{A}$|, |$\mathbf{B}$|, |$\boldsymbol{\beta }_{d}$| and |$\boldsymbol{\beta }_{t}$| and solve for |$\boldsymbol{\Theta }$| by setting |$\partial J/\partial \boldsymbol{\Theta } = 0$|; the resulting Eq. 9e is a Sylvester equation in |$\boldsymbol{\Theta }$|. Then |$\boldsymbol{\Theta }$|, |$\mathbf{B}$|, |$\boldsymbol{\beta }_{d}$| and |$\boldsymbol{\beta }_{t}$| are fixed and |$\mathbf{A}$| is updated by setting |$\partial J/\partial \mathbf{A} = 0$|. Next, |$\mathbf{B}$| is updated by setting |$\partial J/\partial \mathbf{B} = 0$|. In the closed-form updates, |$\mathbf{I}_{r_{d}} \in \mathbf{R}^{r_{d} \times r_{d}}$| and |$\mathbf{I}_{r_{t}} \in \mathbf{R}^{r_{t} \times r_{t}}$| are identity matrices.
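In our reading, the stationarity condition for |$\boldsymbol{\Theta }$| (with |$\mathbf{A}$| and |$\mathbf{B}$| fixed) takes the form |$\mathbf{A}^{T}\mathbf{A}\boldsymbol{\Theta }\mathbf{B}^{T}\mathbf{B}+\lambda _{\Theta }\boldsymbol{\Theta }=\mathbf{A}^{T}\mathbf{Y}_{train}\mathbf{B}$|, which is linear in |$\boldsymbol{\Theta }$|. A numpy sketch solving it by Kronecker vectorization (our illustrative choice, practical only for small |$r_{d}$|, |$r_{t}$|; the random matrices are hypothetical):

```python
import numpy as np

def solve_theta(Y, A, B, lam=1.0):
    """Solve A^T A Theta B^T B + lam * Theta = A^T Y B for Theta.

    Vectorizing with vec(P Theta Q) = (Q^T kron P) vec(Theta) turns the
    matrix equation into an ordinary linear system.
    """
    P, Q, R = A.T @ A, B.T @ B, A.T @ Y @ B
    rd, rt = P.shape[0], Q.shape[0]
    M = np.kron(Q.T, P) + lam * np.eye(rd * rt)
    vec_theta = np.linalg.solve(M, R.reshape(-1, order="F"))  # column-major vec
    return vec_theta.reshape(rd, rt, order="F")

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2))      # hypothetical latent drug features
B = rng.standard_normal((4, 3))      # hypothetical latent target features
Y = rng.standard_normal((5, 4))
Theta = solve_theta(Y, A, B, lam=0.5)
```

Since |$\mathbf{A}^{T}\mathbf{A}$| and |$\mathbf{B}^{T}\mathbf{B}$| are positive semi-definite and |$\lambda _{\Theta }>0$|, the linear system is always invertible.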
The weights |$\boldsymbol{\beta }_{d}$| and |$\boldsymbol{\beta }_{t}$| are then calculated from the linear systems built with |$\mathbf{G}_{d} \in \mathbf{R}^{k_{d} \times k_{d}}$|, |$\mathbf{G}_{t} \in \mathbf{R}^{k_{t} \times k_{t}}$|, |$\mathbf{z}_{d} \in \mathbf{R}^{k_{d} \times 1}$| and |$\mathbf{z}_{t} \in \mathbf{R}^{k_{t} \times 1}$|, where |$\mathbf{I}_{d} \in \mathbf{R}^{k_{d} \times k_{d}}$| and |$\mathbf{I}_{t} \in \mathbf{R}^{k_{t} \times k_{t}}$| are also identity matrices, and |$\mathbf{l}_{d}=(1,...,1)^{T} \in \mathbf{R}^{k_{d} \times 1}$| and |$\mathbf{l}_{t}=(1,...,1)^{T} \in \mathbf{R}^{k_{t} \times 1}$| are all-ones vectors.
The final predictive scores are constructed as |$\mathbf{Y}^{*}=\mathbf{A}\boldsymbol{\Theta }\mathbf{B}^{T} \in \mathbf{R}^{n \times m}$|: the higher the score |$\mathbf{Y}^{*}(i,j)$| for drug |$i$| and target |$j$|, the higher the probability of interaction between them. In our study, |$\mathbf{A}$| and |$\mathbf{B}$| are initialized from truncated singular value decompositions |$\mathbf{K}_{d}^{*}\approx \mathbf{U}_{d}\mathbf{S}_{d,r_{d}}\mathbf{V}_{d}^{T}$| and |$\mathbf{K}_{t}^{*}\approx \mathbf{U}_{t}\mathbf{S}_{t,r_{t}}\mathbf{V}_{t}^{T}$|, with |$\mathbf{U}_{d} \in \mathbf{R}^{n \times r_{d}}, \ \mathbf{S}_{d,r_{d}} \in \mathbf{R}^{r_{d} \times r_{d}}, \ \mathbf{V}_{d} \in \mathbf{R}^{n \times r_{d}}$| and |$\mathbf{U}_{t} \in \mathbf{R}^{m \times r_{t}}, \ \mathbf{S}_{t,r_{t}} \in \mathbf{R}^{r_{t} \times r_{t}}, \ \mathbf{V}_{t} \in \mathbf{R}^{m \times r_{t}}$|, respectively. |$\mathbf{S}_{d,r_{d}}$| and |$\mathbf{S}_{t,r_{t}}$| are two diagonal matrices containing the |$r_{d}$| and |$r_{t}$| largest singular values, and we set |$\mathbf{A}=\mathbf{U}_{d}\mathbf{S}_{d,r_{d}}^{1/2}$| and |$\mathbf{B}=\mathbf{U}_{t}\mathbf{S}_{t,r_{t}}^{1/2}$|.
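The SVD-based initialization can be sketched as follows. This reflects our reading (seeding from a truncated SVD with the singular values split as |$\mathbf{S}^{1/2}$|, as in GRMF [30]); the combined kernel below is a hypothetical toy matrix:

```python
import numpy as np

def init_latent(K, r):
    """Seed a latent-feature matrix from kernel K: A = U_r * S_r^{1/2}."""
    U, s, _ = np.linalg.svd(K)
    return U[:, :r] * np.sqrt(s[:r])   # scale each column by sqrt(singular value)

K_d_star = np.array([[1.0, 0.8, 0.1],
                     [0.8, 1.0, 0.2],
                     [0.1, 0.2, 1.0]])  # hypothetical combined drug kernel
A0 = init_latent(K_d_star, r=2)
```

For a symmetric positive definite kernel, `A0 @ A0.T` is exactly the best rank-`r` approximation of the kernel, which is why this is a sensible warm start for the alternating updates.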
The process of MK-TCMF is shown in Figure 2 and Algorithm 1.
3 Results
3.1 Cross-validation test set
In our study, 10-fold cross-validation (CV) is utilized to test our model. The evaluation metric of predictive performance is the Area Under the Precision–Recall curve (AUPR). To fully test the predictive performance of the model, the following three types of cross-validation test sets (CVS) [29] are used: CVS1: identifying potential DTIs in the known network, where targets and drugs both appear in the training and testing sets; CVS2: identifying DTIs for novel drugs, where the new drugs are not present in the training set; CVS3: identifying DTIs for novel targets, where the new targets are not present in the training set.
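The three settings differ only in how folds are drawn. For example, CVS2 holds out whole drug rows. A minimal sketch of drug-wise fold masking (the fold count is reduced to 2 and the matrix is a hypothetical toy; the real experiments use 10 folds):

```python
import numpy as np

def drug_folds(n_drugs, n_folds, seed=0):
    """Split drug indices into disjoint folds for the new-drug setting (CVS2)."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_drugs), n_folds)

Y = np.arange(12, dtype=float).reshape(4, 3)  # hypothetical 4-drug, 3-target DTI matrix
folds = drug_folds(4, 2)
test_drugs = folds[0]
Y_train = Y.copy()
Y_train[test_drugs, :] = 0   # hide every interaction of the held-out drugs
```

CVS3 is the same construction applied to target columns, while CVS1 masks individual drug–target pairs rather than whole rows or columns.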
3.2 The parameters of models
We obtain the best model parameters via grid search; the parameters are listed in Table 3. The range of |$\lambda _{\Theta }$|, |$\lambda _{l}$|, |$\lambda _{d}$|, |$\lambda _{t}$| and |$\lambda _{\beta }$| is from |$0$| to |$1$| with a step of |$0.1$|. |$r_{d}$| and |$r_{t}$| are smaller than the sizes of the corresponding kernel matrices.
In addition, the results over iterations are shown in Figure 3. After several rounds of iterative optimization, the model tends to be stable. Finally, we chose different maximum numbers of iterations for different data sets: |$5$|, |$2$|, |$2$| and |$5$| for Es, IC, GPCRs and NRs, respectively.
Table 3 The parameters of models.

| Data set | |$\lambda _{\Theta }$| | |$\lambda _{l}$| | |$\lambda _{d}$| | |$\lambda _{t}$| | |$\lambda _{\beta }$| | |$r_{d}$| | |$r_{t}$| | |$iter$| |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Es | 1 | 1 | 1 | 1 | 1 | 300 | 300 | 5 |
| IC | 1 | 1 | 1 | 1 | 1 | 200 | 200 | 2 |
| GPCRs | 1 | 1 | 1 | 1 | 1 | 200 | 80 | 2 |
| NRs | 0.5 | 0.5 | 0.5 | 0.5 | 0.5 | 53 | 21 | 5 |
Figure 3
The performance (AUPR) with different numbers of iterations.
3.3 Performance analysis
To evaluate the performance of the MKL algorithm, we compare MK-TCMF with the mean-weighted variant, MK-TCMF (mean weighted), on the four data sets. The results are listed in Table 4 and Figure 4. The AUPRs of MK-TCMF on Es, IC, GPCRs and NRs are |$0.912$|, |$0.933$|, |$0.752$| and |$0.552$|, respectively, all better than those of MK-TCMF (mean weighted). The MKL algorithm regulates the weight of each kernel matrix to reach an optimal combination.
Table 4 The AUPR of different weighted methods under CVS1.

| Method | Es | IC | GPCRs | NRs |
| --- | --- | --- | --- | --- |
| MK-TCMF | 0.912 | 0.933 | 0.752 | 0.552 |
| MK-TCMF (mean weighted) | 0.899 | 0.923 | 0.670 | 0.474 |
Figure 4
The AUPR of different weighted methods under CVS1.
Different kernel matrices often contribute differently to the prediction model, so the MKL algorithm assigns different weights to the kernels. In general, the higher the weight, the greater the contribution to the model. In Figure 5, |$\mathbf{K}_{SE,d}$| (drug), |$\mathbf{K}_{GIP,d}$| (drug), |$\mathbf{K}_{SW,t}$| (target) and |$\mathbf{K}_{GIP,t}$| (target) have higher weights in the drug and target feature spaces. |$\mathbf{K}_{GIP,d}$| (drug) and |$\mathbf{K}_{GIP,t}$| (target) contain the known association information between drugs and targets, which provides prior information for the prediction process.
Figure 5
The kernel weights on the four data sets.
To further verify the effectiveness of our method, other multi-kernel learning methods are also used to fuse the kernel matrices: Heuristic Kernel Alignment-based MKL (HKA-MKL) [17], fast kernel learning-based MKL (FKL-MKL) [42] and centered kernel alignment-based MKL (CKA-MKL) [43]. The results of the different MKL models are listed in Table 5. MK-TCMF obtains the best performance on all four data sets: on Es, IC, GPCRs and NRs, the AUPRs are |$0.912$|, |$0.933$|, |$0.752$| and |$0.552$|, respectively. TCMF + CKA-MKL also performs well on Es (|$0.900$|), IC (|$0.927$|) and GPCRs (|$0.724$|).
3.4 Comparison with existing predictors
In this section, NetLapRLS [14], WNN-GIP [19], CMF [27], NRLMF [29], BLM-NII [22], GRMF [30], VB-MK-LMF [32] and KBMF2K [28] are compared with our model (MK-TCMF). CVS1, CVS2 and CVS3 are utilized to verify these predictive models, and the comparison results (AUPR) are listed in Tables 6, 7 and 8. The results of VB-MK-LMF are taken from [32]. The reported results of GRMF were incomplete (CVS1 was missing), so we ran the source code with the parameter settings given in the GRMF article to obtain the relevant results. In CVS1 (Table 6), our method (MK-TCMF) achieves the best AUPR on IC (|$0.933$|) and Es (|$0.912$|), and the second best AUPR on GPCRs (|$0.752$|). GRMF (|$0.923$|) and CMF (|$0.923$|) obtain the second best AUPR on IC. The best AUPRs on GPCRs (|$0.777$|) and NRs (|$0.773$|) are achieved by VB-MK-LMF. It can be seen that the MF-based methods have better predictive performance.
Table 5 AUPRs of different MKL methods (with TCMF) under CVS1.

| Method | Es | IC | GPCRs | NRs |
| --- | --- | --- | --- | --- |
| MK-TCMF | 0.912 | 0.933 | 0.752 | 0.552 |
| TCMF + FKL-MKL | 0.884 | 0.920 | 0.703 | 0.503 |
| TCMF + TKA-MKL | 0.899 | 0.918 | 0.716 | 0.540 |
| TCMF + CKA-MKL | 0.900 | 0.927 | 0.724 | 0.538 |
Table 6 Comparison results with existing models in CVS1.

| Model | Es | IC | GPCRs | NRs |
| --- | --- | --- | --- | --- |
| Our method | 0.912±0.018 | 0.933±0.017 | 0.752±0.040 | 0.552±0.099 |
| GRMF | 0.878±0.002 | 0.923±0.002 | 0.737±0.002 | 0.602±0.038 |
| VB-MK-LMF|$^{1}$| | 0.890±0.006 | 0.916±0.007 | 0.777±0.016 | 0.773±0.030 |
| NRLMF|$^{2}$| | 0.892±0.006 | 0.906±0.008 | 0.749±0.015 | 0.728±0.041 |
| CMF|$^{2}$| | 0.877±0.005 | 0.923±0.006 | 0.745±0.013 | 0.584±0.042 |
| KBMF2K|$^{2}$| | 0.654±0.008 | 0.771±0.009 | 0.578±0.018 | 0.534±0.050 |
| WNN-GIP|$^{2}$| | 0.706±0.017 | 0.717±0.020 | 0.520±0.021 | 0.589±0.034 |
| BLM-NII|$^{2}$| | 0.752±0.011 | 0.821±0.012 | 0.524±0.024 | 0.659±0.039 |
| NetLapRLS|$^{2}$| | 0.789±0.005 | 0.837±0.009 | 0.616±0.015 | 0.465±0.044 |
Table 7 Comparison results with existing models in CVS2.

| Model | Es | IC | GPCRs | NRs |
| --- | --- | --- | --- | --- |
| Our method | 0.407±0.050 | 0.426±0.086 | 0.412±0.071 | 0.386±0.098 |
| GRMF | 0.390±0.010 | 0.356±0.014 | 0.404±0.011 | 0.542±0.028 |
| VB-MK-LMF|$^{1}$| | 0.349±0.042 | 0.345±0.035 | 0.368±0.023 | 0.593±0.058 |
| NRLMF|$^{2}$| | 0.358±0.040 | 0.344±0.033 | 0.364±0.023 | 0.545±0.054 |
| CMF|$^{2}$| | 0.229±0.020 | 0.286±0.030 | 0.365±0.022 | 0.488±0.050 |
| KBMF2K|$^{2}$| | 0.263±0.033 | 0.308±0.038 | 0.366±0.024 | 0.477±0.049 |
| WNN-GIP|$^{2}$| | 0.278±0.037 | 0.258±0.032 | 0.295±0.025 | 0.504±0.056 |
| BLM-NII|$^{2}$| | 0.253±0.023 | 0.302±0.033 | 0.315±0.022 | 0.438±0.048 |
| NetLapRLS|$^{2}$| | 0.123±0.009 | 0.200±0.026 | 0.229±0.017 | 0.417±0.048 |
Table 8. Comparison results with existing models in CVS3.

Model | Es | IC | GPCRs | NRs
---|---|---|---|---
Our method | 0.831 ± 0.044 | 0.826 ± 0.079 | 0.583 ± 0.090 | 0.384 ± 0.097
GRMF | 0.807 ± 0.016 | 0.815 ± 0.010 | 0.615 ± 0.023 | 0.500 ± 0.028
VB-MK-LMF$^{1}$ | 0.794 ± 0.017 | 0.826 ± 0.021 | 0.596 ± 0.040 | 0.601 ± 0.081
NRLMF$^{2}$ | 0.812 ± 0.018 | 0.785 ± 0.028 | 0.556 ± 0.038 | 0.449 ± 0.079
CMF$^{2}$ | 0.698 ± 0.021 | 0.620 ± 0.027 | 0.433 ± 0.028 | 0.400 ± 0.077
KBMF2K$^{2}$ | 0.565 ± 0.023 | 0.677 ± 0.021 | 0.516 ± 0.045 | 0.324 ± 0.071
WNN-GIP$^{2}$ | 0.566 ± 0.038 | 0.696 ± 0.035 | 0.550 ± 0.047 | 0.531 ± 0.073
BLM-NII$^{2}$ | 0.735 ± 0.022 | 0.762 ± 0.020 | 0.341 ± 0.034 | 0.402 ± 0.083
NetLapRLS$^{2}$ | 0.669 ± 0.021 | 0.737 ± 0.020 | 0.334 ± 0.025 | 0.449 ± 0.074
In practice, interactions for novel drugs or novel targets do not yet exist in the database. CVS2 and CVS3 therefore test the performance of predicting interactions for drugs or targets that are absent from the training set. The AUPRs are listed in Tables 7 and 8. Under CVS2, our model (MK-TCMF) has the best AUPRs on GPCRs (0.412), IC (0.426) and Es (0.407), improvements of 0.008, 0.070 and 0.017 over the second-best results, respectively. Under CVS3, MK-TCMF achieves the best AUPRs on Es (0.831) and IC (0.826). However, its results on the NRs data set are not outstanding under CVS1 (0.552), CVS2 (0.386) or CVS3 (0.384). Across CVS1, CVS2 and CVS3, MK-TCMF remains comparable to the other methods. For the NRs data set, the main reason is the size of the data set, which is excessively small: NRs contains only 26 targets and 54 drugs. On such a small set, the model is prone to overfitting not only the training data but also the validation set, which ultimately reduces its stability. Moreover, outliers may appear in the features or in the response variables, and the cost of handling them is high. In future work, we will deal with these outliers to improve the generalization performance of the model on small data sets.
3.5 Comparison on DTINet data set
We also test MK-TCMF on the DTINet data set [33], which contains 1512 targets and 708 drugs. The comparison methods include DTINet [33], GRMF [30], VB-MK-LMF [32], NRLMF [29], CMF [27], GRGMF [31], KronRLS-MKL [17], BLM-NII [22], NetLapRLS [14], GCN-DTI [26] and DTI-CNN [24]. All models use the same inputs, including known drug–protein, protein–disease, protein–protein, drug–disease, drug–side-effect and drug–drug associations, and all are run under the same random seed. Except for DTINet, VB-MK-LMF, KronRLS-MKL and MK-TCMF, the remaining methods fuse heterogeneous information by average weighting. The AUPR and area under curve (AUC) values are shown in Table 9. Our model achieves the best AUPR (0.949) and, together with DTI-CNN, the best AUC (0.936). The AUPR (0.940) and AUC (0.932) of GRGMF are the second-best results.
Table 9. Comparison on DTINet data set (with same random seed).

Method | AUC | AUPR
---|---|---
Our method | 0.936 ± 0.020 | 0.949 ± 0.021
DTINet | 0.922 ± 0.019 | 0.931 ± 0.021
GRMF | 0.877 ± 0.025 | 0.913 ± 0.016
VB-MK-LMF | 0.921 ± 0.019 | 0.937 ± 0.017
NRLMF | 0.905 ± 0.023 | 0.918 ± 0.018
CMF | 0.895 ± 0.030 | 0.924 ± 0.018
GRGMF | 0.932 ± 0.019 | 0.940 ± 0.022
KronRLS-MKL | 0.919 ± 0.027 | 0.938 ± 0.023
BLM-NII | 0.894 ± 0.021 | 0.910 ± 0.014
NetLapRLS | 0.904 ± 0.019 | 0.913 ± 0.015
GCN-DTI | 0.929 ± 0.021 | 0.936 ± 0.024
DTI-CNN | 0.936 ± 0.022 | 0.939 ± 0.016
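The AUC and AUPR used throughout these comparisons can be computed directly from ranked prediction scores. A minimal self-contained sketch (using hypothetical labels and scores, not values from the paper; AUC in its pairwise-ranking form, AUPR as average precision):

```python
def auc_aupr(y_true, y_score):
    """Compute ROC-AUC (pairwise ranking form) and AUPR (average precision)."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    # AUC: fraction of positive/negative pairs ranked correctly (ties count 0.5)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    # AUPR as average precision: mean precision at each positive, ranked by score
    ranked = sorted(zip(y_score, y_true), reverse=True)
    tp, precisions = 0, []
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            precisions.append(tp / k)
    aupr = sum(precisions) / len(pos)
    return auc, aupr

# Hypothetical interaction labels and predicted scores
auc, aupr = auc_aupr([1, 0, 1, 0, 1, 0, 1, 0],
                     [0.9, 0.5, 0.8, 0.3, 0.4, 0.1, 0.7, 0.2])
```

AUPR is generally preferred over AUC for DTI benchmarks because the known-interaction matrix is extremely sparse, so precision on the positive class matters more than overall ranking.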
3.6 Predicting novel interactions
Table 10 shows the predictive results of our method on the GPCRs data set, including the estimated interaction scores, the databases in which each interaction is recorded (letter abbreviations), and the drug and target IDs. Among the top 30 results, 18 interactions are confirmed by database records, which suggests that our method is effective in predicting DTIs.
Table 10. Top 30 predicted DTIs with our model for GPCRs.

Drug ID | Target ID | Score | Confirmed* | Drug ID | Target ID | Score | Confirmed*
---|---|---|---|---|---|---|---
D00283 | hsa1814 | 0.78262985 | C D M | D00437 | hsa152 | 0.67321722 | —
D00255 | hsa152 | 0.66170877 | D | D02358 | hsa154 | 0.64519076 | D
D00095 | hsa155 | 0.60468714 | C D K | D00604 | hsa148 | 0.60353368 | D
D00604 | hsa147 | 0.59977694 | D | D02340 | hsa1812 | 0.59190334 | D
D01713 | hsa152 | 0.58508138 | — | D00136 | hsa1812 | 0.55156516 | D
D00397 | hsa1131 | 0.54579513 | C D K | D02342 | hsa155 | 0.53660241 | —
D00235 | hsa155 | 0.53582014 | M | D00635 | hsa155 | 0.53274981 | —
D04625 | hsa154 | 0.53197951 | D K | D00632 | hsa155 | 0.53106934 | —
D00598 | hsa155 | 0.52867707 | — | D02361 | hsa1814 | 0.51970226 | —
D01164 | hsa1812 | 0.50973009 | D | D03880 | hsa155 | 0.50639507 | —
D02147 | hsa153 | 0.50405330 | D M | D03490 | hsa155 | 0.50159544 | K
D00513 | hsa152 | 0.49695517 | — | D00110 | hsa1813 | 0.49590903 | —
D00270 | hsa152 | 0.49571719 | D K | D00136 | hsa152 | 0.49513088 | —
D00283 | hsa1131 | 0.49354532 | D | D00790 | hsa1814 | 0.48902350 | C D
D00503 | hsa3356 | 0.48451227 | — | D00283 | hsa1132 | 0.48273789 | D
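Producing a ranked list like the one in Table 10 amounts to sorting the unknown entries of the predicted score matrix. A small sketch with hypothetical scores (the matrices below are illustrative, not taken from the model output):

```python
import numpy as np

scores = np.array([[0.78, 0.12, 0.49],
                   [0.05, 0.67, 0.31],
                   [0.54, 0.22, 0.60]])      # hypothetical predicted DTI scores
known = np.array([[1, 0, 0],
                  [0, 0, 0],
                  [0, 1, 0]], dtype=bool)    # already-recorded interactions

# Keep only unknown drug-target pairs, then sort by descending score
candidates = [(float(scores[i, j]), i, j)
              for i in range(scores.shape[0])
              for j in range(scores.shape[1])
              if not known[i, j]]
top3 = sorted(candidates, reverse=True)[:3]
```

The highest-scoring unknown pairs are then checked against external databases (KEGG, DrugBank, ChEMBL, Matador in the paper's abbreviations) for confirmation.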
4 Discussion
In this work, we develop the MK-TCMF model to fuse multiple kernels (features) and predict potential DTIs. Different features often contribute differently to the model, and selecting and weighting them effectively is a challenge. The MKL algorithm addresses this by regulating the weight of each kernel matrix according to the prediction error, yielding better prediction performance in the fusion process.
After the model converges, each kernel obtains a weight coefficient. These coefficients (ranging from 0 to 1) reflect the strength of each feature's contribution: the higher the weight, the greater the contribution to the model. The experimental results (Figure 5) show that $\mathbf{K}_{SE,d}$, $\mathbf{K}_{GIP,d}$, $\mathbf{K}_{SW,t}$ and $\mathbf{K}_{GIP,t}$ have the highest weights in the drug and target feature spaces, respectively. $\mathbf{K}_{SE,d}$ (drug side effects) carries pharmacological information; network pharmacology currently holds that drug development should follow network-targeted, multi-component therapy [44]. In addition, $\mathbf{K}_{SW,t}$ (sequence similarity of targets) reflects structural similarity between targets, while $\mathbf{K}_{GIP,t}$ (target) and $\mathbf{K}_{GIP,d}$ (drug) contain the known association information of targets and drugs, which provides prior information for prediction. These features therefore receive higher weights.
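As an illustration of the fusion step, the weighted kernel combination can be sketched as a convex combination of kernel matrices (a simplified sketch with example weights, not the exact MK-TCMF weight-update rule):

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Fuse kernel matrices by a convex combination: clip weights to be
    non-negative, normalize them to sum to 1, and return the weighted sum."""
    w = np.clip(np.asarray(weights, dtype=float), 0.0, None)
    w = w / w.sum()
    fused = sum(wi * K for wi, K in zip(w, kernels))
    return fused, w

# Two toy 3x3 drug kernels (hypothetical values)
K1 = np.eye(3)               # e.g. a chemical-structure similarity kernel
K2 = np.full((3, 3), 0.5)    # e.g. a side-effect similarity kernel
K_fused, w = combine_kernels([K1, K2], [3.0, 1.0])
```

In MK-TCMF the raw weights are learned from the prediction error rather than fixed by hand, but the fused kernel enters the factorization in this linearly weighted form.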
Under the three kinds of CV, MK-TCMF is compared with existing models. Under CVS1, MK-TCMF reaches the best AUPRs on Es (0.912) and IC (0.933). Under CVS2, it has the best AUPRs on GPCRs (0.412), IC (0.426) and Es (0.407). Under CVS3, it achieves the best AUPRs on Es (0.831) and IC (0.826). These results show that MK-TCMF is competitive with existing methods.
5 Conclusions
MF models have been widely employed in recommendation systems and achieve good prediction performance on sparse adjacency matrices. With the generation of massive amounts of data, multi-source information fusion and deep learning have entered the field of medicine and been applied successfully; for example, models based on graph neural networks can extract topological information from drug molecules. In future work, we will introduce multi-view learning [45, 46], graph-based methods [47–50] and deep learning for feature representation to further improve the prediction performance of the model.
Key Points
We develop a matrix factorization (MF) model called multiple kernel-based triple collaborative matrix factorization (MK-TCMF).
We decompose the original adjacency matrix into three matrices, including the feature matrix of the drug space, the bi-projection matrix (used to join the two spaces) and the feature matrix of the target space.
In the process of solving the model, multiple drug kernel matrices and target kernel matrices are all linearly weighted and fused by the multiple kernel learning (MKL) algorithm.
To solve for the parameters of the MK-TCMF model, we propose an efficient iterative optimization algorithm.
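The triple factorization in the key points can be illustrated with a minimal alternating least-squares sketch. This shows only the Y ≈ A B Xᵀ structure on a toy adjacency matrix (each factor refit in turn by a pseudoinverse least-squares solve); it is not the regularized multi-kernel solver of MK-TCMF:

```python
import numpy as np

# Toy 6x5 drug-target adjacency matrix (hypothetical)
Y = np.array([[1, 0, 1, 0, 1],
              [0, 1, 0, 0, 1],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 1, 0],
              [1, 0, 0, 1, 1],
              [0, 1, 1, 0, 0]], dtype=float)

rng = np.random.default_rng(0)
k = 3                                    # latent dimension
A = rng.random((Y.shape[0], k))          # latent feature matrix of the drug space
X = rng.random((Y.shape[1], k))          # latent feature matrix of the target space

errs = []
for _ in range(100):
    # Bi-projection matrix joining the two spaces: least-squares fit of Y ~ A B X^T
    B = np.linalg.pinv(A) @ Y @ np.linalg.pinv(X.T)
    A = Y @ np.linalg.pinv(B @ X.T)      # refit drug factors
    X = (np.linalg.pinv(A @ B) @ Y).T    # refit target factors
    errs.append(np.linalg.norm(Y - A @ B @ X.T))
```

Because each step is an exact least-squares solve for one factor with the others fixed, the reconstruction error is non-increasing over iterations; predicted scores for unknown pairs are then read off the reconstructed matrix A B Xᵀ.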
Data availability
The codes and data are available from https://github.com/guofei-tju/IDTI-MK-TCMF or https://figshare.com/s/d1c4083564157150f9e7.
Funding
This work is supported by grants from the National Natural Science Foundation of China (NSFC 61902271, 62172296 and 62172076), the Special Science Foundation of Quzhou (2021D004) and the Natural Science Research of Jiangsu Higher Education Institutions of China (19KJB520014).
Yijie Ding is an associate researcher in the Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, P.R. China.
Jijun Tang is a Professor in Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA.
Fei Guo is a Professor in School of Computer Science and Engineering, Central South University, Changsha, 410083, P.R. China.
Quan Zou is a Professor in Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, P.R. China.
References
1. Hecker N, Ahmed J, von Eichborn J, et al. SuperTarget goes quantitative: update on drug–target interactions. Nucleic Acids Res 2012;40(D1):1113–7.
2. Schomburg I, Chang A, Placzek S, et al. BRENDA in 2013: integrated reactions, kinetic data, enzyme function data, improved disease classification: new options and contents in BRENDA. Nucleic Acids Res 2013;41(D1):764–72.
3. Kanehisa M, Goto S. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res 2006;34(Suppl 1):354–7.
4. Law V, Knox C, Djoumbou Y, et al. DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res 2014;42(D1):1091–7.
5. Chen X, Yan CC, Zhang X, et al. Drug–target interaction prediction: databases, web servers and computational models. Brief Bioinform 2016;17(4):696–712.
6. Chen X, Liu MX, Yan GY. Drug–target interaction prediction by random walk on the heterogeneous network. Mol Biosyst 2012;8(7):1970–8.
7. Wei L, Xing P, Zeng J, et al. Improved prediction of protein–protein interactions using novel negative samples, features, and an ensemble classifier. Artif Intell Med 2017;83(11):67–74.
8. Wei L, Wan S, Guo J, et al. A novel hierarchical selective ensemble classifier with bioinformatics application. Artif Intell Med 2017;83(11):82–90.
9. Mousavian Z, Khakabimamaghani S, Kavousi K, et al. Drug–target interaction prediction from PSSM based evolutionary information. J Pharmacol Toxicol Methods 2015;78:42–51.
10. Cao DS, Liu S, Xu QS, et al. Large-scale prediction of drug–target interactions using protein sequences and drug topological structures. Anal Chim Acta 2012;752(21):1–10.
11. Li Z, Han P, You ZH, et al. In silico prediction of drug–target interaction networks based on drug chemical structure and protein sequences. Sci Rep 2017;7(1):11174.
12. Lin J, Chen H, Li S, et al. Accurate prediction of potential druggable proteins based on genetic algorithm and Bagging-SVM ensemble classifier. Artif Intell Med 2019;98(1):35–47.
13. Shi H, Liu S, Chen J, et al. Predicting drug–target interactions using Lasso with random forest based on evolutionary information and chemical structure. Genomics 2019;111(6):1839–52.
14. Xia Z, Wu LY, Zhou X, et al. Semi-supervised drug–protein interaction prediction from heterogeneous biological spaces. BMC Syst Biol 2010;4(Suppl 2):6–17.
15. van Laarhoven T, Nabuurs SB, Marchiori E. Gaussian interaction profile kernels for predicting drug–target interaction. Bioinformatics 2011;27(21):3036–43.
16. Hao M, Wang Y, Bryant SH. Improved prediction of drug–target interactions using regularized least squares integrating with kernel fusion technique. Anal Chim Acta 2016;909:41–50.
17. Nascimento A, Prudêncio RBC, Costa IG. A multiple kernel learning algorithm for drug–target interaction prediction. BMC Bioinformatics 2016;17(1):46–61.
18. Cichonska A, Ravikumar B, Parri E, et al. Computational-experimental approach to drug–target interaction mapping: a case study on kinase inhibitors. PLoS Comput Biol 2017;13(8):e1005678.
19. van Laarhoven T, Marchiori E. Predicting drug–target interactions for new drug compounds using a weighted nearest neighbor profile. PLoS One 2013;8(6):e66952.
20. Bleakley K, Yamanishi Y. Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics 2009;25(18):2397–403.
21. Ding Y, Tang J, Guo F. Identification of drug–target interactions via fuzzy bipartite local model. Neural Computing and Applications 2020;32:10303–19.
22. Mei J, Kwoh CK, Yang P, et al. Drug–target interaction prediction by learning from local information and neighbors. Bioinformatics 2013;29(2):238–45.
23. Lee I, Keum J, Nam H. DeepConv-DTI: prediction of drug–target interactions via deep learning with convolution on protein sequences. PLoS Comput Biol 2019;15:e1007129.
24. Peng J, Li J, Shang X. A learning-based method for drug–target interaction prediction based on feature representation learning and deep neural network. BMC Bioinformatics 2020;21:394.
25. Eslami Manoochehri H, Nourani M. Drug–target interaction prediction using semi-bipartite graph model and deep learning. BMC Bioinformatics 2020;21:248.
26. Zhao T, Hu Y, Valsdottir LR, et al. Identifying drug–target interactions based on graph convolutional network and deep neural network. Brief Bioinform 2020;22:2141.
27. Zheng X, Ding H, Mamitsuka H, et al. Collaborative matrix factorization with multiple similarities for predicting drug–target interactions. In: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, 2013, pp. 1025–33.
28. Gonen M. Predicting drug–target interactions from chemical and genomic kernels using Bayesian matrix factorization. Bioinformatics 2012;28(18):2304–10.
29. Liu Y, Wu M, Miao C, et al. Neighborhood regularized logistic matrix factorization for drug–target interaction prediction. PLoS Comput Biol 2016;12(2):e1004760.
30. Ezzat A, Zhao P, Wu M, et al. Drug–target interaction prediction with graph regularized matrix factorization. IEEE/ACM Trans Comput Biol Bioinform 2016;14(3):646–56.
31. Zhang Z, Zhang XF, Wu M, et al. A graph regularized generalized matrix factorization model for predicting links in biomedical bipartite networks. Bioinformatics 2020;36(11):3474–81.
32. Bolgár B, Antal P. VB-MK-LMF: fusion of drugs, targets and interactions using variational Bayesian multiple kernel logistic matrix factorization. BMC Bioinformatics 2017;18(1):440–57.
33. Luo Y, Zhao X, Zhou J, et al. A network integration approach for drug–target interaction prediction and computational drug repositioning from heterogeneous information. Nat Commun 2017;8:573.
34. Shi JY, Huang H, Li JX, et al. TMFUF: a triple matrix factorization-based unified framework for predicting comprehensive drug–drug interactions of new drugs. BMC Bioinformatics 2018;19:411.
35. Yamanishi Y, Araki M, Gutteridge A, et al. Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 2008;24(13):i232–40.
36. Perlman L, Gottlieb A, Atias N, et al. Combining drug and gene similarity measures for drug–target elucidation. J Comput Biol 2011;18(2):133.
37. Smedley D. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res 2015;43(1):589–98.
38. Ovaska K, Laakso M, Hautaniemi S. Fast gene ontology based clustering for microarray experiments. BioData Mining 2008;1(1):11.
39. Takarabe M, Kotera M, Nishimura Y, et al. Drug target prediction using adverse event report systems. Bioinformatics 2012;28(18):i611–8.
40. Hattori M, Okuno Y, Goto S, et al. Development of a chemical structure comparison method for integrated analysis of chemical and genomic information in the metabolic pathways. J Am Chem Soc 2003;125(39):11853–65.
41. Ding Y, Tang J, Guo F. Identification of drug–target interactions via multiple information integration. Inform Sci 2017;418:546–60.
42. He J, Chang S, Xie L. Fast kernel learning for spatial pyramid matching. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, 2008, pp. 1–7.
43. Cristianini N, Elisseeff A. On kernel-target alignment. Advances in Neural Information Processing Systems 2001;179(5):367–73.
44. Casas AI, Hassan AA, Larsen SJ, et al. From single drug targets to synergistic network pharmacology in ischemic stroke. Proc Natl Acad Sci 2019;116(14):7129–36.
45. Jiang Y, Deng Z, Chung FL, et al. Recognition of epileptic EEG signals using a novel multiview TSK fuzzy system. IEEE Trans Fuzzy Syst 2017;25(1):3–20.
46. Jiang Y, Zhang Y, Lin C, et al. EEG-based driver drowsiness estimation using an online multi-view and transfer TSK fuzzy system. IEEE Transactions on Intelligent Transportation Systems 2021;22(3):1752–64.
47. Zhou H, Wang H. Identify ncRNA subcellular localization via graph regularized k-local hyperplane distance nearest neighbor model on multi-kernel learning. IEEE/ACM Trans Comput Biol Bioinform 2021.
48. Ding Y, Yang C, Tang J, et al. Identification of protein–nucleotide binding residues via graph regularized k-local hyperplane distance nearest neighbor model. Applied Intelligence 2021.
49. Qian Y, Meng H, Lu W, et al. Identification of DNA-binding proteins via hypergraph based Laplacian support vector machine. Current Bioinformatics 2021;16.
50. Yang H, Ding Y, Tang J, et al. Drug–disease associations prediction via multiple kernel-based dual graph regularized least squares. Appl Soft Comput 2021;112:107811.
© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com