Identification of drug–target interactions via multiple kernel-based triple collaborative matrix factorization

Based on previous MF [27–34], we develop an MF model called multiple kernel-based triple collaborative matrix factorization (MK-TCMF). Different from other MF, we decompose the original adjacency matrix into three matrices, including the feature matrix of the drug space, the bi-projection matrix (used to join the two spaces) and the feature matrix of the target space. In the process of solving the model, multiple drug kernel matrices and target kernel matrices are all linearly weighted and fused by the MKL algorithm.

The contributions of our work are as follows: (1) Different from the previous MF [27, 30], we decompose the original adjacency matrix into the bi-projection matrix, feature matrix of the drug space and target space; (2) The weights of multiple drug and target kernel matrices are synergistically optimized with MF algorithm; (3) To solve the parameters of the MK-TCMF model, an efficient iterative optimization algorithm is proposed; (4) Our method achieves better results on most data sets.

Our study is organized as follows: In the Materials and methods section, we propose an MK-TCMF model to predict DTIs. In the Results section, we test MK-TCMF model on benchmark data sets. In the Discussion section, we discuss the performance of the method and the experimental results. In the Conclusion section, the future work is given.

2 Materials and methods

2.1 Problem description

The network of DTIs can be considered as a bipartite network (in Figure 1). Our goal is to use the known DTIs (links) to estimate new associations between drugs and targets. The DTIs network has |$n$| drugs and |$m$| targets in two sets (⁠|$D = \left \{d_{1},d_{2},...,d_{n}\right \}$| and |$T = \left \{t_{1},t_{2},...,t_{m} \right \}$|⁠), respectively. In our study, the similarity of these drugs (or between targets) can be described as a kernel matrix. Actually, these kernel matrices reflect the topology of drug–drug (⁠|$n \times n$|⁠) and target–target (⁠|$m \times m$|⁠). And the values of the kernel matrix elements are between |$0$| and |$1$|⁠. There are some known links (associations) between the drug and the target set. They can be represented as an adjacency matrix |$\mathbf{Y}_{train} \in \mathbf{R}^{n \times m}$|⁠. If |$Y_{train}(i,j)=1$|⁠, drug |$d_{i}$| and target |$t_{j}$| have directly interaction, otherwise |$Y_{train}(i,j)=0$|⁠. The goal of our method is to calculate a new matrix |$\mathbf{Y}^{*}\in \mathbf{R}^{n \times m}$| and make it approximately equal to |$\mathbf{Y}_{train}$|⁠. If some new values (non-zero) appear in |$\mathbf{Y}^{*}$| (e.g. |$Y_{i,j}^{*}>0$|⁠), drug |$d_{i}$| and target |$t_{j}$| may have link (interaction) in DTIs network. In Figure 1, the black solid lines represent the links between drug and target nodes.

2.2 Related work

CMF [27] was an MF-based method with minimizing the objective function:

$$\begin{align}& \underset{\mathbf{A},\mathbf{B}}{min} \ \Vert \mathbf{W} \bigodot (\mathbf{Y}_{train} - \mathbf{A}\mathbf{B}^{T})\Vert_{F}^{2} \nonumber \\ & + \lambda_{l}(\Vert \mathbf{A}\Vert_{F}^{2} + \Vert \mathbf{B}\Vert_{F}^{2}) \nonumber\\ & + \lambda_{d} \Vert \mathbf{K}_{d} - \mathbf{A}\mathbf{A}^{T} \Vert_{F}^{2} \nonumber \\ & + \lambda_{t} \Vert \mathbf{K}_{t} - \mathbf{B}\mathbf{B}^{T} \Vert_{F}^{2} \end{align}$$

(1)

where |$\mathbf{W} \in \mathbf{R}^{n \times m}$| was a weight matrix. |$\mathbf{A} \in \mathbf{R}^{n \times k}$| and |$\mathbf{B} \in \mathbf{R}^{m \times k}$| were two latent feature matrices. The predicted matrix for DTIs network was obtained by multiplying |$\mathbf{A}$| and |$\mathbf{B}$|⁠.

NRLMF [29] estimated the probability of DTIs by logistic matrix factorization. The objective function of NRLMF was formulated as:

$$\begin{align}& \underset{\mathbf{U},\mathbf{V}}{min} \ \sum_{i=1}^{n}\sum_{j=1}^{m} ((1+cY_{train,ij}-Y_{train,ij}) ln(1+exp(\mathbf{u}_{i}\mathbf{v}_{j}^{T})) \nonumber\\ & -cY_{train,ij}\mathbf{u}_{i}\mathbf{v}_{j}^{T}) \nonumber\\ & + \frac{1}{2}tr (\mathbf{U}^{T}(\lambda_{d}\mathbf{I}_{d} + \alpha\mathbf{L}_{d})\mathbf{U}) \nonumber \\ & + \frac{1}{2}(\mathbf{V}^{T}(\lambda_{t}\mathbf{I}_{t} + \beta\mathbf{L}_{t})\mathbf{V}) \end{align}$$

(2)

where |$k$| denoted the dimension of low-rank matrix. |$\mathbf{I}_{d} \in \mathbf{R}^{n \times n}$|⁠, |$\mathbf{I}_{t} \in \mathbf{R}^{m \times m}$|⁠, |$\mathbf{U} \in \mathbf{R}^{n \times k}$| and |$\mathbf{V} \in \mathbf{R}^{m \times k}$| denoted identity matrices and two low-rank matrices. |$\beta $|⁠, |$\alpha $| and |$c$| denoted constant parameters. |$\mathbf{P}^{*}\in \mathbf{R}^{n \times m}$| of prediction of probability was calculated by

$$\begin{align}& \mathbf{P}^{*} = \frac{exp(\mathbf{U}\mathbf{V}^{T})}{1 + exp(\mathbf{U}\mathbf{V}^{T})}. \end{align}$$

(3)

The GRMF [30] also decomposed |$\mathbf{Y}_{train}$| into |$\mathbf{A} \in \mathbf{R}^{n \times k}$| and |$\mathbf{B} \in \mathbf{R}^{m \times k}$| by

$$\begin{align}& \underset{\mathbf{A},\mathbf{B}}{min} \ \Vert \mathbf{Y}_{train} - \mathbf{A}\mathbf{B}^{T}\Vert_{F}^{2} \nonumber \\ & + \lambda_{l}(\Vert \mathbf{A}\Vert_{F}^{2} + \Vert \mathbf{B}\Vert_{F}^{2}) \nonumber\\ & + \lambda_{d}tr ((\mathbf{A})^{T}\mathbf{L}_{d}\mathbf{A}) \nonumber \\ & + \lambda_{t}tr ((\mathbf{B})^{T}\mathbf{L}_{t}\mathbf{B}) \end{align}$$

(4)

where |$\lambda _{l}$|⁠, |$\lambda _{d}$| and |$\lambda _{t}$| were positive parameters of regular terms. The prediction |$\mathbf{F}^{*}$| was calculated by

$$\begin{align}& \mathbf{F}^{*} = \mathbf{A}\mathbf{B}^{T}. \end{align}$$

(5)

2.3 Benchmark data set

Four real DTIs networks are used to test MK-TCMF. They are Enzymes (Es), G Protein-Coupled Receptors (GPCRs), Ion Channel (IC) and Nuclear Receptors (NRs) [20] from DrugBank [4], BRENDA [2], SuperTarget [1] and KEGG BRITE [3] databases. The information of DTIs networks are list in Table 1.

Table 1

The information of four DTIs networks.

Data set	Interaction	Target	Drug	Sparsity
NRs	\|$90$\|	\|$26$\|	\|$54$\|	93.59%
GPCRs	\|$635$\|	\|$95$\|	\|$223$\|	97.00%
IC	\|$1476$\|	\|$204$\|	\|$210$\|	96.55%
Es	\|$2926$\|	\|$664$\|	\|$445$\|	99.01%

Data set	Interaction	Target	Drug	Sparsity
NRs	\|$90$\|	\|$26$\|	\|$54$\|	93.59%
GPCRs	\|$635$\|	\|$95$\|	\|$223$\|	97.00%
IC	\|$1476$\|	\|$204$\|	\|$210$\|	96.55%
Es	\|$2926$\|	\|$664$\|	\|$445$\|	99.01%

Table 1

The information of four DTIs networks.

Data set	Interaction	Target	Drug	Sparsity
NRs	\|$90$\|	\|$26$\|	\|$54$\|	93.59%
GPCRs	\|$635$\|	\|$95$\|	\|$223$\|	97.00%
IC	\|$1476$\|	\|$204$\|	\|$210$\|	96.55%
Es	\|$2926$\|	\|$664$\|	\|$445$\|	99.01%

Data set	Interaction	Target	Drug	Sparsity
NRs	\|$90$\|	\|$26$\|	\|$54$\|	93.59%
GPCRs	\|$635$\|	\|$95$\|	\|$223$\|	97.00%
IC	\|$1476$\|	\|$204$\|	\|$210$\|	96.55%
Es	\|$2926$\|	\|$664$\|	\|$445$\|	99.01%

2.4 Drug and target kernels

Several kinds of DTIs information are introduced into our model. In target space, the protein sequence [20, 35], Protein–Protein Interactions (PPIs) network [36], functional annotation (with Gene Ontology) [37, 38] and known target interaction profile [15, 19] are utilized to build target kernel matrices. In drug space, side-effects [17, 39], chemical structure (with SIMCOMP) [40], drug substructure (with fingerprint) [41] and known profile of drug interaction [15, 19] are employed to build drug kernel matrices. The details of kernels are list in Table 2.

Table 2

The kernels in two feature spaces.

Feature Space	Kernel	Description
Drug	\|$\mathbf{K}_{GIP,d}$\|	Gaussian interaction profile for drug[15, 19]
	\|$\mathbf{K}_{SE,d}$\|	network of drug-side effect associations [17, 39]
	\|$\mathbf{K}_{MACCS,d}$\|	drug substructure fingerprint[41]
	\|$\mathbf{K}_{SIMCOMP,d}$\|	chemical structure [35, 40]
Target	\|$\mathbf{K}_{GIP,t}$\|	Gaussian interaction profile for target[15, 19]
	\|$\mathbf{K}_{PPI,t}$\|	PPIs network of target[17, 36]
	\|$\mathbf{K}_{GO,t}$\|	functional information of target[17, 37]
	\|$\mathbf{K}_{SW,t}$\|	sequence information of target[35]

Feature Space	Kernel	Description
Drug	\|$\mathbf{K}_{GIP,d}$\|	Gaussian interaction profile for drug[15, 19]
	\|$\mathbf{K}_{SE,d}$\|	network of drug-side effect associations [17, 39]
	\|$\mathbf{K}_{MACCS,d}$\|	drug substructure fingerprint[41]
	\|$\mathbf{K}_{SIMCOMP,d}$\|	chemical structure [35, 40]
Target	\|$\mathbf{K}_{GIP,t}$\|	Gaussian interaction profile for target[15, 19]
	\|$\mathbf{K}_{PPI,t}$\|	PPIs network of target[17, 36]
	\|$\mathbf{K}_{GO,t}$\|	functional information of target[17, 37]
	\|$\mathbf{K}_{SW,t}$\|	sequence information of target[35]

Table 2

The kernels in two feature spaces.

Feature Space	Kernel	Description
Drug	\|$\mathbf{K}_{GIP,d}$\|	Gaussian interaction profile for drug[15, 19]
	\|$\mathbf{K}_{SE,d}$\|	network of drug-side effect associations [17, 39]
	\|$\mathbf{K}_{MACCS,d}$\|	drug substructure fingerprint[41]
	\|$\mathbf{K}_{SIMCOMP,d}$\|	chemical structure [35, 40]
Target	\|$\mathbf{K}_{GIP,t}$\|	Gaussian interaction profile for target[15, 19]
	\|$\mathbf{K}_{PPI,t}$\|	PPIs network of target[17, 36]
	\|$\mathbf{K}_{GO,t}$\|	functional information of target[17, 37]
	\|$\mathbf{K}_{SW,t}$\|	sequence information of target[35]

Feature Space	Kernel	Description
Drug	\|$\mathbf{K}_{GIP,d}$\|	Gaussian interaction profile for drug[15, 19]
	\|$\mathbf{K}_{SE,d}$\|	network of drug-side effect associations [17, 39]
	\|$\mathbf{K}_{MACCS,d}$\|	drug substructure fingerprint[41]
	\|$\mathbf{K}_{SIMCOMP,d}$\|	chemical structure [35, 40]
Target	\|$\mathbf{K}_{GIP,t}$\|	Gaussian interaction profile for target[15, 19]
	\|$\mathbf{K}_{PPI,t}$\|	PPIs network of target[17, 36]
	\|$\mathbf{K}_{GO,t}$\|	functional information of target[17, 37]
	\|$\mathbf{K}_{SW,t}$\|	sequence information of target[35]

2.5 The Proposed Model of Multiple Kernel-based Triple Collaborative Matrix Factorization

The multiple drug (⁠|$k_{d}$|⁠) and target (⁠|$k_{t}$|⁠) kernels can be expressed as |$\left \{\mathbf{K}_{d}^{1}, \mathbf{K}_{d}^{2},...,\mathbf{K}_{d}^{k_{d}}\right \}$| and |$\left \{\mathbf{K}_{t}^{1}, \mathbf{K}_{t}^{2},...,\mathbf{K}_{t}^{k_{t}}\right \}$|⁠.

$$\begin{align} \mathbf{K}_{d}^{*} = \sum_{i=1}^{k_{d}} \beta_{d}^{i} \mathbf{K}_{d}^{i}, \quad \mathbf{K}_{d}^{i} \in \mathbf{R}^{n \times n} \end{align}$$

(6a)

$$\begin{align} \mathbf{K}_{t}^{*} = \sum_{j=1}^{k_{t}} \beta_{t}^{j} \mathbf{K}_{t}^{j}, \quad \mathbf{K}_{t}^{j} \in \mathbf{R}^{m \times m} \end{align}$$

(6b)

where |$\boldsymbol{\beta }_{d}=\left \{\beta _{d}^{1},\beta _{d}^{2},...,\beta _{d}^{k_{d}} \right \}$| and |$\boldsymbol{\beta }_{t}=\left \{\beta _{t}^{1},\beta _{t}^{2},...,\beta _{d}^{k_{t}} \right \}$| are the weights of kernels for the drug and target, respectively. MKL algorithm computes the weights for kernels. So, the |$\mathbf{K}_{t}^{*}$| and |$\mathbf{K}_{d}^{*}$| are optimal integrated kernels.

Figure 2

Schematic of MK-TCMF.

Inspired by MF [27, 29, 30], the matrix can be approximated by low rank representation of two drug (or target) feature as follows:

$$\begin{align} \mathbf{K}_{d}^{*} \approx \mathbf{A}\mathbf{A}^{T}, \ \mathbf{A} \in \mathbf{R}^{n \times r_{d}} \end{align}$$

(7a)

$$\begin{align} \mathbf{K}_{t}^{*} \approx \mathbf{B}\mathbf{B}^{T}, \ \mathbf{B} \in \mathbf{R}^{m \times r_{t}} \end{align}$$

(7b)

where |$ \mathbf{A}$| and |$\mathbf{B}$| denote the matrices of low-rank approximation. |$r_{d}$| and |$r_{t}$| are the dimensions of the latent feature space in drug and target space. The objective formula is defined as following:

$$\begin{align}& min \ J(\boldsymbol{\Theta},\mathbf{A},\mathbf{B},\boldsymbol{\beta}_{d},\boldsymbol{\beta}_{t})= \Vert \mathbf{Y}_{train} - \mathbf{A}\boldsymbol{\Theta}\mathbf{B}^{T}\Vert_{F}^{2} \nonumber\\ & + \lambda_{\Theta} \Vert \boldsymbol{\Theta} \Vert_{F}^{2} \nonumber\\ &+ \lambda_{l} (\Vert \mathbf{A} \Vert_{F}^{2} + \Vert \mathbf{B} \Vert_{F}^{2}) \nonumber\\ & + \lambda_{d} \Vert \sum_{i=1}^{k_{d}} \beta_{d}^{i} \mathbf{K}_{d}^{i} - \mathbf{A}\mathbf{A}^{T}\Vert_{F}^{2} \nonumber\\ &+ \lambda_{t} \Vert \sum_{j=1}^{k_{t}} \beta_{t}^{j} \mathbf{K}_{t}^{j} - \mathbf{B}\mathbf{B}^{T}\Vert_{F}^{2} \nonumber\\ &+ \lambda_{\beta} (\Vert \boldsymbol{\beta}_{d} \Vert_{F}^{2} + \Vert \boldsymbol{\beta}_{t} \Vert_{F}^{2}) \nonumber\\ s.t. &\sum_{i=1}^{k_{d}} \beta_{d}^{i} = 1, \ \ \sum_{j=1}^{k_{t}} \beta_{t}^{j} = 1 \nonumber\\ & \mathbf{K}_{d}^{*} = \sum_{i=1}^{k_{d}} \beta_{d}^{i} \mathbf{K}_{d}^{i}, \ \ \mathbf{K}_{t}^{*} = \sum_{j=1}^{k_{t}} \beta_{t}^{j} \mathbf{K}_{t}^{j} \end{align}$$

(8)

where |$\boldsymbol{\Theta } \in \mathbf{R}^{r_{d} \times r_{t}}$| is the bi-projection matrix. |$\lambda _{\Theta }$|⁠, |$\lambda _{l}$|⁠, |$\lambda _{d}$|⁠, |$\lambda _{t}$|⁠, |$\lambda _{\beta }$| are the regularization coefficients of the next five regular items. The range of these coefficients is 0–1 with 0.1 step. In addition to using the |$\mathbf{A}$|⁠, |$\mathbf{B}$| and |$\boldsymbol{\Theta }$| matrices to approximate |$\mathbf{Y}_{train}$|⁠, we also use |$\mathbf{A}$| and |$\mathbf{B}$| to approximate the kernel matrices of |$\mathbf{K}_{d}^{*}$| and |$\mathbf{K}_{t}^{*}$|⁠, respectively.

2.6 Optimization

To optimize the Eq. 8, we use the alternating least squares algorithm (ALSA). First, we fix the variables of |$\mathbf{A}$|⁠, |$\mathbf{B}$|⁠, |$\boldsymbol{\beta }_{d}$|⁠, |$\boldsymbol{\beta }_{t}$| to solve the variable |$\boldsymbol{\Theta }$|⁠. Let |$\partial J/\partial \boldsymbol{\Theta } = 0$|⁠, we can obtain functions as follows:

$$\begin{align} \partial (\Vert \mathbf{Y}_{train} - \mathbf{A}\boldsymbol{\Theta}\mathbf{B}^{T}\Vert_{F}^{2} + \lambda_{\Theta} \Vert \boldsymbol{\Theta} \Vert_{F}^{2})/\partial \boldsymbol{\Theta} = 0 \end{align}$$

(9a)

$$\begin{align} -2\mathbf{A}^{T}(\mathbf{Y}_{train} - \mathbf{A}\boldsymbol{\Theta}\mathbf{B}^{T})\mathbf{B} + 2\lambda_{\Theta} \boldsymbol{\Theta}=0 \end{align}$$

(9b)

$$\begin{align} \mathbf{A}^{T}\mathbf{A}\boldsymbol{\Theta}\mathbf{B}^{T}\mathbf{B}+\lambda_{\Theta} \boldsymbol{\Theta}=\mathbf{A}^{T}\mathbf{Y}_{train}\mathbf{B} \end{align}$$

(9c)

$$\begin{align} \mathbf{A}^{T}\mathbf{A}\boldsymbol{\Theta}+\lambda_{\Theta} \boldsymbol{\Theta}(\mathbf{B}^{T}\mathbf{B})^{-1}=\mathbf{A}^{T}\mathbf{Y}_{train}\mathbf{B}(\mathbf{B}^{T}\mathbf{B})^{-1} \end{align}$$

(9d)

$$\begin{align} \mathbf{A}^{T}\mathbf{A}\boldsymbol{\Theta}+\lambda_{\Theta} \boldsymbol{\Theta}(\mathbf{B}^{T}\mathbf{B})^{-1}=\mathbf{A}^{T}\mathbf{Y}_{train}(\mathbf{B}^{T})^{-1} \end{align}$$

(9e)

where Eq. 9e is a Sylvester equation. Then, |$\boldsymbol{\Theta }$|⁠, |$\mathbf{B}$|⁠, |$\boldsymbol{\beta }_{d}$|⁠, |$\boldsymbol{\beta }_{t}$| are fixed, let |$\partial J/\partial \mathbf{A} = 0$|⁠:

$$\begin{align} \partial J(\boldsymbol{\Theta},\mathbf{A},\mathbf{B},\boldsymbol{\beta}_{d},\boldsymbol{\beta}_{t})/\partial \mathbf{A} = 0 \end{align}$$

(10a)

$$\begin{align} -2(\mathbf{Y}_{train} - \mathbf{A}\boldsymbol{\Theta}\mathbf{B}^{T})\mathbf{B}\boldsymbol{\Theta}^{T} + 2\lambda_{l} \mathbf{A} + 2 \lambda_{d} (\mathbf{K}_{d}^{*}-\mathbf{A}\mathbf{A}^{T})(-\mathbf{A})=0 \end{align}$$

(10b)

$$\begin{align} -\mathbf{Y}_{train}\mathbf{B}\boldsymbol{\Theta}^{T} + \mathbf{A}\boldsymbol{\Theta}\mathbf{B}^{T}\mathbf{B}\boldsymbol{\Theta}^{T} + \lambda_{l} \mathbf{A} + \lambda_{d} \mathbf{A}\mathbf{A}^{T}\mathbf{A} - \lambda_{d} \mathbf{K}_{d}^{*} \mathbf{A} = 0 \end{align}$$

(10c)

$$\begin{align} \mathbf{A}(\boldsymbol{\Theta}\mathbf{B}^{T}\mathbf{B}\boldsymbol{\Theta}^{T} + \lambda_{l} \mathbf{I}_{r_{d}} + \lambda_{d} \mathbf{A}^{T}\mathbf{A}) = \mathbf{Y}_{train}\mathbf{B}\boldsymbol{\Theta}^{T} + \lambda_{d} \mathbf{K}_{d}^{*} \mathbf{A} \end{align}$$

(10d)

$$\begin{align} \mathbf{A} = (\mathbf{Y}_{train}\mathbf{B}\boldsymbol{\Theta}^{T} + \lambda_{d} \mathbf{K}_{d}^{*} \mathbf{A})(\boldsymbol{\Theta}\mathbf{B}^{T}\mathbf{B}\boldsymbol{\Theta}^{T} + \lambda_{l} \mathbf{I}_{r_{d}} + \lambda_{d} \mathbf{A}^{T}\mathbf{A})^{-1} \end{align}$$

(10e)

Next, setting |$\partial J/\partial \mathbf{B} = 0$|⁠:

$$\begin{align} \partial J(\boldsymbol{\Theta},\mathbf{A},\mathbf{B},\boldsymbol{\beta}_{d},\boldsymbol{\beta}_{t})/\partial \mathbf{B} = 0 \end{align}$$

(11a)

$$\begin{align} -2(\mathbf{Y}_{train} - \mathbf{A}\boldsymbol{\Theta}\mathbf{B}^{T})^{T}\mathbf{A}\boldsymbol{\Theta} + 2\lambda_{l} \mathbf{B} + 2 \lambda_{t} (\mathbf{K}_{t}^{*}-\mathbf{B}\mathbf{B}^{T})(-\mathbf{B})=0 \end{align}$$

(11b)

$$\begin{align} -\mathbf{Y}_{train}^{T}\mathbf{A}\boldsymbol{\Theta} + \mathbf{B}\boldsymbol{\Theta}^{T}\mathbf{A}^{T}\mathbf{A}\boldsymbol{\Theta} + \lambda_{l} \mathbf{B} + \lambda_{t} \mathbf{B}\mathbf{B}^{T}\mathbf{B} - \lambda_{t} \mathbf{K}_{t}^{*} \mathbf{B} = 0 \end{align}$$

(11c)

$$\begin{align} \mathbf{B}(\boldsymbol{\Theta}^{T}\mathbf{A}^{T}\mathbf{A}\boldsymbol{\Theta} + \lambda_{l} \mathbf{I}_{r_{t}} + \lambda_{t} \mathbf{B}^{T}\mathbf{B}) = \mathbf{Y}_{train}^{T}\mathbf{A}\boldsymbol{\Theta} + \lambda_{t} \mathbf{K}_{t}^{*} \mathbf{B} \end{align}$$

(11d)

$$\begin{align} \mathbf{B} = (\mathbf{Y}_{train}^{T}\mathbf{A}\boldsymbol{\Theta} + \lambda_{t} \mathbf{K}_{t}^{*} \mathbf{B})(\boldsymbol{\Theta}^{T}\mathbf{A}^{T}\mathbf{A}\boldsymbol{\Theta} + \lambda_{l} \mathbf{I}_{r_{t}} + \lambda_{t} \mathbf{B}^{T}\mathbf{B})^{-1} \end{align}$$

(11e)

where |$\mathbf{I}_{r_{d}} \in \mathbf{R}^{r_{d} \times r_{d}}$| and |$\mathbf{I}_{r_{t}} \in \mathbf{R}^{r_{t} \times r_{t}}$| are identity matrices.

The weights of |$\boldsymbol{\beta }_{d}$| and |$\boldsymbol{\beta }_{t})$| can be calculated by

$$\begin{align}& \boldsymbol{\beta}_{d} = (\mathbf{G}_{d}+\lambda_{\beta}\mathbf{I}_{d} )^{-1} \nonumber\\ & \ \ (\mathbf{z}_{d} - \frac{(\mathbf{l}_{d}^{T}(\mathbf{G}_{d}+\lambda_{\beta}\mathbf{I}_{d} )^{-1}\mathbf{z}_{d}-1)}{\mathbf{l}_{d}^{T}(\mathbf{G}_{d}+\lambda_{\beta}\mathbf{I}_{d} )^{-1}\mathbf{l}_{d}}\mathbf{l}_{d}) \end{align}$$

(12)

$$\begin{align}& \boldsymbol{\beta}_{t} = (\mathbf{G}_{t}+\lambda_{\beta}\mathbf{I}_{t} )^{-1} \nonumber\\ & \ \ (\mathbf{z}_{t} - \frac{(\mathbf{l}_{t}^{T}(\mathbf{G}_{t}+\lambda_{\beta}\mathbf{I}_{t} )^{-1}\mathbf{z}_{t}-1)}{\mathbf{l}_{t}^{T}(\mathbf{G}_{t}+\lambda_{\beta}\mathbf{I}_{t} )^{-1}\mathbf{l}_{t}}\mathbf{l}_{t}) \end{align}$$

(13)

where |$\mathbf{I}_{d} \in \mathbf{R}^{k_{d} \times k_{d}}$| and |$\mathbf{I}_{t} \in \mathbf{R}^{k_{t} \times k_{t}}$| are also identity matrices. |$\mathbf{l}_{d}=(1,...,1)^{T} \in \mathbf{R}^{k_{d} \times 1}$| and |$\mathbf{l}_{t}=(1,...,1)^{T} \in \mathbf{R}^{k_{t} \times 1}$| are vectors with elements of |$1$|⁠. And |$\mathbf{G}_{d} \in \mathbf{R}^{k_{d} \times k_{d}}$|⁠, |$\mathbf{G}_{t} \in \mathbf{R}^{k_{t} \times k_{t}}$|⁠, |$\mathbf{z}_{d} \in \mathbf{R}^{k_{d} \times 1}$| and |$\mathbf{z}_{t} \in \mathbf{R}^{k_{t} \times 1}$| can be calculated as follows:

$$\begin{align} \mathbf{G}_{d}(i,j) &= Trace(\mathbf{K}_{d}^{i}(\mathbf{K}_{d}^{j})^{T}), \end{align}$$

(14a)

$$\begin{align} \mathbf{z}_{d}(i) &= Trace(\mathbf{A}^{T}\mathbf{K}_{d}^{i}\mathbf{A}), \ \ i,j = 1,...,k_{d} \end{align}$$

(14b)

$$\begin{align} \mathbf{G}_{t}(i,j) &= Trace(\mathbf{K}_{t}^{i}(\mathbf{K}_{t}^{j})^{T}), \end{align}$$

(15a)

$$\begin{align} \mathbf{z}_{t}(i) &= Trace(\mathbf{B}^{T}\mathbf{K}_{t}^{i}\mathbf{B}), \ \ i,j = 1,...,k_{t} \end{align}$$

(15b)

The final predictive values can be constructed by

$$\begin{align}& \mathbf{Y}^{*} = \mathbf{A}\boldsymbol{\Theta}\mathbf{B}^{T}, \end{align}$$

(16)

where |$\mathbf{Y}^{*} \in \mathbf{R}^{n \times m}$|⁠, and the highly ranked pair score |$\mathbf{Y}^{*}(i,j)$| for drug |$i$| and target |$j$|⁠, the higher probability of interaction between them. In our study, the initialization of |$\mathbf{A}$| and |$\mathbf{B}$| are calculated as follows:

$$\begin{align} \mathbf{A} &= \mathbf{U}_{d}\mathbf{S}_{d,r_{d}}^{1/2} \end{align}$$

(17a)

$$\begin{align} \mathbf{K}_{d}^{*} &= \mathbf{U}_{d}\mathbf{S}_{d,r_{d}}\mathbf{V}_{d}^{T} \end{align}$$

(17b)

$$\begin{align} \mathbf{U}_{d} & \in \mathbf{R}^{n \times r_{d}}, \ \ \mathbf{S}_{d,r_{d}} \in \mathbf{R}^{r_{d} \times r_{d}}, \ \ \mathbf{V}_{d} \in \mathbf{R}^{n \times r_{d}} \end{align}$$

(17c)

$$\begin{align} \mathbf{B} &= \mathbf{U}_{t}\mathbf{S}_{t,r_{t}}^{1/2} \end{align}$$

(18a)

$$\begin{align} \mathbf{K}_{t}^{*} &= \mathbf{U}_{t}\mathbf{S}_{t,r_{t}}\mathbf{V}_{t}^{T} \end{align}$$

(18b)

$$\begin{align} \mathbf{U}_{t} & \in \mathbf{R}^{m \times r_{t}}, \ \ \mathbf{S}_{t,r_{t}} \in \mathbf{R}^{r_{t} \times r_{t}}, \ \ \mathbf{V}_{t} \in \mathbf{R}^{m \times r_{t}} \end{align}$$

(18c)

where |$\mathbf{K}_{d}^{*}$| and |$\mathbf{K}_{t}^{*}$| can be decomposed by |$\mathbf{U}_{d} \in \mathbf{R}^{n \times r_{d}}, \ \mathbf{S}_{d,r_{d}} \in \mathbf{R}^{r_{d} \times r_{d}}, \ \mathbf{V}_{d} \in \mathbf{R}^{n \times r_{d}}$| and |$\mathbf{U}_{t} \in \mathbf{R}^{m \times r_{t}}, \ \mathbf{S}_{t,r_{t}} \in \mathbf{R}^{r_{t} \times r_{t}}, \ \mathbf{V}_{t} \in \mathbf{R}^{m \times r_{t}}$|⁠, respectively. |$\mathbf{S}_{d,r_{d}}$| and |$\mathbf{S}_{t,r_{t}}$| are two diagonal matrices containing the |$r_{d}$| and |$r_{t}$| largest singular values.

The process of MK-TCMF is shown in Figure 2 and Algorithm 1.

3 Results

3.1 Cross-validation test set

In our study, |$10$|-fold Cross Validation (CV) is utilized to test our model. The evaluation method of predictive performance is Area Under the Precision-Recall curve (AUPR). To fully test the predictive performance of the model, the following three types of cross-validation test sets (CVS) [29] will be used: CVS1: Identifying potential DTIs from known network. Targets and drugs both exist in testing and training set; CVS2: Identifying DTIs for novel drugs. The new drugs are not present in training set; CVS3: Identifying DTIs of new targets. The new targets do not exist in training set.

3.2 The parameters of models

We obtain the best model parameters via grid search. The parameters are listed in Table 3. The range of |$\lambda _{\Theta }$|⁠, |$\lambda _{l}$|⁠, |$\lambda _{d}$|⁠, |$\lambda _{t}$|⁠, |$\lambda _{\beta }$| is from |$0$| to |$1$| with step of |$0.1$|⁠. |$r_{d}$| and |$r_{t}$| are smaller than the size of the corresponding kernel matrix, respectively.

In addition, the results of iterations are shown in Figure 3. After several rounds of iterative optimization, the model tends to be stable. At the end, we chose different maximum iterations for different data sets. The numbers of maximum iterations are |$5$|⁠, |$2$|⁠, |$2$| and |$5$| for Es, IC, GPCRs and NRs, respectively.

Table 3

The parameters of models.

Data set	\|$\lambda _{\Theta }$\|	\|$\lambda _{l}$\|	\|$\lambda _{d}$\|	\|$\lambda _{t}$\|	\|$\lambda _{\beta }$\|	\|$r_{d}$\|	\|$r_{t}$\|	\|$iter$\|
Es	1	1	1	1	1	300	300	5
IC	1	1	1	1	1	200	200	2
GPCRs	1	1	1	1	1	200	80	2
NRs	0.5	0.5	0.5	0.5	0.5	53	21	5

Data set	\|$\lambda _{\Theta }$\|	\|$\lambda _{l}$\|	\|$\lambda _{d}$\|	\|$\lambda _{t}$\|	\|$\lambda _{\beta }$\|	\|$r_{d}$\|	\|$r_{t}$\|	\|$iter$\|
Es	1	1	1	1	1	300	300	5
IC	1	1	1	1	1	200	200	2
GPCRs	1	1	1	1	1	200	80	2
NRs	0.5	0.5	0.5	0.5	0.5	53	21	5

Table 3

The parameters of models.

Data set	\|$\lambda _{\Theta }$\|	\|$\lambda _{l}$\|	\|$\lambda _{d}$\|	\|$\lambda _{t}$\|	\|$\lambda _{\beta }$\|	\|$r_{d}$\|	\|$r_{t}$\|	\|$iter$\|
Es	1	1	1	1	1	300	300	5
IC	1	1	1	1	1	200	200	2
GPCRs	1	1	1	1	1	200	80	2
NRs	0.5	0.5	0.5	0.5	0.5	53	21	5

Data set	\|$\lambda _{\Theta }$\|	\|$\lambda _{l}$\|	\|$\lambda _{d}$\|	\|$\lambda _{t}$\|	\|$\lambda _{\beta }$\|	\|$r_{d}$\|	\|$r_{t}$\|	\|$iter$\|
Es	1	1	1	1	1	300	300	5
IC	1	1	1	1	1	200	200	2
GPCRs	1	1	1	1	1	200	80	2
NRs	0.5	0.5	0.5	0.5	0.5	53	21	5

Figure 3

The performance (AUPR) with different numbers of iterations.

3.3 Performance analysis

To evaluate the performance of the MKL algorithm, we test the MK-TCMF and the MK-TCMF (mean weighted) method on four data sets. The results are listed in Table 4 and Figure 4. The AUPRs of MK-TCMF are |$0.912$|⁠, |$0.933$|⁠, |$0.752$| and |$0.552$|⁠, respectively. They are all better than method of MK-TCMF (mean weighted). The MKL algorithm can regulate the weight of each kernel matrix for an optimal combination.

Table 4

The AUPR of different weighted methods under CV1.

Method	Es	IC	GPCRs	NRs
MK-TCMF	0.912	0.933	0.752	0.552
MK-TCMF (mean weighted)	0.899	0.923	0.670	0.474

Table 4

The AUPR of different weighted methods under CV1.

Method	Es	IC	GPCRs	NRs
MK-TCMF	0.912	0.933	0.752	0.552
MK-TCMF (mean weighted)	0.899	0.923	0.670	0.474

Figure 4

The AUPR of different weighted methods under CVS1.

Different kernel matrices often have different contribution of prediction model, so the MKL algorithm assigns different weights to kernels. In general, the higher the value of weight, the greater the contribution for the model. In Figure 5, |$\mathbf{K}_{SE,d}$| (drug), |$\mathbf{K}_{GIP,d}$| (drug), |$\mathbf{K}_{SW,t}$| (target) and |$\mathbf{K}_{GIP,t}$| (target) have higher weights in drug and target feature spaces. |$\mathbf{K}_{GIP,d}$| (drug) and |$\mathbf{K}_{GIP,t}$| (target) contain the known association information between the drug and the target; it provides a priori information in the prediction process.

Figure 5

The weights on four sets.

To further verify the effectiveness of our method, other multi-kernel learning methods are also used to fuse the kernel matrix. They are Heuristically Kernel Alignment-based MKL (HKA-MKL) [17], fast kernel learning-based MKL (FKL-MKL) [42] and centered kernel alignment-based MKL (CKA-MKL) [43]. The results of different MKL model are listed in Table 5. MK-TCMF obtains the best performance on four data sets. On Es, IC, GPCRs and NRs, the AUPRs are |$0.912$|⁠, |$0.933$|⁠, |$0.752$| and |$0.552$|⁠, respectively. The TCMF + CKA-MKL also has good performance on Es (⁠|$0.900$|⁠), IC (⁠|$0.927$|⁠) and GPCRs(⁠|$0.724$|⁠).

3.4 Comparison with existing predictors

In this section, NetLapRLS [14], WNN-GIP [19], CMF [27], NRLMF [29], BLM-NII [22], GRMF [30], VB-MK-LMF and KBMF2K [28] are compared with our model (MK-TCMF). CVS1, CVS2 and CVS3 are utilized to verify these predictive models. The results of comparisons (AUPR) are list in Tables 6, 7 and 8. The results of VB-MK-LMF are taken from the work [32] of Bence. And the result of GRMF were incomplete (missing CVS1). So, we employ the source code and parameter settings, which were obtained from the article of GRMF, to get relevant results. In CVS1 (Table 6), our method (MK-TCMF) achieves best AUPR (IC: |$0.933$|⁠, Es: |$0.912$|⁠) on data sets of IC and Es, respectively. And second best AUPR (⁠|$0.752$|⁠) is achieved by MK-TCMF on GPCRs data set. GRMF (⁠|$0.923$|⁠) and CMF (⁠|$0.923$|⁠) have second best AUPR on IC data set. The best AUPRs of GPCRs (⁠|$0.777$|⁠) and NRs (⁠|$0.773$|⁠) are achieved by VB-MK-LMF. It can be seen that the MF-based methods have better predictive performance.

Table 5

AUPRs of different MKL methods (withe TCMF) under CVS1.

Method	Es	IC	GPCRs	NRs
MK-TCMF	0.912	0.933	0.752	0.552
TCMF + FKL-MKL	0.884	0.920	0.703	0.503
TCMF + TKA-MKL	0.899	0.918	0.716	0.540
TCMF + CKA-MKL	0.900	0.927	0.724	0.538

Method	Es	IC	GPCRs	NRs
MK-TCMF	0.912	0.933	0.752	0.552
TCMF + FKL-MKL	0.884	0.920	0.703	0.503
TCMF + TKA-MKL	0.899	0.918	0.716	0.540
TCMF + CKA-MKL	0.900	0.927	0.724	0.538

Table 5

AUPRs of different MKL methods (withe TCMF) under CVS1.

Method	Es	IC	GPCRs	NRs
MK-TCMF	0.912	0.933	0.752	0.552
TCMF + FKL-MKL	0.884	0.920	0.703	0.503
TCMF + TKA-MKL	0.899	0.918	0.716	0.540
TCMF + CKA-MKL	0.900	0.927	0.724	0.538

Method	Es	IC	GPCRs	NRs
MK-TCMF	0.912	0.933	0.752	0.552
TCMF + FKL-MKL	0.884	0.920	0.703	0.503
TCMF + TKA-MKL	0.899	0.918	0.716	0.540
TCMF + CKA-MKL	0.900	0.927	0.724	0.538

Table 6

Comparison results with existing models in CVS1.

Model	Es	IC	GPCRs	NRs
Our method	0.912\|$\pm $\|0.018	0.933\|$\pm $\|0.017	0.752\|$\pm $\|0.040	0.552\|$\pm $\|0.099
GRMF	0.878\|$\pm $\|0.002	0.923\|$\pm $\|0.002	0.737\|$\pm $\|0.002	0.602\|$\pm $\|0.038
VB-MK-LMF\|$^{1}$\|	0.890\|$\pm $\|0.006	0.916\|$\pm $\|0.007	0.777\|$\pm $\|0.016	0.773\|$\pm $\|0.030
NRLMF\|$^{2}$\|	0.892\|$\pm $\|0.006	0.906\|$\pm $\|0.008	0.749\|$\pm $\|0.015	0.728\|$\pm $\|0.041
CMF\|$^{2}$\|	0.877\|$\pm $\|0.005	0.923\|$\pm $\|0.006	0.745\|$\pm $\|0.013	0.584\|$\pm $\|0.042
KBMF2K\|$^{2}$\|	0.654\|$\pm $\|0.008	0.771\|$\pm $\|0.009	0.578\|$\pm $\|0.018	0.534\|$\pm $\|0.050
WNN-GIP\|$^{2}$\|	0.706\|$\pm $\|0.017	0.717\|$\pm $\|0.020	0.520\|$\pm $\|0.021	0.589\|$\pm $\|0.034
BLM-NII\|$^{2}$\|	0.752\|$\pm $\|0.011	0.821\|$\pm $\|0.012	0.524\|$\pm $\|0.024	0.659\|$\pm $\|0.039
NetLapRLS\|$^{2}$\|	0.789\|$\pm $\|0.005	0.837\|$\pm $\|0.009	0.616\|$\pm $\|0.015	0.465\|$\pm $\|0.044

Model	Es	IC	GPCRs	NRs
Our method	0.912\|$\pm $\|0.018	0.933\|$\pm $\|0.017	0.752\|$\pm $\|0.040	0.552\|$\pm $\|0.099
GRMF	0.878\|$\pm $\|0.002	0.923\|$\pm $\|0.002	0.737\|$\pm $\|0.002	0.602\|$\pm $\|0.038
VB-MK-LMF\|$^{1}$\|	0.890\|$\pm $\|0.006	0.916\|$\pm $\|0.007	0.777\|$\pm $\|0.016	0.773\|$\pm $\|0.030
NRLMF\|$^{2}$\|	0.892\|$\pm $\|0.006	0.906\|$\pm $\|0.008	0.749\|$\pm $\|0.015	0.728\|$\pm $\|0.041
CMF\|$^{2}$\|	0.877\|$\pm $\|0.005	0.923\|$\pm $\|0.006	0.745\|$\pm $\|0.013	0.584\|$\pm $\|0.042
KBMF2K\|$^{2}$\|	0.654\|$\pm $\|0.008	0.771\|$\pm $\|0.009	0.578\|$\pm $\|0.018	0.534\|$\pm $\|0.050
WNN-GIP\|$^{2}$\|	0.706\|$\pm $\|0.017	0.717\|$\pm $\|0.020	0.520\|$\pm $\|0.021	0.589\|$\pm $\|0.034
BLM-NII\|$^{2}$\|	0.752\|$\pm $\|0.011	0.821\|$\pm $\|0.012	0.524\|$\pm $\|0.024	0.659\|$\pm $\|0.039
NetLapRLS\|$^{2}$\|	0.789\|$\pm $\|0.005	0.837\|$\pm $\|0.009	0.616\|$\pm $\|0.015	0.465\|$\pm $\|0.044

¹Results are from [32].

²Results are from [29]. The bold faces and underlined are best and second best results in each column.

Table 6

Comparison results with existing models in CVS1.

Model	Es	IC	GPCRs	NRs
Our method	0.912\|$\pm $\|0.018	0.933\|$\pm $\|0.017	0.752\|$\pm $\|0.040	0.552\|$\pm $\|0.099
GRMF	0.878\|$\pm $\|0.002	0.923\|$\pm $\|0.002	0.737\|$\pm $\|0.002	0.602\|$\pm $\|0.038
VB-MK-LMF\|$^{1}$\|	0.890\|$\pm $\|0.006	0.916\|$\pm $\|0.007	0.777\|$\pm $\|0.016	0.773\|$\pm $\|0.030
NRLMF\|$^{2}$\|	0.892\|$\pm $\|0.006	0.906\|$\pm $\|0.008	0.749\|$\pm $\|0.015	0.728\|$\pm $\|0.041
CMF\|$^{2}$\|	0.877\|$\pm $\|0.005	0.923\|$\pm $\|0.006	0.745\|$\pm $\|0.013	0.584\|$\pm $\|0.042
KBMF2K\|$^{2}$\|	0.654\|$\pm $\|0.008	0.771\|$\pm $\|0.009	0.578\|$\pm $\|0.018	0.534\|$\pm $\|0.050
WNN-GIP\|$^{2}$\|	0.706\|$\pm $\|0.017	0.717\|$\pm $\|0.020	0.520\|$\pm $\|0.021	0.589\|$\pm $\|0.034
BLM-NII\|$^{2}$\|	0.752\|$\pm $\|0.011	0.821\|$\pm $\|0.012	0.524\|$\pm $\|0.024	0.659\|$\pm $\|0.039
NetLapRLS\|$^{2}$\|	0.789\|$\pm $\|0.005	0.837\|$\pm $\|0.009	0.616\|$\pm $\|0.015	0.465\|$\pm $\|0.044

Model	Es	IC	GPCRs	NRs
Our method	0.912\|$\pm $\|0.018	0.933\|$\pm $\|0.017	0.752\|$\pm $\|0.040	0.552\|$\pm $\|0.099
GRMF	0.878\|$\pm $\|0.002	0.923\|$\pm $\|0.002	0.737\|$\pm $\|0.002	0.602\|$\pm $\|0.038
VB-MK-LMF\|$^{1}$\|	0.890\|$\pm $\|0.006	0.916\|$\pm $\|0.007	0.777\|$\pm $\|0.016	0.773\|$\pm $\|0.030
NRLMF\|$^{2}$\|	0.892\|$\pm $\|0.006	0.906\|$\pm $\|0.008	0.749\|$\pm $\|0.015	0.728\|$\pm $\|0.041
CMF\|$^{2}$\|	0.877\|$\pm $\|0.005	0.923\|$\pm $\|0.006	0.745\|$\pm $\|0.013	0.584\|$\pm $\|0.042
KBMF2K\|$^{2}$\|	0.654\|$\pm $\|0.008	0.771\|$\pm $\|0.009	0.578\|$\pm $\|0.018	0.534\|$\pm $\|0.050
WNN-GIP\|$^{2}$\|	0.706\|$\pm $\|0.017	0.717\|$\pm $\|0.020	0.520\|$\pm $\|0.021	0.589\|$\pm $\|0.034
BLM-NII\|$^{2}$\|	0.752\|$\pm $\|0.011	0.821\|$\pm $\|0.012	0.524\|$\pm $\|0.024	0.659\|$\pm $\|0.039
NetLapRLS\|$^{2}$\|	0.789\|$\pm $\|0.005	0.837\|$\pm $\|0.009	0.616\|$\pm $\|0.015	0.465\|$\pm $\|0.044

¹Results are from [32].

²Results are from [29]. The bold faces and underlined are best and second best results in each column.

Table 7

Comparison results with existing models in CVS2.

Model	Es	IC	GPCRs	NRs
Our method	0.407\|$\pm $\|0.050	0.426\|$\pm $\|0.086	0.412\|$\pm $\|0.071	0.386\|$\pm $\|0.098
GRMF	0.390\|$\pm $\|0.010	0.356\|$\pm $\|0.014	0.404\|$\pm $\|0.011	0.542\|$\pm $\|0.028
VB-MK-LMF\|$^{1}$\|	0.349\|$\pm $\|0.042	0.345\|$\pm $\|0.035	0.368\|$\pm $\|0.023	0.593\|$\pm $\|0.058
NRLMF\|$^{2}$\|	0.358\|$\pm $\|0.040	0.344\|$\pm $\|0.033	0.364\|$\pm $\|0.023	0.545\|$\pm $\|0.054
CMF\|$^{2}$\|	0.229 \|$\pm $\|0.020	0.286\|$\pm $\|0.030	0.365\|$\pm $\|0.022	0.488\|$\pm $\|0.050
KBMF2K\|$^{2}$\|	0.263\|$\pm $\|0.033	0.308\|$\pm $\|0.038	0.366\|$\pm $\|0.024	0.477\|$\pm $\|0.049
WNN-GIP\|$^{2}$\|	0.278\|$\pm $\|0.037	0.258\|$\pm $\|0.032	0.295\|$\pm $\|0.025	0.504\|$\pm $\|0.056
BLM-NII\|$^{2}$\|	0.253\|$\pm $\|0.023	0.302\|$\pm $\|0.033	0.315\|$\pm $\|0.022	0.438\|$\pm $\|0.048
NetLapRLS\|$^{2}$\|	0.123\|$\pm $\|0.009	0.200\|$\pm $\|0.026	0.229\|$\pm $\|0.017	0.417\|$\pm $\|0.048

Model	Es	IC	GPCRs	NRs
Our method	0.407\|$\pm $\|0.050	0.426\|$\pm $\|0.086	0.412\|$\pm $\|0.071	0.386\|$\pm $\|0.098
GRMF	0.390\|$\pm $\|0.010	0.356\|$\pm $\|0.014	0.404\|$\pm $\|0.011	0.542\|$\pm $\|0.028
VB-MK-LMF\|$^{1}$\|	0.349\|$\pm $\|0.042	0.345\|$\pm $\|0.035	0.368\|$\pm $\|0.023	0.593\|$\pm $\|0.058
NRLMF\|$^{2}$\|	0.358\|$\pm $\|0.040	0.344\|$\pm $\|0.033	0.364\|$\pm $\|0.023	0.545\|$\pm $\|0.054
CMF\|$^{2}$\|	0.229 \|$\pm $\|0.020	0.286\|$\pm $\|0.030	0.365\|$\pm $\|0.022	0.488\|$\pm $\|0.050
KBMF2K\|$^{2}$\|	0.263\|$\pm $\|0.033	0.308\|$\pm $\|0.038	0.366\|$\pm $\|0.024	0.477\|$\pm $\|0.049
WNN-GIP\|$^{2}$\|	0.278\|$\pm $\|0.037	0.258\|$\pm $\|0.032	0.295\|$\pm $\|0.025	0.504\|$\pm $\|0.056
BLM-NII\|$^{2}$\|	0.253\|$\pm $\|0.023	0.302\|$\pm $\|0.033	0.315\|$\pm $\|0.022	0.438\|$\pm $\|0.048
NetLapRLS\|$^{2}$\|	0.123\|$\pm $\|0.009	0.200\|$\pm $\|0.026	0.229\|$\pm $\|0.017	0.417\|$\pm $\|0.048

|$^{1}$|Results are from [32].

|$^{2}$|Results are from [29]. The bold faces and underlined are best and second best results in each column.

Table 7

Comparison results with existing models in CVS2.

Model	Es	IC	GPCRs	NRs
Our method	0.407\|$\pm $\|0.050	0.426\|$\pm $\|0.086	0.412\|$\pm $\|0.071	0.386\|$\pm $\|0.098
GRMF	0.390\|$\pm $\|0.010	0.356\|$\pm $\|0.014	0.404\|$\pm $\|0.011	0.542\|$\pm $\|0.028
VB-MK-LMF\|$^{1}$\|	0.349\|$\pm $\|0.042	0.345\|$\pm $\|0.035	0.368\|$\pm $\|0.023	0.593\|$\pm $\|0.058
NRLMF\|$^{2}$\|	0.358\|$\pm $\|0.040	0.344\|$\pm $\|0.033	0.364\|$\pm $\|0.023	0.545\|$\pm $\|0.054
CMF\|$^{2}$\|	0.229 \|$\pm $\|0.020	0.286\|$\pm $\|0.030	0.365\|$\pm $\|0.022	0.488\|$\pm $\|0.050
KBMF2K\|$^{2}$\|	0.263\|$\pm $\|0.033	0.308\|$\pm $\|0.038	0.366\|$\pm $\|0.024	0.477\|$\pm $\|0.049
WNN-GIP\|$^{2}$\|	0.278\|$\pm $\|0.037	0.258\|$\pm $\|0.032	0.295\|$\pm $\|0.025	0.504\|$\pm $\|0.056
BLM-NII\|$^{2}$\|	0.253\|$\pm $\|0.023	0.302\|$\pm $\|0.033	0.315\|$\pm $\|0.022	0.438\|$\pm $\|0.048
NetLapRLS\|$^{2}$\|	0.123\|$\pm $\|0.009	0.200\|$\pm $\|0.026	0.229\|$\pm $\|0.017	0.417\|$\pm $\|0.048

Model	Es	IC	GPCRs	NRs
Our method	0.407\|$\pm $\|0.050	0.426\|$\pm $\|0.086	0.412\|$\pm $\|0.071	0.386\|$\pm $\|0.098
GRMF	0.390\|$\pm $\|0.010	0.356\|$\pm $\|0.014	0.404\|$\pm $\|0.011	0.542\|$\pm $\|0.028
VB-MK-LMF\|$^{1}$\|	0.349\|$\pm $\|0.042	0.345\|$\pm $\|0.035	0.368\|$\pm $\|0.023	0.593\|$\pm $\|0.058
NRLMF\|$^{2}$\|	0.358\|$\pm $\|0.040	0.344\|$\pm $\|0.033	0.364\|$\pm $\|0.023	0.545\|$\pm $\|0.054
CMF\|$^{2}$\|	0.229 \|$\pm $\|0.020	0.286\|$\pm $\|0.030	0.365\|$\pm $\|0.022	0.488\|$\pm $\|0.050
KBMF2K\|$^{2}$\|	0.263\|$\pm $\|0.033	0.308\|$\pm $\|0.038	0.366\|$\pm $\|0.024	0.477\|$\pm $\|0.049
WNN-GIP\|$^{2}$\|	0.278\|$\pm $\|0.037	0.258\|$\pm $\|0.032	0.295\|$\pm $\|0.025	0.504\|$\pm $\|0.056
BLM-NII\|$^{2}$\|	0.253\|$\pm $\|0.023	0.302\|$\pm $\|0.033	0.315\|$\pm $\|0.022	0.438\|$\pm $\|0.048
NetLapRLS\|$^{2}$\|	0.123\|$\pm $\|0.009	0.200\|$\pm $\|0.026	0.229\|$\pm $\|0.017	0.417\|$\pm $\|0.048

|$^{1}$|Results are from [32].

|$^{2}$|Results are from [29]. The bold faces and underlined are best and second best results in each column.

Table 8

Comparison results with existing models in CVS3.

Model	Es	IC	GPCRs	NRs
Our method	0.831\|$\pm $\|0.044	0.826\|$\pm $\|0.079	0.583\|$\pm $\|0.090	0.384\|$\pm $\|0.097
GRMF	0.807\|$\pm $\|0.016	0.815\|$\pm $\|0.010	0.615\|$\pm $\|0.023	0.500\|$\pm $\|0.028
VB-MK-LMF\|$^{1}$\|	0.794\|$\pm $\|0.017	0.826\|$\pm $\|0.021	0.596\|$\pm $\|0.040	0.601\|$\pm $\|0.081
NRLMF\|$^{2}$\|	0.812\|$\pm $\|0.018	0.785\|$\pm $\|0.028	0.556\|$\pm $\|0.038	0.449\|$\pm $\|0.079
CMF\|$^{2}$\|	0.698\|$\pm $\|0.021	0.620\|$\pm $\|0.027	0.433\|$\pm $\|0.028	0.400\|$\pm $\|0.077
KBMF2K\|$^{2}$\|	0.565\|$\pm $\|0.023	0.677\|$\pm $\|0.021	0.516\|$\pm $\|0.045	0.324\|$\pm $\|0.071
WNN-GIP\|$^{2}$\|	0.566\|$\pm $\|0.038	0.696\|$\pm $\|0.035	0.550\|$\pm $\|0.047	0.531\|$\pm $\|0.073
BLM-NII\|$^{2}$\|	0.735\|$\pm $\|0.022	0.762\|$\pm $\|0.020	0.341\|$\pm $\|0.034	0.402\|$\pm $\|0.083
NetLapRLS\|$^{2}$\|	0.669\|$\pm $\|0.021	0.737\|$\pm $\|0.020	0.334\|$\pm $\|0.025	0.449\|$\pm $\|0.074

Model	Es	IC	GPCRs	NRs
Our method	0.831\|$\pm $\|0.044	0.826\|$\pm $\|0.079	0.583\|$\pm $\|0.090	0.384\|$\pm $\|0.097
GRMF	0.807\|$\pm $\|0.016	0.815\|$\pm $\|0.010	0.615\|$\pm $\|0.023	0.500\|$\pm $\|0.028
VB-MK-LMF\|$^{1}$\|	0.794\|$\pm $\|0.017	0.826\|$\pm $\|0.021	0.596\|$\pm $\|0.040	0.601\|$\pm $\|0.081
NRLMF\|$^{2}$\|	0.812\|$\pm $\|0.018	0.785\|$\pm $\|0.028	0.556\|$\pm $\|0.038	0.449\|$\pm $\|0.079
CMF\|$^{2}$\|	0.698\|$\pm $\|0.021	0.620\|$\pm $\|0.027	0.433\|$\pm $\|0.028	0.400\|$\pm $\|0.077
KBMF2K\|$^{2}$\|	0.565\|$\pm $\|0.023	0.677\|$\pm $\|0.021	0.516\|$\pm $\|0.045	0.324\|$\pm $\|0.071
WNN-GIP\|$^{2}$\|	0.566\|$\pm $\|0.038	0.696\|$\pm $\|0.035	0.550\|$\pm $\|0.047	0.531\|$\pm $\|0.073
BLM-NII\|$^{2}$\|	0.735\|$\pm $\|0.022	0.762\|$\pm $\|0.020	0.341\|$\pm $\|0.034	0.402\|$\pm $\|0.083
NetLapRLS\|$^{2}$\|	0.669\|$\pm $\|0.021	0.737\|$\pm $\|0.020	0.334\|$\pm $\|0.025	0.449\|$\pm $\|0.074

|$^{1}$|Results are from [32].

|$^{2}$| Results are from [29]. The bold faces and underlined are best and second best results in each column.

Table 8

Comparison results with existing models in CVS3.

Model	Es	IC	GPCRs	NRs
Our method	0.831\|$\pm $\|0.044	0.826\|$\pm $\|0.079	0.583\|$\pm $\|0.090	0.384\|$\pm $\|0.097
GRMF	0.807\|$\pm $\|0.016	0.815\|$\pm $\|0.010	0.615\|$\pm $\|0.023	0.500\|$\pm $\|0.028
VB-MK-LMF\|$^{1}$\|	0.794\|$\pm $\|0.017	0.826\|$\pm $\|0.021	0.596\|$\pm $\|0.040	0.601\|$\pm $\|0.081
NRLMF\|$^{2}$\|	0.812\|$\pm $\|0.018	0.785\|$\pm $\|0.028	0.556\|$\pm $\|0.038	0.449\|$\pm $\|0.079
CMF\|$^{2}$\|	0.698\|$\pm $\|0.021	0.620\|$\pm $\|0.027	0.433\|$\pm $\|0.028	0.400\|$\pm $\|0.077
KBMF2K\|$^{2}$\|	0.565\|$\pm $\|0.023	0.677\|$\pm $\|0.021	0.516\|$\pm $\|0.045	0.324\|$\pm $\|0.071
WNN-GIP\|$^{2}$\|	0.566\|$\pm $\|0.038	0.696\|$\pm $\|0.035	0.550\|$\pm $\|0.047	0.531\|$\pm $\|0.073
BLM-NII\|$^{2}$\|	0.735\|$\pm $\|0.022	0.762\|$\pm $\|0.020	0.341\|$\pm $\|0.034	0.402\|$\pm $\|0.083
NetLapRLS\|$^{2}$\|	0.669\|$\pm $\|0.021	0.737\|$\pm $\|0.020	0.334\|$\pm $\|0.025	0.449\|$\pm $\|0.074

Model	Es	IC	GPCRs	NRs
Our method	0.831\|$\pm $\|0.044	0.826\|$\pm $\|0.079	0.583\|$\pm $\|0.090	0.384\|$\pm $\|0.097
GRMF	0.807\|$\pm $\|0.016	0.815\|$\pm $\|0.010	0.615\|$\pm $\|0.023	0.500\|$\pm $\|0.028
VB-MK-LMF\|$^{1}$\|	0.794\|$\pm $\|0.017	0.826\|$\pm $\|0.021	0.596\|$\pm $\|0.040	0.601\|$\pm $\|0.081
NRLMF\|$^{2}$\|	0.812\|$\pm $\|0.018	0.785\|$\pm $\|0.028	0.556\|$\pm $\|0.038	0.449\|$\pm $\|0.079
CMF\|$^{2}$\|	0.698\|$\pm $\|0.021	0.620\|$\pm $\|0.027	0.433\|$\pm $\|0.028	0.400\|$\pm $\|0.077
KBMF2K\|$^{2}$\|	0.565\|$\pm $\|0.023	0.677\|$\pm $\|0.021	0.516\|$\pm $\|0.045	0.324\|$\pm $\|0.071
WNN-GIP\|$^{2}$\|	0.566\|$\pm $\|0.038	0.696\|$\pm $\|0.035	0.550\|$\pm $\|0.047	0.531\|$\pm $\|0.073
BLM-NII\|$^{2}$\|	0.735\|$\pm $\|0.022	0.762\|$\pm $\|0.020	0.341\|$\pm $\|0.034	0.402\|$\pm $\|0.083
NetLapRLS\|$^{2}$\|	0.669\|$\pm $\|0.021	0.737\|$\pm $\|0.020	0.334\|$\pm $\|0.025	0.449\|$\pm $\|0.074

|$^{1}$|Results are from [32].

|$^{2}$| Results are from [29]. The bold faces and underlined are best and second best results in each column.

Actually, the interactions for novel targets or novel drugs do not exist in database. The CVS2 and CVS3 are used to test the performance of predicting new drugs or targets, which do not exist in training set. The AUPRs are listed in Tables 7 and 8. Under CVS2, our model (MK-TCMF) has the best AUPRs on GPCRs (⁠|$0.412$|⁠), IC (⁠|$0.426$|⁠) and Es (⁠|$0.407$|⁠). Compared with the second best results, |$0.017$|⁠, |$0.070$| and |$0.008$| are improved. Under CV3, MK-TCMF achieves best AUPRs on Es (⁠|$0.831$|⁠) and IC (⁠|$0.826$|⁠). But the results of MK-TCMF are not outstanding under CVS1 (⁠|$0.552$|⁠), CVS2 (⁠|$0.386$|⁠) and CVS3 (⁠|$0.384$|⁠) on NRs data set. From these results (CVS1, CVS2 and CVS3), we can find that MK-TCMF is still comparable to other methods. For NRs data set, the main reason is the size of data set, which is excessively small. The NRs only contain 26 targets and 54 drugs. The model is not only prone to overfitting on the training data, but also overfitting on the validation set, which ultimately reduces the stability of the model. Outliers may appear in features or in response variables. The cost of processing these outliers is high. In future work, we will deal with the outliers to improve the generalization performance of the model on small data sets.

3.5 Comparison on DTINet data set

We also test MK-TCMF on data set of DTINet [33], which employed |$1512$| targets and |$708$| drugs. The comparison methods include DTINet [33], GRMF [30], VB-MK-LMF [32], NRLMF [29], CMF [27], GRGMF [31], KronRLS-MKL [17], BLM-NII [22], NetLapRLS [14], GCN-DTI [26] and DTI-CNN [24]. The input of these models are same, including known drug–protein, protein–disease, protein–protein, drug–disease, drug-side effect, drug–drug associations. Above models are all performed under the same random seed. Except for DTINet, VB-MK-LMF, KronRLS-MKL and MK-TCMF, the remaining methods fuse heterogeneous information by average weighting. The values of AUPR and area under curve (AUC) are shown in Table 9. Our model achieves best AUPR and AUC of |$0.949$| and |$0.936$|⁠. DTI-CNN has best AUC of |$0.936$|⁠. The AUPR (⁠|$0.940$|⁠) and AUC (⁠|$0.932$|⁠) values of GRGMF are second best results.

Table 9

Comparison on DTINet data set (with same random seed).

Method	AUC	AUPR
Our method	0.936\|$\pm $\|0.020	0.949\|$\pm $\|0.021
DTINet	0.922\|$\pm $\|0.019	0.931\|$\pm $\|0.021
GRMF	0.877\|$\pm $\|0.025	0.913\|$\pm $\|0.016
VB-MK-LMF	0.921\|$\pm $\|0.019	0.937\|$\pm $\|0.017
NRLMF	0.905\|$\pm $\|0.023	0.918\|$\pm $\|0.018
CMF	0.895\|$\pm $\|0.030	0.924\|$\pm $\|0.018
GRGMF	0.932\|$\pm $\|0.019	0.940\|$\pm $\|0.022
KronRLS-MKL	0.919\|$\pm $\|0.027	0.938\|$\pm $\|0.023
BLM-NII	0.894\|$\pm $\|0.021	0.910\|$\pm $\|0.014
NetLapRLS	0.904\|$\pm $\|0.019	0.913\|$\pm $\|0.015
GCN-DTI	0.929\|$\pm $\|0.021	0.936\|$\pm $\|0.024
DTI-CNN	0.936\|$\pm $\|0.022	0.939\|$\pm $\|0.016

Method	AUC	AUPR
Our method	0.936\|$\pm $\|0.020	0.949\|$\pm $\|0.021
DTINet	0.922\|$\pm $\|0.019	0.931\|$\pm $\|0.021
GRMF	0.877\|$\pm $\|0.025	0.913\|$\pm $\|0.016
VB-MK-LMF	0.921\|$\pm $\|0.019	0.937\|$\pm $\|0.017
NRLMF	0.905\|$\pm $\|0.023	0.918\|$\pm $\|0.018
CMF	0.895\|$\pm $\|0.030	0.924\|$\pm $\|0.018
GRGMF	0.932\|$\pm $\|0.019	0.940\|$\pm $\|0.022
KronRLS-MKL	0.919\|$\pm $\|0.027	0.938\|$\pm $\|0.023
BLM-NII	0.894\|$\pm $\|0.021	0.910\|$\pm $\|0.014
NetLapRLS	0.904\|$\pm $\|0.019	0.913\|$\pm $\|0.015
GCN-DTI	0.929\|$\pm $\|0.021	0.936\|$\pm $\|0.024
DTI-CNN	0.936\|$\pm $\|0.022	0.939\|$\pm $\|0.016

The bold faces and underlined are best and second best results in each column.

Table 9

Comparison on DTINet data set (with same random seed).

Method	AUC	AUPR
Our method	0.936\|$\pm $\|0.020	0.949\|$\pm $\|0.021
DTINet	0.922\|$\pm $\|0.019	0.931\|$\pm $\|0.021
GRMF	0.877\|$\pm $\|0.025	0.913\|$\pm $\|0.016
VB-MK-LMF	0.921\|$\pm $\|0.019	0.937\|$\pm $\|0.017
NRLMF	0.905\|$\pm $\|0.023	0.918\|$\pm $\|0.018
CMF	0.895\|$\pm $\|0.030	0.924\|$\pm $\|0.018
GRGMF	0.932\|$\pm $\|0.019	0.940\|$\pm $\|0.022
KronRLS-MKL	0.919\|$\pm $\|0.027	0.938\|$\pm $\|0.023
BLM-NII	0.894\|$\pm $\|0.021	0.910\|$\pm $\|0.014
NetLapRLS	0.904\|$\pm $\|0.019	0.913\|$\pm $\|0.015
GCN-DTI	0.929\|$\pm $\|0.021	0.936\|$\pm $\|0.024
DTI-CNN	0.936\|$\pm $\|0.022	0.939\|$\pm $\|0.016

Method	AUC	AUPR
Our method	0.936\|$\pm $\|0.020	0.949\|$\pm $\|0.021
DTINet	0.922\|$\pm $\|0.019	0.931\|$\pm $\|0.021
GRMF	0.877\|$\pm $\|0.025	0.913\|$\pm $\|0.016
VB-MK-LMF	0.921\|$\pm $\|0.019	0.937\|$\pm $\|0.017
NRLMF	0.905\|$\pm $\|0.023	0.918\|$\pm $\|0.018
CMF	0.895\|$\pm $\|0.030	0.924\|$\pm $\|0.018
GRGMF	0.932\|$\pm $\|0.019	0.940\|$\pm $\|0.022
KronRLS-MKL	0.919\|$\pm $\|0.027	0.938\|$\pm $\|0.023
BLM-NII	0.894\|$\pm $\|0.021	0.910\|$\pm $\|0.014
NetLapRLS	0.904\|$\pm $\|0.019	0.913\|$\pm $\|0.015
GCN-DTI	0.929\|$\pm $\|0.021	0.936\|$\pm $\|0.024
DTI-CNN	0.936\|$\pm $\|0.022	0.939\|$\pm $\|0.016

The bold faces and underlined are best and second best results in each column.

3.6 Predicting Novel Interactions

Table 10 shows the predictive results of our method on GPCRs data set. It contains the estimated values of interactions, recorded database (letter abbreviation), ID of drug and target. In the top |$30$| results, we confirm |$18$| valid interaction records in the database. It can be seen that our method is effective in predicting DTIs.

Table 10

Top |$30$| predict DTIs with our model for GPCRs.

Drug ID	Target ID	Score	Confirmed^*	Drug ID	Target ID	Score	Confirmed^*
D00283	hsa1814	0.78262985	C D M	D00437	hsa152	0.67321722	—
D00255	hsa152	0.66170877	D	D02358	hsa154	0.64519076	D
D00095	hsa155	0.60468714	C D K	D00604	hsa148	0.60353368	D
D00604	hsa147	0.59977694	D	D02340	hsa1812	0.59190334	D
D01713	hsa152	0.58508138	—	D00136	hsa1812	0.55156516	D
D00397	hsa1131	0.54579513	C D K	D02342	hsa155	0.53660241	—
D00235	hsa155	0.53582014	M	D00635	hsa155	0.53274981	—
D04625	hsa154	0.53197951	D K	D00632	hsa155	0.53106934	—
D00598	hsa155	0.52867707	—	D02361	hsa1814	0.51970226	—
D01164	hsa1812	0.50973009	D	D03880	hsa155	0.50639507	—
D02147	hsa153	0.50405330	D M	D03490	hsa155	0.50159544	K
D00513	hsa152	0.49695517	—	D00110	hsa1813	0.49590903	—
D00270	hsa152	0.49571719	D K	D00136	hsa152	0.49513088	—
D00283	hsa1131	0.49354532	D	D00790	hsa1814	0.48902350	C D
D00503	hsa3356	0.48451227	—	D00283	hsa1132	0.48273789	D

Drug ID	Target ID	Score	Confirmed^*	Drug ID	Target ID	Score	Confirmed^*
D00283	hsa1814	0.78262985	C D M	D00437	hsa152	0.67321722	—
D00255	hsa152	0.66170877	D	D02358	hsa154	0.64519076	D
D00095	hsa155	0.60468714	C D K	D00604	hsa148	0.60353368	D
D00604	hsa147	0.59977694	D	D02340	hsa1812	0.59190334	D
D01713	hsa152	0.58508138	—	D00136	hsa1812	0.55156516	D
D00397	hsa1131	0.54579513	C D K	D02342	hsa155	0.53660241	—
D00235	hsa155	0.53582014	M	D00635	hsa155	0.53274981	—
D04625	hsa154	0.53197951	D K	D00632	hsa155	0.53106934	—
D00598	hsa155	0.52867707	—	D02361	hsa1814	0.51970226	—
D01164	hsa1812	0.50973009	D	D03880	hsa155	0.50639507	—
D02147	hsa153	0.50405330	D M	D03490	hsa155	0.50159544	K
D00513	hsa152	0.49695517	—	D00110	hsa1813	0.49590903	—
D00270	hsa152	0.49571719	D K	D00136	hsa152	0.49513088	—
D00283	hsa1131	0.49354532	D	D00790	hsa1814	0.48902350	C D
D00503	hsa3356	0.48451227	—	D00283	hsa1132	0.48273789	D

^*Database: ChEMBL(C), KEGG(K), Matador(M), DrugBank(D).

Table 10

Top |$30$| predict DTIs with our model for GPCRs.

Drug ID	Target ID	Score	Confirmed^*	Drug ID	Target ID	Score	Confirmed^*
D00283	hsa1814	0.78262985	C D M	D00437	hsa152	0.67321722	—
D00255	hsa152	0.66170877	D	D02358	hsa154	0.64519076	D
D00095	hsa155	0.60468714	C D K	D00604	hsa148	0.60353368	D
D00604	hsa147	0.59977694	D	D02340	hsa1812	0.59190334	D
D01713	hsa152	0.58508138	—	D00136	hsa1812	0.55156516	D
D00397	hsa1131	0.54579513	C D K	D02342	hsa155	0.53660241	—
D00235	hsa155	0.53582014	M	D00635	hsa155	0.53274981	—
D04625	hsa154	0.53197951	D K	D00632	hsa155	0.53106934	—
D00598	hsa155	0.52867707	—	D02361	hsa1814	0.51970226	—
D01164	hsa1812	0.50973009	D	D03880	hsa155	0.50639507	—
D02147	hsa153	0.50405330	D M	D03490	hsa155	0.50159544	K
D00513	hsa152	0.49695517	—	D00110	hsa1813	0.49590903	—
D00270	hsa152	0.49571719	D K	D00136	hsa152	0.49513088	—
D00283	hsa1131	0.49354532	D	D00790	hsa1814	0.48902350	C D
D00503	hsa3356	0.48451227	—	D00283	hsa1132	0.48273789	D

Drug ID	Target ID	Score	Confirmed^*	Drug ID	Target ID	Score	Confirmed^*
D00283	hsa1814	0.78262985	C D M	D00437	hsa152	0.67321722	—
D00255	hsa152	0.66170877	D	D02358	hsa154	0.64519076	D
D00095	hsa155	0.60468714	C D K	D00604	hsa148	0.60353368	D
D00604	hsa147	0.59977694	D	D02340	hsa1812	0.59190334	D
D01713	hsa152	0.58508138	—	D00136	hsa1812	0.55156516	D
D00397	hsa1131	0.54579513	C D K	D02342	hsa155	0.53660241	—
D00235	hsa155	0.53582014	M	D00635	hsa155	0.53274981	—
D04625	hsa154	0.53197951	D K	D00632	hsa155	0.53106934	—
D00598	hsa155	0.52867707	—	D02361	hsa1814	0.51970226	—
D01164	hsa1812	0.50973009	D	D03880	hsa155	0.50639507	—
D02147	hsa153	0.50405330	D M	D03490	hsa155	0.50159544	K
D00513	hsa152	0.49695517	—	D00110	hsa1813	0.49590903	—
D00270	hsa152	0.49571719	D K	D00136	hsa152	0.49513088	—
D00283	hsa1131	0.49354532	D	D00790	hsa1814	0.48902350	C D
D00503	hsa3356	0.48451227	—	D00283	hsa1132	0.48273789	D

^*Database: ChEMBL(C), KEGG(K), Matador(M), DrugBank(D).

4 Discussion

In this work, we develop an MK-TCMF model to fuse multi-kernels (features) and predict potential DTIs. Different features often have different contributions for the model. How to select them effectively is a problem. Fortunately, the MKL algorithm can regulate the weight of each kernel matrix according to the prediction error, and obtain better prediction performance in the fusion process.

After the model converges, each kernel will obtain a weight coefficient. The values (from |$0$| to |$1$|⁠) of these coefficients reflects the strength of the feature contribution. The higher the value of weight, the greater the contribution for the model. It can be obtained from the experimental results (Figure 5) that |$\mathbf{K}_{SE,d}$|⁠, |$\mathbf{K}_{GIP,d}$|⁠, |$\mathbf{K}_{SW,t}$| and |$\mathbf{K}_{GIP,t}$| have higher weights in drug and target feature spaces, respectively. The |$\mathbf{K}_{SE,d}$| (drug side effects) has the information of pharmacology. Currently, network pharmacology believes that drug development should follow network targeting-multi-component therapy [44]. In addition, |$\mathbf{K}_{SW,t}$| (sequence similarity of target) reflects the similarity of some structures. |$\mathbf{K}_{GIP,t}$| (target) and |$\mathbf{K}_{GIP,d}$| (drug) contain the known association information of drug and target. These associations provide a priori information in the prediction process. Therefore, the above features get higher weights.

Under three kinds of CV, MK-TCMF is compared with existing models. Under CVS1, MK-TCMF reaches best AUPR (Es: 0.912, IC: 0.933). Under CVS2, MK-TCMF has the best AUPRs on GPCRs (0.412), IC (0.426) and Es (0.407). Under CVS3, MK-TCMF also achieves best AUPRs on Es (0.831) and IC (0.826). These results show that MK-TCMF is comparable.

5 Conclusions

The MF models had been employed in recommendation systems, and it has obtained good prediction performance in the sparse adjacency matrices. With the generation of massive amounts of data, multi-source information fusion and deep learning have entered the field of medicine and been well applied. For example, the model based on graph neural network can extract topological information of drug. In future work, we will introduce multi-view learning [45, 46], graph method [47–50] and deep learning to feature representation and further improve the prediction performance of the model.

Key Points

We develop a matrix factorization (MF) model called multiple kernel-based triple collaborative matrix factorization (MK-TCMF).
We decompose the original adjacency matrix into three matrices, including the feature matrix of the drug space, the bi-projection matrix (used to join the two spaces) and the feature matrix of the target space.
In the process of solving the model, multiple drug kernel matrices and target kernel matrices are all linearly weighted and fused by the multiple kernel learning (MKL) algorithm.
To solve the parameters of the MK-TCMF model, an efficient iterative optimization algorithm is proposed.

Data availability

The codes and data are available from https://github.com/guofei-tju/IDTI-MK-TCMF or https://figshare.com/s/d1c4083564157150f9e7.

Funding

This work is supported by a grant from the National Natural Science Foundation of China (NSFC 61902271, 62172296 and 62172076), Special Science Foundation of Quzhou (2021D004) and Natural Science Research of Jiangsu Higher Education Institutions of China (19KJB520014).

Yijie Ding is an associate researcher in the Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, 324000, P.R. China.

Jijun Tang is a Professor in Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA.

Fei Guo is a Professor in School of Computer Science and Engineering, Central South University, Changsha, 410083, P.R. China.

Quan Zou is a Professor in Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, P.R. China.

References

1.

Hecker

N

,

Ahmed

J

,

von Eichborn

J

, et al.

SuperTarget goes quantitative: update on drug-target interactions

.

Nucleic Acids Res

2012

;

40

(

D1

):

1113

–

7

.

2.

Schomburg

I

,

Chang

A

,

Placzek

S

, et al.

BRENDA in 2013: integrated reactions, kinetic data, enzyme function data, improved disease classification: new options and contents in BRENDA

.

Nucleic Acids Res

2013

;

41

(

D1

):

764

–

72

.

3.

Kanehisa

M

,

Goto

S

.

From genomics to chemical genomics: new developments in KEGG

.

Nucleic Acids Res

2006

;

34

(

Suppl 1

):

354

–

7

.

4.

Law

V

,

Knox

C

,

Djoumbou

Y

, et al.

DrugBank 4.0: shedding new light on drug metabolism

.

Nucleic Acids Res

2014

;

42

(

D1

):

1091

–

7

.

5.

Chen

X

,

Yan

CC

,

Zhang

X

, et al.

Drug–target interaction prediction: databases, web servers and computational models

.

Brief Bioinform

2016

;

17

(

4

):

696

–

712

.

6.

Chen

X

,

Liu

MX

,

Yan

GY

.

Drug-target interaction prediction by random walk on the heterogeneous network

.

Mol Biosyst

2012

;

8

(

7

):

1970

–

8

.

7.

Wei

L

,

Xing

P

,

Zeng

J

, et al.

Improved prediction of protein-protein interactions using novel negative samples, features, and an ensemble classifier

.

Artif Intell Med

2017

;

83

(

11

):

67

–

74

.

8.

Wei

L

,

Wan

S

,

Guo

J

, et al.

A novel hierarchical selective ensemble classifier with bioinformatics application

.

Artif Intell Med

2017

;

83

(

11

):

82

–

90

.

9.

Mousavian

Z

,

Khakabimamaghani

S

,

Kavousi

K

, et al.

Drug-target interaction prediction from PSSM based evolutionary information

.

J Pharmacol Toxicol Methods

2015

;

78

:

42

–

51

.

10.

Cao

DS

,

Liu

S

,

Xu

QS

, et al.

Large-scale prediction of drug-target interactions using protein sequences and drug topological structures

.

Anal Chim Acta

2012

;

752

(

21

):

1

–

10

.

11.

Li

Z

,

Han

P

,

You

ZH

, et al.

In silico prediction of drug-target interaction networks based on drug chemical structure and protein sequences

.

Sci Rep

2017

;

7

(

1

):

11174

.

12.

Lin

J

,

Chen

H

,

Li

S

, et al.

Accurate prediction of potential druggable proteins based on genetic algorithm and Bagging-SVM ensemble classifier

.

Artif Intell Med

2019

;

98

(

1

):

35

–

47

.

13.

Shi

H

,

Liu

S

,

Chen

J

, et al.

Predicting drug-target interactions using Lasso with random forest based on evolutionary information and chemical structure

.

Genomics

2019

;

111

(

6

):

1839

–

52

.

14.

Xia

Z

,

Wu

LY

,

Zhou

X

, et al.

Semi-supervised drug-protein interaction prediction from heterogeneous biological spaces

.

BMC Syst Biol

2010

;

4

(

Suppl 2

):

6

–

17

.

15.

van Laarhoven

T

,

Nabuurs

SB

,

Marchiori

E

.

Gaussian interaction profile kernels for predicting drug–target interaction

.

Bioinformatics

2011

;

27

(

21

):

3036

–

43

.

16.

Hao

M

,

Wang

Y

,

Bryant

SH

.

Improved prediction of drug-target interactions using regularized least squares integrating with kernel fusion technique

.

Anal Chim Acta

2016

;

909

:

41

–

50

.

17.

Nascimento

A

,

Prudêncio

RBC

,

Costa

IG

.

A multiple kernel learning algorithm for drug-target interaction prediction

.

BMC Bioinformatics.

2016

;

17

(

1

):

46

–

61

.

18.

Cichonska

A

,

Ravikumar

B

,

Parri

E

, et al.

Computational-experimental approach to drug-target interaction mapping: A case study on kinase inhibitors

.

PLoS Comput Biol

2017

;

13

(

8

):e1005678.

19.

van Laarhoven

T

,

Marchiori

E

.

Predicting drug-target interactions for new drug compounds using a weighted nearest neighbor profile

.

Plos One

2013

;

8

(

6

):e66952.

20.

Bleakley

K

,

Yamanishi

Y

.

Supervised prediction of drug-target interactions using bipartite local models

.

Bioinformatics

2009

;

25

(

18

):

2397

–

403

.

21.

Ding

Y

,

Tang

J

,

Guo

F

.

Identification of drug-target interactions via fuzzy bipartite local model

.

Neural Computing and Applications

2020

;

32

:

10303

–

19

.

10.1371/journal.pcbi.1007129

22.

Mei

J

,

Kwoh

CK

,

Yang

P

, et al.

Drug-target interaction prediction by learning from local information and neighbors

.

Bioinformatics

2013

;

29

(

2

):

238

–

45

.

23.

Lee

I

,

Keum

J

,

Nam

H

.

DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences

.

PLoS Comput Biol

2019

;

15

:

e1007129

.

.

24.

Peng

J

,

Li

J

,

Shang

X

.

A learning-based method for drug-target interaction prediction based on feature representation learning and deep neural network

.

BMC Bioinformatics

2020

;

21

:

394

.

25.

Eslami Manoochehri

H

,

Nourani

M

.

Drug-target interaction prediction using semi-bipartite graph model and deep learning

.

BMC Bioinformatics.

2020

;

21

:

248

.

26.

Zhao

T

,

Hu

Y

,

Valsdottir

LR

, et al.

Identifying drug-target interactions based on graph convolutional network and deep neural network

.

Brief Bioinform

2020

;

22

:

2141

.

.

27.

Zheng

X

, Ding H, Mamitsuka H, et al. Collaborative matrix factorization with multiple similarities for predicting drug-target interactions. In:

ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

. Association for Computing Machinery,

2013

, pp.

1025

–

33

.

28.

Gonen

M

.

Predicting drug-target interactions from chemical and genomic kernels using Bayesian matrix factorization

.

Bioinformatics

2012

;

28

(

18

):

2304

–

10

.

29.

Liu

Y

,

Wu

M

,

Miao

C

, et al.

Neighborhood regularized logistic matrix factorization for drug-target interaction prediction

.

PLoS Comput Biol

2016

;

12

(

2

):e1004760.

30.

Ezzat

A

,

Zhao

P

,

Wu

M

, et al.

Drug-Target Interaction Prediction with Graph Regularized Matrix Factorization

.

IEEE/ACM Trans Comput Biol Bioinform

2016

;

14

(

3

):

646

–

56

.

31.

Zhang

Z

,

Zhang

XF

,

Wu

M

, et al.

A graph regularized generalized matrix factorization model for predicting links in biomedical bipartite networks

.

Bioinformatics

2020

;

36

(

11

):

3474

–

81

.

32.

Bolgár

B

,

Antal

P

.

VB-MK-LMF: fusion of drugs, targets and interactions using variational Bayesian multiple kernel logistic matrix factorization. Bmc

.

Bioinformatics

2017

;

18

(

1

):

440

–

57

.

10.1038/s41467–017–00680–8

33.

Luo

Y

,

Zhao

X

,

Zhou

J

, et al.

A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information

.

Nat Commun

2017

;

8

:

573

.

.

34.

Shi

JY

,

Huang

H

,

Li

JX

, et al.

TMFUF: a triple matrix factorization-based unified framework for predicting comprehensive drug-drug interactions of new drugs

.

BMC Bioinformatics

2018

;

19

:

411

.

35.

Yamanishi

Y

,

Araki

M

,

Gutteridge

A

, et al.

Prediction of drug-target interaction networks from the integration of chemical and genomic spaces

.

Bioinformatics

2008

;

24

(

13

):

i232

–

40

.

36.

Perlman

L

,

Gottlieb

A

,

Atias

N

, et al.

Combining drug and gene similarity measures for drug-target elucidation

.

J Comput Biol

2011

;

18

(

2

):

133

.

37.

Smedley

D

.

The BioMart community portal: an innovative alternative to large, centralized data repositories

.

Nucleic Acids Res

2015

;

43

(

1

):

589

–

98

.

38.

Ovaska

K

,

Laakso

M

,

Hautaniemi

S

.

Fast gene ontology based clustering for microarray experiments

.

Biodata Mining

2008

;

1

(

1

):

11

–

1

.

39.

Takarabe

M

,

Kotera

M

,

Nishimura

Y

, et al.

Drug target prediction using adverse event report systems

.

Bioinformatics

2012

;

28

(

18

):

i611

–

8

.

40.

Hattori

M

,

Okuno

Y

,

Goto

S

, et al.

Development of a Chemical Structure Comparison Method for Integrated Analysis of Chemical and Genomic Information in the Metabolic Pathways

.

J Am Chem Soc

2003

;

125

(

39

):

11853

–

65

.

41.

Ding

Y

,

Tang

J

,

Guo

F

.

Identification of Drug-Target Interactions via Multiple Information Integration

.

Inform Sci

2017

;

418

:

546

–

60

.

42.

He

J

, Chang S, Xie L. Fast kernel learning for spatial pyramid matching. In:

IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

. IEEE Computer Society,

2008

, pp.

1

–

7

.

43.

Cristianini

N

,

Elisseeff

A

.

On Kernel-Target Alignment

.

Advances in Neural Information Processing Systems

2001

;

179

(

5

):

367

–

73

.

44.

Casas

AI

,

Hassan

AA

,

Larsen

SJ

, et al.

From single drug targets to synergistic network pharmacology in ischemic stroke

.

Proc Natl Acad Sci

2019

;

116

(

14

):

7129

–

36

.

45.

Jiang

Y

,

Deng

Z

,

Chung

FL

, et al.

Recognition of Epileptic EEG Signals Using a Novel Multiview TSK Fuzzy System

.

IEEE Trans Fuzzy Syst

2017

;

25

(

1

):

3

–

20

.

46.

Jiang

Y

,

Zhang

Y

,

Lin

C

, et al.

EEG-Based Driver Drowsiness Estimation Using an Online Multi-View and Transfer TSK Fuzzy System

.

IEEE Transactions on Intelligent Transportation Systems

2021

;

22

(

3

):

1752

–

64

.

47.

Zhou

H

,

Wang

H

.

Identify ncRNA subcellular localization via graph regularized k-local hyperplane distance nearest neighbor model on multi-kernel learning

.

IEEE/ACM Trans Comput Biol Bioinform

2021

.

48.

Ding

Y

,

Yang

C

,

Tang

J

, et al.

Identification of protein-nucleotide binding residues via graph regularized k-local hyperplane distance nearest neighbor model

.

Applied Intelligence

2021

.

49.

Qian

Y

,

Meng

H

,

Lu

W

, et al.

Identification of DNA-binding proteins via Hypergraph based Laplacian Support Vector Machine

.

Current Bioinformatics

2021

;

16

.

50.

Yang

H

,

Ding

Y

,

Tang

J

, et al.

Drug-disease associations prediction via Multiple Kernel-based Dual Graph Regularized Least Squares

.

Appl Soft Comput

2021

;

112

:

107811

.