[1] CAO Hantong, CHEN Jing. Prediction of multitype protein interactions combining Doc2vec and GCN[J]. CAAI Transactions on Intelligent Systems, 2023, 18(6): 1165-1172. [doi:10.11992/tis.202212029]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
18
Issue:
2023, No. 6
Page number:
1165-1172
Column:
Academic Papers: Machine Learning
Publication date:
2023-11-05
- Title:
-
Prediction of multitype protein interactions combining Doc2vec and GCN
- Author(s):
-
CAO Hantong 1; CHEN Jing 1, 2
-
1. School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China;
2. Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computing Intelligence, Jiangnan University, Wuxi 214122, China
-
- Keywords:
-
PPI network; graph neural network; protein function prediction; deep learning; biological significance; complex network; GCN; unsupervised learning; protein sequence
- CLC:
-
TP391;Q811.4
- DOI:
-
10.11992/tis.202212029
- Abstract:
-
Studying multitype protein-protein interactions (PPIs) is fundamental to understanding biological processes and revealing disease mechanisms from a systems perspective. Existing multitype PPI prediction methods, such as GNN-PPI and PIPR, show a considerable drop in test accuracy when the data sets are split by breadth-first or depth-first search. This paper therefore proposes GDP, a new multitype PPI prediction method based on Doc2vec and graph convolutional networks (GCN) that does not rely on the physical or biological properties of proteins. The method encodes proteins using sequence information only, then combines network structure information to aggregate protein features into PPI representations for multitype prediction. Experimental results show that the method effectively improves multitype PPI prediction accuracy on real data sets of different scales, especially for PPIs between new proteins not observed in the training set.
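To illustrate why splitting by breadth-first search makes the test set harder, here is a minimal sketch of one way such a split could work. The abstract does not describe the exact procedure, so everything here is an assumption: the function name `bfs_split`, the `test_node_count` parameter, the toy edge list, and the rule that any interaction touching a held-out protein goes to the test set are all hypothetical.

```python
from collections import deque


def bfs_split(edges, root, test_node_count):
    """Split PPI edges into train/test by growing a BFS region from `root`.

    Hypothetical helper for illustration only; not the paper's code.
    """
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Collect held-out proteins in breadth-first order until the quota is met.
    test_nodes = []
    seen = {root}
    queue = deque([root])
    while queue and len(test_nodes) < test_node_count:
        node = queue.popleft()
        test_nodes.append(node)
        for neighbor in sorted(adj.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    test_set = set(test_nodes)

    # Assumed rule: an interaction touching any held-out protein is a test edge,
    # so held-out proteins never appear in the training edges.
    train = [(u, v) for u, v in edges if u not in test_set and v not in test_set]
    test = [(u, v) for u, v in edges if u in test_set or v in test_set]
    return train, test, test_set


# Toy PPI network (made-up protein labels).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]
train_edges, test_edges, held_out = bfs_split(edges, root="A", test_node_count=2)
print("held-out proteins:", sorted(held_out))
print("train edges:", train_edges)
print("test edges:", test_edges)
```

Under this assumed rule, every test interaction involves at least one protein the model never saw during training, which is the setting where the abstract reports the largest accuracy drop for existing methods. A depth-first variant would differ only in taking nodes from the end of the queue rather than the front.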