[1]尤洁,李劲,张赛,等.基于图勾勒的图链路预测方法[J].智能系统学报,2019,14(04):761-768.[doi:10.11992/tis.201806007]
 YOU Jie,LI Jin,ZHANG Sai,et al.Graph sketches-based link prediction over graph data[J].CAAI Transactions on Intelligent Systems,2019,14(04):761-768.[doi:10.11992/tis.201806007]
点击复制

基于图勾勒的图链路预测方法(/HTML)
分享到:

《智能系统学报》[ISSN:1673-4785/CN:23-1538/TP]

卷:
第14卷
期数:
2019年04期
页码:
761-768
栏目:
出版日期:
2019-07-02

文章信息/Info

Title:
Graph sketches-based link prediction over graph data
作者:
尤洁1 李劲12 张赛1 李婷1
1. 云南大学 软件学院, 云南 昆明 650091;
2. 云南省软件工程重点实验室, 云南 昆明 650091
Author(s):
YOU Jie1 LI Jin12 ZHANG Sai1 LI Ting1
1. School of Software, Yunnan University, Kunming 650091, China;
2. Key Laboratory in Software Engineering of Yunnan Province, Kunming 650091, China
关键词:
图数据算法复杂度链路预测图勾勒节点相似性并行计算Apache Spark
Keywords:
graph dataalgorithm complexitylink-predictiongraph sketchesnodes similarityparallel computingApache Spark
分类号:
TP311
DOI:
10.11992/tis.201806007
摘要:
针对已有链路预测算法复杂度高,不适于在大规模图上进行链接预测的问题,本文基于图勾勒近似技术对已有链路预测方法进行优化,提出了基于图勾勒的链路预测方法。该方法将链路预测算法的计算复杂度由On3)降低至On2k2log2n)。为进一步提高链接预测效率,给出了基于Spark的并行化链路预测实现方法。在真实图数据集上进行测试,实验结果表明本文方法在保证链接预测精度的前提下,可有效提升算法效率。
Abstract:
The high computational complexity of existing link prediction algorithms makes them unsuitable for link prediction on large-scale graphs. To solve this problem, we propose a novel link prediction approach that involves combining the existing link prediction approaches with graph sketch approximation. Our proposed approach reduces the computation complexity of link prediction from O(n3) to O(n2k2log2n) Furthermore, to enhance the efficiency of our approach; we also provide a parallel link prediction algorithm, which is implemented on the parallel computing framework Apache Spark. Finally, we conducted extensive experiments on a real network dataset to test the validation and efficiency of our approach. The experimental results indicate that our methods can effectively improve the efficiency of link prediction while guaranteeing prediction accuracy as well.

参考文献/References:

[1] 吕琳媛. 复杂网络链路预测[J]. 电子科技大学学报, 2010, 39(5):651-661 LYU Linyuan. Link prediction on complex networks[J]. Journal of University of Electronic Science and Technology of China, 2010, 39(5):651-661
[2] CLAUSET A, MOORE C, NEWMAN M E J. Hierarchical structure and the prediction of missing links in networks[J]. Nature, 2008, 453(7191):98-101.
[3] AIROLDI E M, BLEI D M, FIENBERG S E, et al. Mixed membership stochastic blockmodels[J]. Journal of machine learning research, 2008, 9:1981-2014.
[4] YU Kai, CHU Wei, YU Shipeng, et al. Stochastic relational models for discriminative link prediction[C]//Proceedings of the 19th International Conference on Neural Information Processing Systems. Vancouver, Canada, 2006:1553-1560.
[5] LIBEN-NOWELL D, KLEINBERG J. The link-prediction problem for social networks[J]. Journal of the association for information science and technology, 2007, 58(7):1019-1031.
[6] ADAMIC L A, ADAR E. Friends and neighbors on the Web[J]. Social networks, 2003, 25(3):211-230.
[7] LYU Linyuan, JIN Cihang, ZHOU Tao. Similarity index based on local paths for link prediction of complex networks[J]. Physical review E, 2009, 80(4):046122.
[8] ZHOU Tao, Lü Linyuan, ZHANG Yicheng. Predicting missing links via local information[J]. The European physical journal B, 2009, 71(4):623-630.
[9] KATZ L. A new status index derived from sociometric analysis[J]. Psychometrika, 1953, 18(1):39-43.
[10] LEICHT E A, HOLME P, NEWMAN M E J. Vertex similarity in networks[J]. Physical review E, 2006, 73:026120.
[11] KLEIN D J, RANDIC M. Resistance distance[J]. Journal of mathematical chemistry, 1993, 12(1):81-95.
[12] FOUSS F, PIROTTE A, RENDERS J M, et al. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation[J]. IEEE transactions on knowledge and data engineering, 2007, 19(3):355-369.
[13] BRIN S, PAGE L. The anatomy of a large-scale hypertextual Web search engine[J]. Computer networks and ISDN systems, 1998, 30(1-7):107-117.
[14] JEH G, WIDOM J. SimRank:a measure of structural-context similarity[C]//Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, 2002:538-543.
[15] 饶君, 吴斌, 东昱晓. MapReduce环境下的并行复杂网络链路预测[J]. 软件学报, 2012, 23(12):3175-3186 RAO Jun, WU Bin, DONG Yuxiao. Parallel link prediction in complex network using mapreduce[J]. Journal of software, 2012, 23(12):3175-3186
[16] COHEN E. All-distances sketches[M]//KAO M Y. Encyclopedia of Algorithms. New York:Springer, 2016:2320-2334.
[17] BATAGELJ V, MRVAR A. Pajek datasets[EB/OL]. 2006 http://vlado.fmf.uni-lj.si/pub/networks/data/default.html.
[18] BU Dongbo, ZHAO Yi, CAI Lun, et al. Topological structure analysis of the protein-protein interaction network in budding yeast[J]. Nucleic acids research, 2003, 31(9):2443-2450.
[19] WATTS D J, STROGATZ S H. Collective dynamics of ‘small-world’ networks[J]. Nature, 1998, 393(6684):440-442.
[20] PEROZZI B, AL-RFOU R, SKIENA S. DeepWalk:online learning of social representations[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA, 2014:701-710
[21] BREITKREUTZ B J, STARK C, REGULY T, et al. The bioGRID interaction database:2008 update[J]. Nucleic acids research, 2008, 36(S1):D637-D640.

备注/Memo

备注/Memo:
收稿日期:2018-06-02。
基金项目:国家自然科学基金项目(61562091);云南省应用基础研究计划面上项目(2016FB110).
作者简介:尤洁,女,1991年生,硕士研究生,主要研究方向为数据与知识工程;李劲,男,1975年生,副教授,中国人工智能学会不确定性人工智能专委会委员,主要研究方向为数据与知识工程;张赛,女,1994年生,硕士研究生,主要研究方向为数据与知识工程。
通讯作者:李劲.E-mail:lijin@ynu.edu.cn
更新日期/Last Update: 2019-08-25