[1] DENG Siyu, LIU Fulun, HUANG Yuting, et al. Active learning through PageRank[J]. CAAI Transactions on Intelligent Systems, 2019, 14(3): 551-559. [doi:10.11992/tis.201804052]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
14
Issue:
2019, No. 3
Page number:
551-559
Column:
Academic Papers: Machine Learning
Publication date:
2019-05-05
- Title:
Active learning through PageRank
- Author(s):
DENG Siyu¹; LIU Fulun¹; HUANG Yuting¹; WANG Min²
1. School of Computer Science, Southwest Petroleum University, Chengdu 610500, China;
2. School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu 610500, China
- Keywords:
classification; active learning; PageRank; neighborhood; clustering; binary tree
- CLC:
TP181
- DOI:
10.11992/tis.201804052
- Abstract:
In many classification tasks, unlabeled samples are abundant, but obtaining a label for each sample is expensive and time-consuming. The goal of active learning is to train an accurate classifier at minimum cost by labeling the most informative samples. In this paper, we propose a PageRank-based active learning algorithm (PAL) that makes full use of sample distribution information for effective sample selection. First, following PageRank theory, we sequentially compute the neighborhoods, score matrices, and ranking vectors from the similarity relationships in the data. Next, we select representative samples and build a binary tree that expresses the relationships among them. Then, we use the binary tree to cluster, label, and predict the representative samples. Lastly, we use the representative samples as the training set for classifying the remaining samples. We conducted experiments on eight datasets, comparing PAL with five traditional classification algorithms and three state-of-the-art active learning algorithms. The results demonstrate that PAL achieved higher classification accuracy.
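The abstract does not give the exact definitions of the neighborhoods, score matrix, or ranking vector, so the following is only a minimal sketch of the general idea, not the authors' PAL implementation. It assumes a fixed-radius Euclidean neighborhood, a standard damped PageRank iteration over the resulting neighborhood graph, and top-k selection of representatives. The function names `pagerank_scores` and `select_representatives` and the parameters `radius` and `damping` are hypothetical, introduced here for illustration; the binary-tree clustering and labeling steps are not covered.

```python
import numpy as np


def pagerank_scores(X, radius, damping=0.85, tol=1e-10, max_iter=200):
    """Rank samples by PageRank over a fixed-radius neighborhood graph.

    Assumption (not from the paper): two samples are neighbors when their
    Euclidean distance is at most `radius`.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Neighborhood (adjacency) matrix, excluding self-links.
    adj = (dist <= radius).astype(float)
    np.fill_diagonal(adj, 0.0)

    # Column-stochastic score matrix: sample j splits its score evenly
    # among its neighbors; isolated samples spread it uniformly.
    out_deg = adj.sum(axis=0)
    safe_deg = np.where(out_deg > 0, out_deg, 1.0)
    trans = np.where(out_deg > 0, adj / safe_deg, 1.0 / n)

    # Power iteration for the ranking vector.
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = damping * (trans @ rank) + (1.0 - damping) / n
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return rank


def select_representatives(rank, k):
    """Return the indices of the k highest-ranked samples."""
    return np.argsort(-rank)[:k]


if __name__ == "__main__":
    # Toy data: four corners of a small square, its center, and an outlier.
    X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [0.1, 0.1], [0.05, 0.05], [5.0, 5.0]])
    scores = pagerank_scores(X, radius=0.12)
    print(select_representatives(scores, k=2))
```

In this toy setup the densely connected samples receive the highest scores, which is the intuition behind using the ranking vector to pick representatives that reflect the sample distribution.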