[1]CHENG Kangming,XIONG Weili.A dual-optimal semi-supervised regression algorithm[J].CAAI Transactions on Intelligent Systems,2019,14(4):689-696.[doi:10.11992/tis.201805010]
CAAI Transactions on Intelligent Systems[ISSN 1673-4785/CN 23-1538/TP] Volume:
14
Issue:
2019, No. 4
Page number:
689-696
Column:
Academic Papers: Machine Learning
Publication date:
2019-07-02
- Title:
-
A dual-optimal semi-supervised regression algorithm
- Author(s):
-
CHENG Kangming¹; XIONG Weili¹,²
-
1. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China;
2. Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China
-
- Keywords:
-
unlabeled samples; selection; semi-supervised regression; center of sample dense area; similarity; Gaussian process regression; auxiliary learner; main learner; debutanizer process; prediction performance
- CLC:
-
TP274
- DOI:
-
10.11992/tis.201805010
- Abstract:
-
To address the scarcity of labeled samples in some industrial processes, and the inability of traditional semi-supervised learning to guarantee accurate predictions for unlabeled samples, this paper proposes a dual-optimal semi-supervised regression algorithm. First, the center of the dense area of labeled samples is located and the similarity between each unlabeled sample and this center is calculated; the unlabeled samples are then selected on the basis of this similarity. At the same time, labeled samples are selected according to the similarity between the unlabeled samples and the center of the dense area. Second, Gaussian process regression is used to build an auxiliary learner from the selected labeled samples, and this auxiliary learner predicts labels for the selected unlabeled samples. Finally, the resulting pseudo-labeled samples are used to improve the performance of the main learner. Simulations on a numerical example and on an actual debutanizer process verify that the proposed method achieves good prediction performance when labeled samples are few.
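
The pipeline summarized in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the similarity measure, the definition of the dense-area center, the selection threshold, or the type of the main learner. The sketch assumes a Gaussian (RBF) similarity, takes the center to be the labeled sample with the largest summed similarity to the other labeled samples, selects both labeled and unlabeled samples by a fixed similarity threshold to that center, and uses a small zero-mean Gaussian process for both learners. All function names and parameters (`rbf`, `dense_center`, `select_by_similarity`, `gpr_fit`, `dual_selection_ssr`, `threshold`, `length`, `noise`) are hypothetical.

```python
import math

def rbf(a, b, length=1.0):
    """Gaussian (RBF) similarity between two feature vectors (assumed measure)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * length ** 2))

def dense_center(X, length=1.0):
    """Assumed definition: the sample with the largest summed similarity to all samples."""
    return max(X, key=lambda x: sum(rbf(x, z, length) for z in X))

def select_by_similarity(X, center, threshold, length=1.0):
    """Indices of samples whose similarity to the center is at least `threshold`."""
    return [i for i, x in enumerate(X) if rbf(x, center, length) >= threshold]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def gpr_fit(X, y, length=1.0, noise=1e-3):
    """Fit a zero-mean GP with an RBF kernel; return a posterior-mean predictor."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    alpha = solve(K, y)
    return lambda x: sum(w * rbf(x, z, length) for w, z in zip(alpha, X))

def dual_selection_ssr(Xl, yl, Xu, threshold=0.5, length=1.0):
    """Select samples, pseudo-label with an auxiliary GP, then train the main GP."""
    center = dense_center(Xl, length)
    lab_idx = select_by_similarity(Xl, center, threshold, length)  # selected labeled
    unl_idx = select_by_similarity(Xu, center, threshold, length)  # selected unlabeled
    aux = gpr_fit([Xl[i] for i in lab_idx], [yl[i] for i in lab_idx], length)
    pseudo_X = [Xu[i] for i in unl_idx]
    pseudo_y = [aux(x) for x in pseudo_X]                          # pseudo-labels
    # Assumption: the main learner is also a GP, trained on labeled + pseudo-labeled data.
    main = gpr_fit(Xl + pseudo_X, yl + pseudo_y, length)
    return main, unl_idx

# Toy usage on y = sin(x) with one far-away labeled and one far-away unlabeled sample.
Xl = [[0.0], [0.5], [1.0], [3.0]]
yl = [math.sin(x[0]) for x in Xl]
Xu = [[0.25], [0.75], [5.0]]
main, chosen = dual_selection_ssr(Xl, yl, Xu)
print(chosen, main([0.5]))
```

In this toy run the unlabeled sample at 5.0 lies far from the dense area and is left out of pseudo-labeling, which is the intent of the selection step: only unlabeled samples that the auxiliary learner can plausibly predict well are passed on to the main learner.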