[1] CHENG Kangming, XIONG Weili. A dual-optimal semi-supervised regression algorithm[J]. CAAI Transactions on Intelligent Systems, 2019, 14(04): 689-696. [doi:10.11992/tis.201805010]

A dual-optimal semi-supervised regression algorithm

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume: 14
Issue: 2019, No. 04
Pages: 689-696
Publication date: 2019-07-02

Article Info

Title:
A dual-optimal semi-supervised regression algorithm
Author(s):
CHENG Kangming (程康明) 1, XIONG Weili (熊伟丽) 1,2
1. School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China;
2. Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China
Keywords:
unlabeled samples; selection; semi-supervised regression; center of sample dense area; similarity; Gaussian process regression; auxiliary learner; main learner; debutanizer process; prediction performance
Classification number:
TP274
DOI:
10.11992/tis.201805010
Abstract:
To address the problem that labeled samples are scarce in some industrial processes and that traditional semi-supervised learning cannot guarantee accurate predictions for unlabeled samples, a dual-optimal semi-supervised regression algorithm is proposed. First, the center of the dense area of labeled samples is determined, and the similarity between each unlabeled sample and this center is calculated in order to select unlabeled samples; at the same time, labeled samples are selected according to the similarity among the labeled samples. Second, Gaussian process regression is used to build an auxiliary learner on the selected labeled samples, and this learner predicts labels for the selected unlabeled samples. Finally, these pseudo-labeled samples are used to improve the prediction performance of the main learner. Modeling and simulation on a numerical example and on data from an actual debutanizer process show that the proposed method achieves good prediction performance when few labeled samples are available.
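
The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the dense-area center or the similarity is computed, so the sketch assumes the center is the labeled sample with the highest kernel density and that similarity is a Gaussian function of Euclidean distance. The function names, the fixed kernel hyperparameters, and the selection sizes `n_unl` and `n_lab` are all hypothetical, and both learners are a bare-bones Gaussian process regression (posterior mean only).

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq_dists / length_scale**2)

def gpr_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-2):
    """Posterior mean of a zero-mean GP with fixed (assumed) hyperparameters."""
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train)
    return rbf_kernel(X_test, X_train, length_scale) @ alpha

def dense_center(X_lab, length_scale=1.0):
    """Assumption: the center is the labeled sample with the highest kernel density."""
    density = rbf_kernel(X_lab, X_lab, length_scale).sum(axis=1)
    return X_lab[np.argmax(density)]

def similarity_to(X, point, length_scale=1.0):
    """Assumption: similarity is a Gaussian of the Euclidean distance, in (0, 1]."""
    return rbf_kernel(X, point[None, :], length_scale).ravel()

def dual_select_fit_predict(X_lab, y_lab, X_unl, X_test, n_unl=20, n_lab=10):
    """Select unlabeled and labeled samples, pseudo-label, then fit the main learner."""
    center = dense_center(X_lab)
    # Selection 1: unlabeled samples most similar to the dense-area center.
    unl_idx = np.argsort(-similarity_to(X_unl, center))[:n_unl]
    X_sel_unl = X_unl[unl_idx]
    # Selection 2: labeled samples with the highest mean similarity to the labeled set.
    lab_sim = rbf_kernel(X_lab, X_lab).mean(axis=1)
    lab_idx = np.argsort(-lab_sim)[:n_lab]
    # Auxiliary learner: GPR on the selected labeled samples produces pseudo-labels.
    y_pseudo = gpr_predict(X_lab[lab_idx], y_lab[lab_idx], X_sel_unl)
    # Main learner: GPR on all labeled samples plus the pseudo-labeled samples.
    X_aug = np.vstack([X_lab, X_sel_unl])
    y_aug = np.concatenate([y_lab, y_pseudo])
    return gpr_predict(X_aug, y_aug, X_test), unl_idx, lab_idx

if __name__ == "__main__":
    # Synthetic 1-D data for illustration only; not the paper's data.
    rng = np.random.default_rng(0)
    X_lab = rng.uniform(-3, 3, size=(15, 1))
    y_lab = np.sin(X_lab).ravel()
    X_unl = rng.uniform(-3, 3, size=(50, 1))
    X_test = np.linspace(-3, 3, 7).reshape(-1, 1)
    pred, _, _ = dual_select_fit_predict(X_lab, y_lab, X_unl, X_test)
    print(pred)
```

Restricting the pseudo-labeled set to unlabeled samples near the dense labeled region is meant to keep the auxiliary learner predicting where it has the most supporting data, which is the motivation the abstract gives for selecting samples before pseudo-labeling.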

Memo:
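Received: 2018-05-09.
Funding: National Natural Science Foundation of China (61773182, 60712228); Natural Science Foundation of Jiangsu Province (BK20170198).
About the authors: CHENG Kangming, male, born in 1993, is a master's student whose main research interest is industrial process modeling. XIONG Weili, female, born in 1978, is a professor with a Ph.D. whose main research interests are the modeling and optimization of complex industrial processes and intelligent optimization algorithms and their applications. She was selected as a young and middle-aged academic leader of the Jiangsu Province "Qinglan Project". She has led 8 government-funded projects, including a General Program project of the National Natural Science Foundation of China and Jiangsu Province industry-university-research projects, and has participated in several projects under the National 863 Program and the National Key R&D Program. She has published more than 60 research papers.
Corresponding author: XIONG Weili. E-mail: greenpre@163.com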
Last Update: 2019-08-25