[1] WU Han, WANG Shitong. Privileged LSSVM for classification and simultaneous importance identification of missing information on incomplete data[J]. CAAI Transactions on Intelligent Systems, 2023, 18(4): 743-753. [doi:10.11992/tis.202202026]
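CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
18
Issue:
No. 4, 2023
Pages:
743-753
Column:
Research Papers: Machine Perception and Pattern Recognition
Publication date:
2023-07-15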
- Title:
-
Privileged LSSVM for classification and simultaneous importance identification of missing information on incomplete data
- Author(s):
-
WU Han, WANG Shitong
-
School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China
- Keywords:
-
least squares support vector machine; learning using privileged information; additive kernel; missing data; k-nearest neighbor; sample space; privileged space; data quality
- Classification number (CLC):
-
TP181
- DOI:
-
10.11992/tis.202202026
- Abstract:
-
Simply removing samples with missing data shrinks the training set and may therefore degrade classification performance. Based on the strategy of handling missing data and building the classification model simultaneously, we propose a privileged least squares support vector machine (P-LSSVM) built on learning using privileged information (LUPI). It both improves classification performance and determines the importance of missing features without bias. The basic idea is to take training on the complete data as privileged information that guides the learning of a least squares support vector machine (LSSVM) on the whole incomplete dataset. The importance of each feature, including missing features, is expressed through an additive kernel; the privileged information is derived from training on the complete data, and P-LSSVM is constructed from it. Finally, the proposed leave-one-out cross-validation method performs unbiased identification of the importance of missing features. Experimental results show that the proposed method outperforms the compared algorithms in average test accuracy while also determining the importance of missing features.
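The abstract names two building blocks, an LSSVM and an additive kernel with one term per feature. The sketch below illustrates only those two pieces and is not the authors' P-LSSVM: it uses the standard LSSVM linear system in its regression form on ±1 labels, the per-feature weights `weights` are fixed by hand rather than learned, and dropping a feature's kernel term when the value is missing (NaN) is an assumption made for this illustration. The privileged-information term and the leave-one-out procedure are not attempted, because the abstract does not specify them. All function names here are hypothetical.

```python
import numpy as np


def additive_rbf_kernel(A, B, weights, width=1.0):
    """Additive kernel: K[i, j] = sum_d weights[d] * rbf(A[i, d], B[j, d]).

    Assumption for this sketch: a feature's term is set to 0 whenever the
    value is missing (NaN) in either sample.
    """
    K = np.zeros((A.shape[0], B.shape[0]))
    for d, w in enumerate(weights):
        a, b = A[:, d], B[:, d]
        observed = ~np.isnan(a)[:, None] & ~np.isnan(b)[None, :]
        diff = np.nan_to_num(a)[:, None] - np.nan_to_num(b)[None, :]
        K += w * observed * np.exp(-diff ** 2 / (2.0 * width ** 2))
    return K


def fit_lssvm(X, y, weights, gamma=10.0, width=1.0):
    """Solve the standard LSSVM system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    K = additive_rbf_kernel(X, X, weights, width)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, dual coefficients alpha


def predict_lssvm(X_train, alpha, b, X_new, weights, width=1.0):
    """Predict labels as sign(sum_j alpha_j * K(x, x_j) + b)."""
    K = additive_rbf_kernel(X_new, X_train, weights, width)
    return np.sign(K @ alpha + b)


# Toy data (made up for illustration): two separated clusters, 3 features,
# with the third feature missing in every fifth sample.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (20, 3)), rng.normal(2.0, 0.5, (20, 3))])
y = np.concatenate([-np.ones(20), np.ones(20)])
X[::5, 2] = np.nan

weights = np.array([1.0, 1.0, 1.0])  # hand-set feature weights, not learned
b, alpha = fit_lssvm(X, y, weights)
pred = predict_lssvm(X, alpha, b, X, weights)
train_accuracy = float(np.mean(pred == y))
```

In this sketch each entry of `weights` scales one feature's contribution to the kernel, which is the sense in which an additive kernel can expose per-feature importance; a method like the one in the abstract would estimate such weights from data instead of fixing them.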
Last Update:
1900-01-01