[1] ZHOU Hongbiao, QIAO Junfei. Feature selection method based on high dimensional k-nearest neighbors mutual information[J]. CAAI Transactions on Intelligent Systems, 2017, 12(5): 595-600. [doi:10.11992/tis.201609020]
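CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
12
Issue:
2017, No. 5
Pages:
595-600
Column:
Academic Papers: Fundamentals of Artificial Intelligence
Publication date:
2017-10-25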
- Title:
-
Feature selection method based on high dimensional k-nearest neighbors mutual information
- Author(s):
-
ZHOU Hongbiao (1, 2, 3), QIAO Junfei (1, 2)
-
1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China;
2. Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124, China;
3. Faculty of Automation, Huaiyin Institute of Technology, Huai'an, Jiangsu 223003, China
-
- Keywords:
-
feature selection; mutual information; k-nearest neighbor; high-dimensional mutual information; multilayer perceptron
- CLC number:
-
TP183
- DOI:
-
10.11992/tis.201609020
- Abstract:
-
Feature selection plays an important role in predictive modeling of multivariate series. This paper proposes a feature selection method based on data-driven high-dimensional k-nearest neighbor mutual information. First, the method extends the data-driven k-nearest neighbor estimator to estimate the mutual information among high-dimensional feature variables. Next, a forward accumulation strategy produces an optimal ranking of all features, and irrelevant features are eliminated according to a preset number of irrelevant features. Then, a backward cross strategy locates and removes redundant features. Finally, the method yields the optimal subset of strongly relevant features. Simulation experiments on the Friedman data, the Housing data, and real effluent total-phosphorus prediction data from a wastewater treatment plant, using a multilayer perceptron neural network as the prediction model, verify the effectiveness of the proposed method.
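The estimation and forward-ranking steps described in the abstract can be illustrated with a minimal sketch. It assumes the standard Kraskov-Stögbauer-Grassberger k-nearest neighbor estimator (maximum norm, brute-force neighbor search) as a stand-in for the paper's data-driven variant, whose details the abstract does not give. The function names `knn_mutual_information`, `forward_ranking`, `drop_irrelevant`, and `make_toy_data` are hypothetical, and the toy data is not one of the paper's datasets. The backward step for redundant features is not sketched, because the abstract does not state its criterion.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def chebyshev(a, b):
    """Maximum-norm distance between two equal-length vectors."""
    return max(abs(u - v) for u, v in zip(a, b))


def knn_mutual_information(xs, ys, k=3):
    """KSG-style estimate (in nats) of I(X; Y).

    xs: list of feature vectors (tuples, any dimension); ys: list of scalars.
    Assumed stand-in estimator, not necessarily the paper's exact variant.
    """
    n = len(xs)
    total = 0.0
    for i in range(n):
        dx = [chebyshev(xs[i], xs[j]) for j in range(n)]
        dy = [abs(ys[i] - ys[j]) for j in range(n)]
        # Distance from point i to its k-th nearest neighbor in the joint space.
        joint = sorted(max(dx[j], dy[j]) for j in range(n) if j != i)
        eps = joint[k - 1]
        # Count marginal neighbors strictly closer than that distance.
        nx = sum(1 for j in range(n) if j != i and dx[j] < eps)
        ny = sum(1 for j in range(n) if j != i and dy[j] < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def forward_ranking(columns, ys, k=3):
    """Greedy forward accumulation: at each step add the feature that
    maximizes the estimated MI between the accumulated subset and the target.
    columns: list of feature columns, each a list of floats."""
    remaining = list(range(len(columns)))
    selected = []
    while remaining:
        def score(f):
            subset = selected + [f]
            xs = list(zip(*(columns[c] for c in subset)))
            return knn_mutual_information(xs, ys, k)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def drop_irrelevant(ranking, n_irrelevant):
    """Discard the last n_irrelevant features of the ranking (preset count)."""
    return ranking[:len(ranking) - n_irrelevant]


def make_toy_data(n, seed=0):
    """Hypothetical toy data: y depends on features 0 and 1 only."""
    rng = random.Random(seed)
    columns = [[rng.random() for _ in range(n)] for _ in range(4)]
    ys = [columns[0][i] + columns[1][i] + 0.05 * rng.gauss(0.0, 1.0)
          for i in range(n)]
    return columns, ys


if __name__ == "__main__":
    cols, target = make_toy_data(150)
    order = forward_ranking(cols, target)
    print("ranking:", order)
    print("kept after dropping 2 irrelevant:", drop_irrelevant(order, 2))
```

The brute-force neighbor search costs O(N^2) per estimate, which is acceptable only for small samples; a tree-based search would be the usual choice for larger data.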
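Memo
Received: 2016-09-21.
Funding: Key Program of the National Natural Science Foundation of China (61533002)
About the authors: ZHOU Hongbiao, male, born in 1980, lecturer and Ph.D. candidate. His main research interest is the analysis and design of neural networks. He has published more than ten papers, six of which are indexed by EI. QIAO Junfei, male, born in 1968, professor and doctoral supervisor. His main research interest is intelligent optimization control of wastewater treatment processes. He has received one first prize of the Science and Technology Progress Award of the Ministry of Education and one third prize of the Beijing Science and Technology Progress Award, has published nearly 100 papers, of which 18 are indexed by SCI and 60 by EI, and holds 20 invention patents.
Corresponding author: QIAO Junfei. E-mail: hyitzhb@163.com
Last Update: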
2017-10-25