[1] ZHANG Gang, XIE Xiaoshan, HUANG Ying, et al. An online multi-kernel learning algorithm for big data[J]. CAAI Transactions on Intelligent Systems, 2014, 9(3): 355-363. [doi:10.3969/j.issn.1673-4785.201403067]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 9
Issue: 2014, No. 3
Pages: 355-363
Column: Academic Papers: Machine Learning
Publication date: 2014-06-25
- Title: An online multi-kernel learning algorithm for big data
- Author(s): ZHANG Gang; XIE Xiaoshan; HUANG Ying; WANG Chunru
- Affiliation: School of Automation, Guangdong University of Technology, Guangzhou 510006, China
- Keywords: big data stream; online multi-kernel learning; manifold learning; data-dependent kernel; semi-supervised learning
- CLC: TP18
- DOI: 10.3969/j.issn.1673-4785.201403067
- Abstract: In machine learning, the choice of kernel function strongly affects the performance of the target learner. An effective kernel function can commonly be obtained through kernel learning. We present a semi-supervised online multiple kernel learning algorithm for big data stream analysis. The algorithm learns a kernel function through an online update procedure that reads the current segment of a big data stream. It adjusts the parameters of the currently learned kernel function in a supervised manner and modifies the kernel through unsupervised manifold learning, so that the contour surfaces of the kernel follow a low-dimensional manifold in the data space as closely as possible. The novelty is that the algorithm performs supervised and unsupervised learning at the same time and scans the training data only once, which reduces computational complexity and makes it suitable for kernel learning on big datasets and high-speed data streams. Its support for unsupervised learning addresses the problem of missing labels in big data streams. Evaluation results on synthetic datasets generated by MOA and on benchmark big data datasets from the UCI data repository show the effectiveness of the proposed algorithm.