[1] YI Lei, PAN Zhisong, QIU Junyang, et al. Large-scale network traffic classification based on online learning[J]. CAAI Transactions on Intelligent Systems, 2016, 11(3): 318-327. [doi:10.11992/tis.201603033]

Research on large-scale network traffic classification based on online learning

CAAI Transactions on Intelligent Systems, Editorial Office [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume: 11
Issue: 2016, No. 3
Pages: 318-327
Publication date: 2016-06-25

Article Info

Title: Large-scale network traffic classification based on online learning
Author(s): YI Lei, PAN Zhisong, QIU Junyang, XUE Jiao, REN Huifeng
Affiliation: Institute of Command Information System, PLA University of Science and Technology, Nanjing 210007, Jiangsu, China
Keywords: online learning; large-scale; traffic classification; timing correlation; data stream; stochastic optimization
CLC number: TP181
DOI: 10.11992/tis.201603033
Abstract:
When applied to large-scale network traffic classification, traditional batch machine learning methods suffer from slow classifier training and high computational complexity. Online learning, which has developed rapidly in recent years, is an effective way to handle large-scale problems. To address large-scale traffic classification on high-speed backbone networks, we propose a classification framework based on online learning and apply eight online learning algorithms within it. Experiments on real network traffic data sets show that, at comparable classification accuracy, the online learning algorithms require less memory and significantly less training time than a support vector machine (SVM). To examine how the order of traffic samples affects classification, we also compare processing samples in chronological order (online learning) with processing them in random order (stochastic optimization); the results confirm that network traffic samples are temporally correlated.
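
The ordering experiment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Passive-Aggressive (PA-I) learner as one representative online algorithm (the abstract does not name the eight it used) and a synthetic "bursty" stream as a stand-in for real traffic. The names `pa_mistake_rate` and `make_bursty_stream` are hypothetical.

```python
import random


def pa_mistake_rate(stream, n_features, C=1.0):
    """Run a Passive-Aggressive (PA-I) binary classifier over `stream` once.

    Each sample is predicted before the model is updated on it, and the
    function returns the fraction of samples that were misclassified.
    """
    w = [0.0] * n_features
    mistakes = 0
    n = 0
    for x, y in stream:  # y is +1 or -1
        score = sum(wi * xi for wi, xi in zip(w, x))
        if y * score <= 0:  # a zero score is counted as a mistake
            mistakes += 1
        loss = max(0.0, 1.0 - y * score)  # hinge loss
        sq_norm = sum(xi * xi for xi in x)
        if loss > 0.0 and sq_norm > 0.0:
            tau = min(C, loss / sq_norm)  # PA-I step size
            w = [wi + tau * y * xi for wi, xi in zip(w, x)]
        n += 1
    return mistakes / n if n else 0.0


def make_bursty_stream(n_bursts=50, burst_len=20, seed=0):
    """Synthetic stand-in for flow records (an assumption, not real traffic).

    Consecutive samples share a label (a "burst"), which makes the
    stream temporally correlated.
    """
    rng = random.Random(seed)
    stream = []
    for _ in range(n_bursts):
        y = rng.choice([-1, 1])
        for _ in range(burst_len):
            # One noisy informative feature, one noise feature, one bias term.
            x = [y * 1.0 + rng.gauss(0, 1.0), rng.gauss(0, 1.0), 1.0]
            stream.append((x, y))
    return stream


chronological = make_bursty_stream()
shuffled = chronological[:]
random.Random(1).shuffle(shuffled)  # same samples, random order

rate_chrono = pa_mistake_rate(chronological, n_features=3)
rate_shuffled = pa_mistake_rate(shuffled, n_features=3)
print(f"chronological order: {rate_chrono:.3f}")
print(f"random order:        {rate_shuffled:.3f}")
```

Both runs see exactly the same samples, so any gap between the two mistake rates comes from ordering alone. On real traffic, a consistent gap of this kind is the sort of evidence the abstract interprets as temporal correlation; the synthetic data here only demonstrates the procedure and says nothing about the paper's measured results.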

Memo:
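Received: 2016-03-18; Revised:
Funding: National Natural Science Foundation of China (61473149).
About the authors: YI Lei, male, born in 1991, master's student. His main research interest is machine learning and its application to large-scale network traffic classification. PAN Zhisong, male, born in 1973, professor and doctoral supervisor, member of the Pattern Recognition and Artificial Intelligence Technical Committee of the Jiangsu Computer Society. His main research interests are pattern recognition, machine learning, and network security. He has led several national research projects and published more than 30 academic papers. QIU Junyang, male, born in 1989, doctoral student. His main research interest is machine learning and its application to anomaly detection in large-scale network data streams. He has published 2 academic papers.
Corresponding author: YI Lei. E-mail: yileinjut@163.com.
Last Update: 1900-01-01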