[1] WANG Dewen, SUN Zhiwei. A method for cluster analysis of electric power consumers based on in-memory computing[J]. CAAI Transactions on Intelligent Systems, 2015, 10(4): 569-576. [doi:10.3969/j.issn.1673-4785.201411011]

A method for cluster analysis of electric power consumers based on in-memory computing

CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume: 10 Issue: No. 4, 2015 Pages: 569-576 Column: Academic Papers, Machine Learning Publication date: 2015-08-25

Title:: A method for cluster analysis of electric power consumers based on in-memory computing

Author(s):: WANG Dewen, SUN Zhiwei; School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, Hebei, China

Keywords:: big data; smart electricity consumption; resilient distributed dataset; in-memory computing; cluster analysis

CLC number:: TP18

DOI:: 10.3969/j.issn.1673-4785.201411011

Document code:: A

Abstract:: With the rapid growth of electricity consumption data collected by smart meters and data acquisition terminals, traditional data analysis methods can no longer meet the needs of smart power consumption behavior analysis in a big data environment. Because the K-means algorithm is computationally efficient and easy to parallelize, it is improved and parallelized using resilient distributed datasets and a parallel in-memory computing framework, which reduces job running time and I/O time and increases the processing capacity of the cluster analysis. An experimental data set is built by preprocessing electricity measurement data. Experimental results show that the method clusters electric power consumers more accurately than single-machine K-means, that its processing speed and capacity are clearly superior to both the single-machine method and the clustering method based on the MapReduce parallel computing framework, and that it adapts well to data growth.
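The abstract does not give implementation details, so the following is only a minimal sketch of how one K-means iteration can be split into a per-partition step and a merge step, which is the general shape that makes the algorithm easy to parallelize over in-memory partitions. It is not the authors' code and it does not use any cluster framework: partitions are simulated as plain Python lists, and the names `partial_stats`, `merge_stats`, and `kmeans` are hypothetical.

```python
import functools
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Stats = Tuple[List[List[float]], List[int]]  # (per-cluster sums, per-cluster counts)


def nearest(point: Point, centers: Sequence[Point]) -> int:
    """Return the index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])),
    )


def partial_stats(partition: Sequence[Point], centers: Sequence[Point]) -> Stats:
    """Per-partition step: accumulate coordinate sums and counts for each cluster."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    counts = [0] * len(centers)
    for point in partition:
        j = nearest(point, centers)
        counts[j] += 1
        for d, value in enumerate(point):
            sums[j][d] += value
    return sums, counts


def merge_stats(a: Stats, b: Stats) -> Stats:
    """Merge step: add the partial sums and counts of two partitions."""
    sums = [[x + y for x, y in zip(sa, sb)] for sa, sb in zip(a[0], b[0])]
    counts = [x + y for x, y in zip(a[1], b[1])]
    return sums, counts


def kmeans(
    partitions: Sequence[Sequence[Point]],
    centers: Sequence[Point],
    max_iter: int = 20,
    tol: float = 1e-9,
) -> List[Point]:
    """Iterate until the largest center movement is at most `tol` or `max_iter` is hit.

    The partitions stay in memory between iterations; only the small
    (sums, counts) summaries are combined in each round.
    """
    current = [tuple(c) for c in centers]
    for _ in range(max_iter):
        stats = [partial_stats(p, current) for p in partitions]
        sums, counts = functools.reduce(merge_stats, stats)
        # An empty cluster keeps its previous center (an assumption of this sketch).
        updated = [
            tuple(s / counts[j] for s in sums[j]) if counts[j] else current[j]
            for j in range(len(current))
        ]
        shift = max(math.dist(a, b) for a, b in zip(current, updated))
        current = updated
        if shift <= tol:
            break
    return current
```

Calling `kmeans(partitions, initial_centers)` with a list of point lists and a list of starting centers returns the final centers. In a distributed setting the per-partition step would run on each worker against cached data, which is where the reduction in repeated I/O mentioned in the abstract would come from; that mapping is an interpretation, not something the abstract spells out.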



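Memo

Received: 2014-11-10; Revised:
Funding: National Natural Science Foundation of China (61074078); Fundamental Research Funds for the Central Universities (12MS113).
About the authors: WANG Dewen, male, born in 1973, associate professor; main research interests are cloud computing and big data analysis. SUN Zhiwei, male, born in 1987, master's student; main research interests are cloud computing and big data mining.
Corresponding author: SUN Zhiwei. E-mail: sunzw20120901@126.com.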

Last Update: 2015-08-28
