[1] WANG Dewen, SUN Zhiwei. A method for cluster analysis of electric power consumers based on in-memory computing[J]. CAAI Transactions on Intelligent Systems, 2015, 10(4): 569-576. [doi:10.3969/j.issn.1673-4785.201411011]
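Editorial Office of CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
10
Issue:
2015, No. 4
Pages:
569-576
Column:
Academic Papers: Machine Learning
Publication date:
2015-08-25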
- Title:
-
A method for cluster analysis of electric power consumers based on in-memory computing
- Author(s):
-
WANG Dewen, SUN Zhiwei
-
School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, Hebei, China
-
- Keywords:
-
big data; smart electricity consumption; resilient distributed dataset; in-memory computing; cluster analysis
- CLC number:
-
TP18
- DOI:
-
10.3969/j.issn.1673-4785.201411011
- Document code:
-
A
- Abstract:
-
As the electricity consumption data collected by smart meters and data acquisition terminals grow rapidly, traditional data analysis methods can no longer meet the needs of smart power consumption behavior analysis in a big data environment. Because the K-means algorithm is computationally efficient and easy to parallelize, it is improved and parallelized here using resilient distributed datasets (RDDs) and a parallel in-memory computing framework, which reduces job running time and input/output time and increases the processing capacity of cluster analysis. An experimental data set is built by preprocessing electricity consumption measurement data. Experimental results show that the method clusters electric power consumers more accurately than single-machine K-means, that its processing speed and capacity are clearly superior to both the single-machine method and a clustering method based on the MapReduce parallel computing framework, and that it adapts well to data growth.
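The abstract does not give the algorithm in detail, so the following is only a minimal sketch of the general pattern it describes: K-means expressed as a per-partition "map" (assign points, accumulate partial sums) followed by a "reduce" (merge sums, recompute centers), with the partitions held in memory across iterations rather than reloaded each time. It uses plain Python lists in place of RDDs, and the function names (`closest`, `kmeans_step`, `kmeans`) and parameters (`max_iter`, `tol`) are hypothetical, not taken from the paper.

```python
from collections import defaultdict
from math import dist


def closest(point, centers):
    """Index of the center nearest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def kmeans_step(partitions, centers):
    """One iteration: map each partition to partial sums, then reduce by cluster."""
    dim = len(centers[0])

    # "Map" phase: each partition independently builds cluster -> [sum vector, count].
    partials = []
    for part in partitions:
        acc = defaultdict(lambda: [[0.0] * dim, 0])
        for p in part:
            entry = acc[closest(p, centers)]
            entry[0] = [s + x for s, x in zip(entry[0], p)]
            entry[1] += 1
        partials.append(acc)

    # "Reduce" phase: merge the partial sums per cluster and recompute centers.
    new_centers = list(centers)
    for i in range(len(centers)):
        total, count = [0.0] * dim, 0
        for acc in partials:
            if i in acc:
                total = [t + s for t, s in zip(total, acc[i][0])]
                count += acc[i][1]
        if count:  # keep the old center if a cluster received no points
            new_centers[i] = tuple(t / count for t in total)
    return new_centers


def kmeans(partitions, centers, max_iter=20, tol=1e-6):
    """Iterate until no center moves more than `tol`; partitions stay in memory."""
    for _ in range(max_iter):
        updated = kmeans_step(partitions, centers)
        shift = max(dist(a, b) for a, b in zip(centers, updated))
        centers = updated
        if shift < tol:
            break
    return centers
```

Because only the small per-cluster sums cross partition boundaries, the input data never has to be rewritten between iterations. This is the property that in-memory frameworks exploit, compared with MapReduce jobs that reread their input on every pass.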
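Memo
Received: 2014-11-10; Revised: .
Funding: National Natural Science Foundation of China (61074078); Fundamental Research Funds for the Central Universities (12MS113).
About the authors: WANG Dewen, male, born in 1973, associate professor; main research interests are cloud computing and big data analysis. SUN Zhiwei, male, born in 1987, master's student; main research interests are cloud computing and big data mining.
Corresponding author: SUN Zhiwei. E-mail: sunzw20120901@126.com.
Last Update: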
2015-08-28