[1] XU Tengteng, WANG Rui, HUANG Hengjun. Curve clustering algorithms by adding the differences among clusters[J]. CAAI Transactions on Intelligent Systems, 2019, 14(2):362-368. [doi:10.11992/tis.201709029]

Curve clustering algorithms by adding the differences among clusters

CAAI Transactions on Intelligent Systems [ISSN:1673-4785/CN:23-1538/TP]

Volume:
Volume 14
Issue:
2019, Issue 2
Pages:
362-368
Column:
Academic Papers: Intelligent Systems
Publication date:
2019-03-05

Article Info

Title:
Curve clustering algorithms by adding the differences among clusters
Author(s):
XU Tengteng, WANG Rui, HUANG Hengjun
School of Statistics, Lanzhou University of Finance and Economics, Lanzhou 730020, China
Keywords:
functional data; differences among clusters; curve clustering; B-spline; distance metric
CLC number:
TP181
DOI:
10.11992/tis.201709029
Abstract:
As the accuracy and frequency of data collection have improved, functional data have become common, and curve clustering is a fundamental exploratory task in functional data analysis. Existing curve clustering algorithms are designed around within-cluster differences only, which leads to poor separation between curves that belong to different clusters. To address this problem, a new objective function is constructed on the basis of curve fitting and a definition of the distance between curves, and a curve clustering algorithm that considers both within-cluster and between-cluster differences is proposed. Simulation results show that the proposed algorithm improves clustering accuracy. A case study of hourly NO2 concentrations (μg/m3) indicates that the algorithm achieves better separation between clusters.
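To give a rough idea of what an objective that combines within-cluster and between-cluster differences can look like, here is a minimal sketch. It is not the authors' algorithm: the abstract does not state their objective function, so everything below is an assumption made for illustration. The sketch assumes each curve has already been reduced to a vector of B-spline coefficients, takes the squared Euclidean distance between coefficient vectors as the curve distance, and minimizes a hypothetical objective that rewards compact clusters whose centers lie far from the global mean. The names `cluster_curves` and `farthest_point_init`, the weight `lam`, and the seeding rule are all hypothetical.

```python
import numpy as np


def farthest_point_init(coefs, k):
    """Deterministic seeding (an illustrative choice, not from the paper)."""
    g = coefs.mean(axis=0)
    # First seed: the curve farthest from the global mean.
    idx = [int(np.argmax(((coefs - g) ** 2).sum(axis=1)))]
    while len(idx) < k:
        # Next seed: the curve farthest from its nearest already-chosen seed.
        d = ((coefs[:, None, :] - coefs[idx][None, :, :]) ** 2).sum(axis=2)
        idx.append(int(np.argmax(d.min(axis=1))))
    return coefs[idx].copy()


def cluster_curves(coefs, k, lam=0.3, n_iter=50):
    """K-means-type clustering of curves given as coefficient vectors.

    Hypothetical objective (an assumption, not taken from the paper):
        J = sum_i ||c_i - m_{z_i}||^2 - lam * sum_k n_k * ||m_k - g||^2
    where g is the global mean of all coefficient vectors, n_k is the size
    of cluster k, and 0 <= lam < 1 weights the between-cluster term.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    coefs = np.asarray(coefs, dtype=float)
    g = coefs.mean(axis=0)
    centers = farthest_point_init(coefs, k)
    labels = np.full(len(coefs), -1)
    for _ in range(n_iter):
        # Within-cluster term: squared distance of every curve to every center.
        within = ((coefs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        # Between-cluster term: squared distance of each center to the global mean.
        between = ((centers - g) ** 2).sum(axis=1)
        new_labels = np.argmin(within - lam * between[None, :], axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for j in range(k):
            members = coefs[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                # Minimizer of J in m_j for fixed labels (set dJ/dm_j = 0).
                centers[j] = (members.mean(axis=0) - lam * g) / (1.0 - lam)
    return labels, centers
```

With `lam=0` the update reduces to ordinary k-means on the coefficient vectors; larger values of `lam` push the cluster centers away from the global mean, which is one simple way to encode a preference for between-cluster separation.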

Memo

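Received: 2017-09-15.
Funding: National Social Science Fund of China, Youth Program (14CTJ009, 15CTJ004); National Statistical Science Research Key Project (2017LZ43); Longyuan Youth Innovative Talent Support Program (14GSD95).
About the authors: XU Tengteng, male, born in 1992, master's student; main research interests: integration of multi-source heterogeneous data and functional data analysis. WANG Rui, female, born in 1993, master's student; main research interest: economic statistics. HUANG Hengjun, male, born in 1981, professor, Ph.D.; main research interests: integration of multi-source heterogeneous data and functional data analysis. He has led one National Social Science Fund project, received four provincial- and ministerial-level research awards, and published more than 30 academic papers.
Corresponding author: HUANG Hengjun. E-mail: noahwong@163.com
Last Update: 2019-04-25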