
[1]赵燕伟,朱芬,桂方志,等.基于可拓距的改进k-means聚类算法[J].智能系统学报,2020,15(2):344-351.[doi:10.11992/tis.201811020]
　ZHAO Yanwei,ZHU Fen,GUI Fangzhi,et al.Improved k-means algorithm based on extension distance[J].CAAI Transactions on Intelligent Systems,2020,15(2):344-351.[doi:10.11992/tis.201811020]


基于可拓距的改进k-means聚类算法


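CAAI Transactions on Intelligent Systems (《智能系统学报》) [ISSN 1673-4785/CN 23-1538/TP]; Volume: 15; Issue: No. 2, 2020; Pages: 344-351; Section: Academic Papers, Fundamentals of Artificial Intelligence; Publication date: 2020-03-05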

Title:: Improved k-means algorithm based on extension distance

作者:: 赵燕伟¹, 朱芬¹, 桂方志¹, 任设东², 谢智伟¹, 徐晨¹; 1. 浙江工业大学特种装备制造与先进加工技术教育部/浙江省重点实验室, 浙江杭州 310014;
2. 浙江工业大学计算机科学与技术学院, 浙江杭州 310014

Author(s):: ZHAO Yanwei¹, ZHU Fen¹, GUI Fangzhi¹, REN Shedong², XIE Zhiwei¹, XU Chen¹; 1. Key Lab of Special Purpose Equipment and Advanced Manufacturing Technology, Ministry of Education & Zhejiang Province, Zhejiang University of Technology, Hangzhou 310014, China;
2. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310014, China

关键词:: 可拓距; k-means聚类算法; 缩放因子; 初始聚类中心; 密集度; 疏远度

Keywords:: extension distance; k-means clustering algorithm; scaling factor; initial cluster center; intensity; alienation

CLC number:: TP181

DOI:: 10.11992/tis.201811020

摘要:: 针对现有聚类算法在初始聚类中心优化过程中存在首个初始聚类中心点落于边界非密集区域的不足，导致出现算法聚类效果不均衡问题，提出一种基于可拓距优选初始聚类中心的改进k-means算法。将样本经典距离向可拓区间映射，并通过可拓侧距计算方法得到可拓左侧距及可拓右侧距；引入平均可拓侧距概念，将平均可拓左侧距和平均可拓右侧距分别作为样本密集度和聚类中心疏远度的量化指标；在此基础上，给出初始聚类中心选取准则。通过与传统k-means聚类算法进行对比，结果表明改进后的k-means聚类算法选取的初始聚类中心分布更加均匀，聚类效果更好，尤其在对高维数据聚类时具有更高的聚类准确率和更好的均衡性。

Abstract:: Existing methods for optimizing the initial cluster centers of k-means tend to place the first initial center in a sparse region near the data boundary, which leads to unbalanced clustering. To address this, an improved k-means algorithm that selects the initial cluster centers using extension distance is proposed. First, the classical distances between samples are mapped onto an extension interval, and the extension left-side and right-side distances are obtained with the extension side distance calculation method. Then, the concept of average extension side distance is introduced: the average extension left-side distance serves as the quantitative indicator of sample intensity (density), and the average extension right-side distance serves as the quantitative indicator of cluster center alienation. On this basis, a selection criterion for the initial cluster centers is given. Compared with the traditional k-means algorithm, the improved algorithm selects more evenly distributed initial cluster centers and clusters better, with higher clustering accuracy and better balance, particularly on high-dimensional data.
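
The seeding idea in the abstract (start from a dense sample, then add samples that are both dense and far from the centers already chosen) can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the extension side distance formulas, so plain Euclidean distance is swapped in for both indicators. The function name `select_initial_centers`, the mean-distance density score, and the ratio used to combine the two scores are all assumptions made for illustration.

```python
import math


def select_initial_centers(points, k):
    """Pick k seed points: densest sample first, then dense samples far from earlier seeds.

    Illustrative stand-in only: uses plain Euclidean distance, not the
    extension left-side / right-side distances described in the abstract.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Assumed density score: a small mean distance to the other samples
    # indicates a sample that sits in a dense, central region.
    mean_dist = [sum(row) / max(n - 1, 1) for row in dist]

    # First seed: the sample with the smallest mean distance.
    chosen = [min(range(n), key=lambda i: mean_dist[i])]

    while len(chosen) < k:
        def score(i):
            # Assumed alienation score: distance to the nearest chosen seed,
            # divided by mean_dist so that dense candidates are preferred.
            # The "or 1.0" guards the degenerate case of all-identical samples.
            return min(dist[i][c] for c in chosen) / (mean_dist[i] or 1.0)

        candidates = [i for i in range(n) if i not in chosen]
        chosen.append(max(candidates, key=score))

    return [points[i] for i in chosen]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    print(select_initial_centers(data, 2))
```

The returned seeds would then be passed to an ordinary k-means loop in place of random initial centers. The sketch computes all pairwise distances, so it is only practical for small sample sets.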



备注/Memo

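Received: 2018-11-26.
Funding: National Natural Science Foundation of China (51875524); Zhejiang Provincial Public Welfare Technology Application Research Program (2017C31072).
Author biographies: ZHAO Yanwei (赵燕伟), professor, doctoral supervisor, Ph.D. Main research interests: extension design theory and methods, intelligent distribution and optimized scheduling of logistics systems, and modern design of digital products. Author of 4 textbooks and more than 100 academic papers, and recipient of multiple National Natural Science Foundation of China grants. ZHU Fen (朱芬), master's student. Main research interest: extension design. GUI Fangzhi (桂方志), doctoral student. Main research interest: extension design.
Corresponding author: ZHAO Yanwei (1959-). E-mail: ywz@zjut.edu.cn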

