[1] TANG Yiming, CHEN Renhao, LI Bing. A clustering validity index called MAME for the fuzzy c-means algorithm[J]. CAAI Transactions on Intelligent Systems, 2023, 18(5): 945-956. [doi:10.11992/tis.202212028]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP] Volume:
18
Issue:
2023, No. 5
Pages:
945-956
Column:
Academic Papers: Machine Learning
Publication date:
2023-09-05
- Title:
-
A clustering validity index called MAME for the fuzzy c-means algorithm
- Author(s):
-
TANG Yiming (唐益明), CHEN Renhao (陈仁好), LI Bing (李冰)
-
School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230601, China
-
- Keywords:
-
clustering; fuzzy clustering; fuzzy c-means; clustering validity index; internal criteria; external criteria; compactness; separation
- CLC number:
-
TP181;TN99
- DOI:
-
10.11992/tis.202212028
- Abstract:
-
A clustering validity index can be used to evaluate the effectiveness of clustering results and to help determine the number of clusters. However, existing validity indices for the fuzzy c-means algorithm characterize intracluster compactness inadequately and measure intercluster separation inaccurately. To address these issues, we propose a new fuzzy clustering validity index, the maximum-mean (MAME) index, which takes both maximum and mean values into account and is designed from two perspectives: intracluster compactness and intercluster separation. First, taking the overall characteristics of the entire dataset into account, a new fuzzy compactness measure is defined as the ratio between the case in which the data are partitioned into K clusters and the case in which they form a single cluster. Second, a new separation measure is proposed that introduces the maximum and the mean distance between cluster centers. Finally, the MAME index is constructed from the fuzzy compactness measure and the separation measure. On five UCI datasets and six artificial datasets, MAME is compared with nine other clustering validity indices (CH, DB, NPC, PE, FSI, XBI, NPE, WLI, and I). The experimental results demonstrate the accuracy and stability of the proposed index and indicate that MAME is robust.
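The two ingredients named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the MAME formula, so this is not the authors' index: the function names, the use of the standard fuzzy c-means objective as the compactness term, the fuzzifier `m = 2`, and the Euclidean distance are all assumptions made for illustration. The sketch only computes a K-cluster to one-cluster compactness ratio and the maximum and mean distances between cluster centers.

```python
import numpy as np


def fuzzy_compactness(X, centers, U, m=2.0):
    """Sum of u_ik^m * ||x_i - v_k||^2 over all points and clusters.

    Assumption: the standard FCM objective stands in for the paper's
    compactness term, which the abstract does not spell out.
    """
    # Squared Euclidean distance from every point to every center, shape (n, K).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return float(((U ** m) * d2).sum())


def compactness_ratio(X, centers, U, m=2.0):
    """Compactness with K clusters divided by compactness with one cluster."""
    # With a single cluster every membership is 1 and the center is the mean.
    grand_mean = X.mean(axis=0, keepdims=True)
    one_cluster = float(((X - grand_mean) ** 2).sum())
    return fuzzy_compactness(X, centers, U, m) / one_cluster


def center_distances(centers):
    """Return (maximum, mean) pairwise distance between centers; needs K >= 2."""
    i, j = np.triu_indices(len(centers), k=1)
    d = np.linalg.norm(centers[i] - centers[j], axis=1)
    return float(d.max()), float(d.mean())


# Hypothetical toy data: two well-separated groups with crisp memberships.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centers = np.array([[0.0, 1.0], [10.0, 1.0]])
U = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

ratio = compactness_ratio(X, centers, U)
d_max, d_mean = center_distances(centers)
```

A small ratio means the K-cluster partition is much tighter than treating the data as one cluster, and larger center distances suggest better-separated clusters. How the paper combines these quantities into a single score is not stated in the abstract and is left open here.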
Last Update:
1900-01-01