[1] WANG Yue, YANG Yan, WANG Hongjun. An improved transfer fuzzy clustering with few labels[J]. CAAI Transactions on Intelligent Systems, 2016, 11(3): 310-317. [doi:10.11992/tis.201603046]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume:
11
Issue:
2016, No. 3
Page number:
310-317
Column:
Academic Papers: Machine Learning
Publication date:
2016-06-25
- Title:
An improved transfer fuzzy clustering with few labels
- Author(s):
WANG Yue; YANG Yan; WANG Hongjun
School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China
- Keywords:
clustering; transfer learning; semi-supervised; possibilistic C-means; fuzzy C-means
- CLC:
TP301
- DOI:
10.11992/tis.201603046
- Abstract:
Traditional clustering algorithms have difficulty exploiting existing historical information, and they tend to be less effective when the data are contaminated. Semi-supervised clustering algorithms are often used in such circumstances, in which the target data have some labeled examples. For situations in which the source data contain partially labeled samples, we propose a semi-supervised fuzzy possibilistic C-means algorithm (SS-FPCM). Building on the transfer learning framework, we then develop a transfer semi-supervised fuzzy possibilistic C-means algorithm (TSS-FPCM) to avoid the negative transfer problem. Finally, to make full use of the source data information, we replace the source data classes with representative points, which yields an improved transfer semi-supervised fuzzy possibilistic C-means algorithm (ITSS-FPCM). The experimental results show that, compared with other clustering algorithms, all three algorithms can improve clustering performance by using the source data effectively. SS-FPCM and TSS-FPCM exploit the partially labeled data from the source, whereas ITSS-FPCM combines the labeled data with "representative points" and attains excellent clustering results when the data information is insufficient or the data are contaminated.
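For readers unfamiliar with the fuzzy C-means family named in the keywords, the following is a minimal sketch of the standard, unsupervised fuzzy C-means (FCM) iteration. It is not SS-FPCM, TSS-FPCM, or ITSS-FPCM, whose update rules are not given in this abstract. The function name `fuzzy_c_means` and its parameters (`m` as the fuzzifier, `tol`, `seed`) are illustrative choices, not taken from the paper.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means on a list of equal-length tuples of floats.

    Returns (centers, memberships), where memberships[j][i] is the degree
    to which point j belongs to cluster i (each row sums to 1).
    Illustrative baseline only; not the algorithms proposed in the paper.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])

    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for i in range(c):
            weights = [u[j][i] ** m for j in range(n)]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[k] for w, p in zip(weights, points)) / total
                for k in range(dim)
            ))

        # Membership update: u_ji is proportional to d_ji ** (-2 / (m - 1)).
        new_u = []
        for p in points:
            dists = [math.dist(p, v) for v in centers]
            if min(dists) == 0.0:
                # The point coincides with a center: assign it crisply.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                new_u.append([h / sum(hits) for h in hits])
                continue
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_u.append([value / total for value in inv])

        # Stop once no membership changes by more than tol.
        change = max(
            abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb)
        )
        u = new_u
        if change < tol:
            break

    return centers, u
```

Calling `fuzzy_c_means(points, c=2)` on a list of 2-D tuples returns the cluster centers and a membership row per point; taking the index of the largest membership in each row gives a hard label. Semi-supervised and transfer variants such as those in the paper typically add terms to this objective for labeled samples and source-domain knowledge, which this sketch does not attempt.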