[1] ZHANG Xiongtao, CHEN Tianyu, ZHAO Kang, et al. TSK fuzzy classifier based on multi-teacher adaptive knowledge distillation[J]. CAAI Transactions on Intelligent Systems, 2025, 20(5): 1136-1147. doi: 10.11992/tis.202410028.
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 20
Issue: 5, 2025
Pages: 1136-1147
Column: Academic Papers - Machine Perception and Pattern Recognition
Publication date: 2025-09-05
Title: TSK fuzzy classifier based on multi-teacher adaptive knowledge distillation
Author(s): ZHANG Xiongtao1,2; CHEN Tianyu1,2; ZHAO Kang1,2; LI Shuimiao2,3; SHEN Qing1,2
1. School of Information Engineering, Huzhou University, Huzhou 313000, China;
2. Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000, China;
3. Information Technology Center, Huzhou University, Huzhou 313000, China
Keywords: TSK fuzzy classifier; knowledge distillation; multiple teacher networks; adaptive weight allocation; dark knowledge; fuzzy system; different perspectives; deep learning
CLC: TP181
DOI: 10.11992/tis.202410028
Abstract:
Hierarchical and deep fuzzy systems currently demonstrate excellent performance but often suffer from high model complexity. Lightweight Takagi-Sugeno-Kang (TSK) fuzzy classifiers based on distillation learning typically rely on single-teacher knowledge distillation; if the teacher model underperforms, both the distillation effect and the overall model performance are compromised. Furthermore, traditional multi-teacher distillation approaches often assign weights to teacher outputs using label-free strategies, which may allow low-quality teachers to mislead the student model. To address these issues, this paper introduces a TSK fuzzy classifier based on multi-teacher adaptive knowledge distillation (TSK-MTAKD). The method employs multiple deep neural networks with different representational capacities as teacher models. The proposed distillation framework extracts dark knowledge from these teachers and transfers it to a TSK fuzzy system, leveraging the fuzzy system's strong capability to handle uncertainty. Additionally, an adaptive weight allocator computes the cross-entropy between each teacher's output and the true labels; outputs closer to the true labels receive higher weights, which improves model robustness and the quality of the distilled dark knowledge. Experimental results on 13 UCI benchmark datasets validate the advantages of the TSK-MTAKD approach.
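The mechanism the abstract describes is concrete enough to sketch: weight each teacher by the cross-entropy between its output and the true labels, mix the teachers' softened outputs with those weights, and distill the mixture into the student. Below is a minimal PyTorch sketch under stated assumptions; the batch-level weighting via a softmax over negative cross-entropy, the temperature T, and the mixing coefficient alpha are illustrative choices rather than the paper's actual formulation, and the student here is a plain logit tensor standing in for the TSK fuzzy classifier.

```python
import torch
import torch.nn.functional as F

def adaptive_teacher_weights(teacher_logits, labels):
    # Lower cross-entropy against the true labels -> higher weight
    # (softmax over negative CE, one weight per teacher; assumed scheme).
    ces = torch.stack([F.cross_entropy(t, labels) for t in teacher_logits])
    return F.softmax(-ces, dim=0)

def mixed_soft_targets(teacher_logits, weights, T=2.0):
    # Weighted mixture of the teachers' temperature-softened outputs
    # ("dark knowledge") used as the distillation target.
    soft = torch.stack([F.softmax(t / T, dim=1) for t in teacher_logits])
    return torch.einsum('t,tbc->bc', weights, soft)

def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Distillation loss: KL divergence to the adaptively weighted teacher
    # mixture plus the usual hard-label cross-entropy on the student.
    w = adaptive_teacher_weights(teacher_logits, labels)
    targets = mixed_soft_targets(teacher_logits, w, T)
    soft_loss = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                         targets, reduction='batchmean') * (T * T)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

# Toy usage: three teachers, a batch of 8 samples, 4 classes.
if __name__ == "__main__":
    torch.manual_seed(0)
    labels = torch.randint(0, 4, (8,))
    teachers = [torch.randn(8, 4) for _ in range(3)]
    student = torch.randn(8, 4, requires_grad=True)
    loss = kd_loss(student, teachers, labels)
    loss.backward()
    print(float(loss))
```

A per-sample variant would compute the cross-entropy with reduction='none' and normalize the weights for each sample separately; in either form the allocator keeps low-quality teachers from dominating the soft targets.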