ZHANG Xiongtao, CHEN Tianyu, ZHAO Kang, et al. TSK fuzzy classifier based on multi-teacher adaptive knowledge distillation[J]. CAAI Transactions on Intelligent Systems, 2025, 20(5): 1136-1147. [doi:10.11992/tis.202410028]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785 / CN 23-1538/TP]
Volume: 20
Issue: 2025, No. 5
Pages: 1136-1147
Column: Academic Papers - Machine Perception and Pattern Recognition
Publication date: 2025-09-05
- Title: TSK fuzzy classifier based on multi-teacher adaptive knowledge distillation
- Author(s): ZHANG Xiongtao1,2, CHEN Tianyu1,2, ZHAO Kang1,2, LI Shuimiao2,3, SHEN Qing1,2
  1. School of Information Engineering, Huzhou University, Huzhou 313000, China;
  2. Zhejiang Province Key Laboratory of Smart Management and Application of Modern Agricultural Resources, Huzhou 313000, China;
  3. Information Technology Center, Huzhou University, Huzhou 313000, China
- Keywords: TSK fuzzy classifier; knowledge distillation; multiple teacher networks; adaptive allocation of weights; dark knowledge; fuzzy system; different perspectives; deep learning
- CLC number: TP181
- DOI: 10.11992/tis.202410028
- Abstract: Currently, hierarchical and deep fuzzy systems demonstrate excellent performance, but they often suffer from high model complexity. Lightweight Takagi-Sugeno-Kang (TSK) fuzzy classifiers based on distillation learning typically rely on single-teacher knowledge distillation; if the teacher model underperforms, both the distillation effect and the overall model performance are compromised. Furthermore, traditional multi-teacher distillation approaches often assign weights to teacher outputs using label-free strategies, which may allow low-quality teachers to mislead the student model. To address these issues, this paper introduces a TSK fuzzy classifier based on multi-teacher adaptive knowledge distillation (TSK-MTAKD). The method employs multiple deep neural networks with different representational capabilities as teacher models. The proposed multi-teacher distillation framework extracts dark knowledge from these models and transfers it to a TSK fuzzy system, leveraging the fuzzy system's strong capability to handle uncertainty. Additionally, an adaptive weight allocator computes the cross-entropy between each teacher's output and the true labels; outputs closer to the true labels are assigned higher weights, improving model robustness and the quality of the transferred dark knowledge. Experimental results on 13 UCI benchmark datasets validate the advantages of TSK-MTAKD.
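The weighting rule described in the abstract lends itself to a short illustration. Below is a minimal PyTorch sketch, based only on the description above, of how such an adaptive weight allocator and the resulting multi-teacher distillation loss could look. The function names, the temperature T, the blend factor alpha, and the softmax-over-negative-cross-entropy weighting are illustrative assumptions, not the paper's actual method; in the paper, the student is a TSK fuzzy system rather than a generic logit-producing model.

```python
# A hedged sketch of adaptive multi-teacher weight allocation: each teacher
# is scored by cross-entropy against the true labels, and teachers whose
# outputs are closer to the truth receive higher weights. All names and
# hyperparameters here are illustrative assumptions.
import torch
import torch.nn.functional as F

def allocate_weights(teacher_logits_list, labels):
    """Lower cross-entropy (a better teacher on this batch) yields a higher
    weight; softmax over negative CE is one plausible realization."""
    ce_per_teacher = torch.stack(
        [F.cross_entropy(logits, labels) for logits in teacher_logits_list]
    )
    return F.softmax(-ce_per_teacher, dim=0)

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          T=4.0, alpha=0.5):
    """KL divergence between the student's softened output and the weighted
    mixture of teacher soft labels, blended with hard-label cross-entropy."""
    weights = allocate_weights(teacher_logits_list, labels)
    soft_targets = sum(
        w * F.softmax(logits / T, dim=1)
        for w, logits in zip(weights, teacher_logits_list)
    )
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        soft_targets,
        reduction="batchmean",
    ) * (T * T)  # conventional temperature-squared scaling
    hard = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * hard
```

In this sketch, a low-quality teacher that disagrees with the true labels incurs a high cross-entropy and is automatically down-weighted, which is the behavior the abstract attributes to the adaptive weight allocator.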
Memo
Received: 2024-10-22.
Funding: National Natural Science Foundation of China (62376094, U22A201856).
Author bios: ZHANG Xiongtao, associate professor, Ph.D.; research interests: artificial intelligence and pattern recognition, machine learning. E-mail: 1047897965@qq.com. CHEN Tianyu, master's student; research interests: fuzzy systems, deep learning. E-mail: 2529935825@qq.com. SHEN Qing, professor, Ph.D.; research interests: intelligent information processing, intelligent transportation. E-mail: sq@zjhu.edu.cn.
Corresponding author: SHEN Qing. E-mail: sq@zjhu.edu.cn
Last update: 2025-09-05