QIAN Dong, WANG Bei, ZHANG Tao, et al. Classification algorithm based on Copula theory and Bayesian decision theory[J]. CAAI Transactions on Intelligent Systems, 2016, 11(1): 78-83. [doi:10.11992/tis.201509011]





Classification algorithm based on Copula theory and Bayesian decision theory
QIAN Dong (1), WANG Bei (1), ZHANG Tao (2), WANG Xingyu (1)
1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China;
2. Department of Automation, Tsinghua University, Beijing 100086, China
Keywords: machine learning; Bayesian decision theory; Copula theory; kernel density estimation; physiological signals
Traditional Bayesian decision classifiers are sensitive to how the class-conditional probability densities are estimated, and a poor estimate may lead to incorrect classification. This paper therefore proposes an improved classification algorithm based on Bayesian decision theory, the Bayesian-Copula Discriminant Classifier (BCDC). Instead of making assumptions about the form of the class-conditional probability densities, the method constructs them by combining Copula theory with kernel density estimation. Kernel density estimation is used to smooth the probability distribution of each feature. The probability integral transform then converts each continuous distribution into a random variable with a uniform distribution. Copula functions are used to model the dependency structure between these distributions for each of the two categories. The parameters of the Copula functions are determined by maximum likelihood estimation, and a well-fitted Copula function is selected for each of the two categories according to the Bayesian information criterion. The BCDC method was validated on experimental datasets of physiological signals. The results showed that the proposed method outperforms other traditional methods in classification accuracy, AUC, and robustness. By taking full advantage of Copula theory and kernel density estimation, it improves the accuracy and flexibility of the density estimation.
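The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: only a Gaussian copula is used (the paper selects among Copula families by the Bayesian information criterion), a normal-scores correlation estimate stands in for maximum likelihood estimation, and the names `ClassDensity`, `fit`, and `predict` are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm, multivariate_normal


class ClassDensity:
    """Class-conditional density: KDE marginals joined by a Gaussian copula.

    Assumption: a single Gaussian copula replaces the BIC-based choice
    among several Copula families described in the abstract.
    """

    def __init__(self, X):
        # One univariate kernel density estimate per feature (column of X).
        self.kdes = [gaussian_kde(X[:, j]) for j in range(X.shape[1])]
        # Copula correlation from the normal scores of the training data
        # (a simple stand-in for maximum likelihood estimation).
        Z = norm.ppf(self._pit(X))
        self.R = np.corrcoef(Z, rowvar=False)

    def _pit(self, X):
        # Probability integral transform u_j = F_j(x_j), using the KDE's CDF.
        U = np.column_stack([
            [kde.integrate_box_1d(-np.inf, x) for x in X[:, j]]
            for j, kde in enumerate(self.kdes)
        ])
        # Keep u strictly inside (0, 1) so norm.ppf stays finite.
        return np.clip(U, 1e-6, 1 - 1e-6)

    def logpdf(self, X):
        Z = norm.ppf(self._pit(X))
        d = len(self.kdes)
        # Gaussian copula log-density: log N(z; 0, R) - sum_j log phi(z_j).
        log_copula = (multivariate_normal(mean=np.zeros(d), cov=self.R).logpdf(Z)
                      - norm.logpdf(Z).sum(axis=1))
        # Sum of the log marginal densities from the per-feature KDEs.
        log_marginals = sum(np.log(kde(X[:, j])) for j, kde in enumerate(self.kdes))
        return log_copula + log_marginals


def fit(X, y):
    """Estimate class priors and one copula-based density per class."""
    classes = np.unique(y)
    priors = {c: np.mean(y == c) for c in classes}
    densities = {c: ClassDensity(X[y == c]) for c in classes}
    return classes, priors, densities


def predict(model, X):
    """Bayes decision rule: pick the class with the largest log posterior score."""
    classes, priors, densities = model
    scores = np.column_stack(
        [np.log(priors[c]) + densities[c].logpdf(X) for c in classes]
    )
    return classes[np.argmax(scores, axis=1)]
```

Calling `predict(fit(X_train, y_train), X_test)` returns one label per test row. Working in log space keeps the product of marginal and copula densities numerically stable; extending the sketch toward the described method would mean fitting several Copula families per class and keeping the one with the lowest BIC.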





