[1] DENG Wei, XING Yuhan, LI Yifan, et al. Survey on fair machine learning[J]. CAAI Transactions on Intelligent Systems, 2020, 15(3): 578-586. [doi:10.11992/tis.202007004]

Survey on fair machine learning

CAAI Transactions on Intelligent Systems [ISSN: 1673-4785 / CN: 23-1538/TP]

Volume:
Vol. 15
Issue:
No. 3, 2020
Pages:
578-586
Column:
Artificial Intelligence Deans' Forum
Publication date:
2020-09-05

Article Info

Title:
Survey on fair machine learning
Author(s):
DENG Wei (1, 2), XING Yuhan (1), LI Yifan (1), LI Zhenhua (3), WANG Guoyin (2)
1. Center of Statistical Research, Southwestern University of Finance and Economics, Chengdu 611130, China;
2. Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China;
3. School of Finance, Southwestern University of Finance and Economics, Chengdu 611130, China
Keywords:
algorithmic ethics; algorithmic discrimination; fairness; fair machine learning; fair indicator; fair design; fair dataset; dynamicity
Classification number:
TP181
DOI:
10.11992/tis.202007004
Abstract:
With the widespread application of machine learning in society, the resulting discrimination problems have caused broad social controversy, which has gradually aroused strong interest in the fairness of machine learning algorithms in both industry and academia. Research on fairness measurement and on mechanisms for fair machine learning is still in its infancy. This paper surveys research on fair machine learning. Starting from the definitions of fairness, it compares the methods used to measure fairness indicators. It then reviews common fairness datasets and analyzes how fairness problems arise. Next, it classifies and compares existing fair machine learning algorithms. Finally, it summarizes the open problems in current fair machine learning research and discusses the key problems and major challenges.

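Memo

Received: 2020-07-02.
Funding: Key Program of the National Natural Science Foundation of China (61936001)
Author biographies: DENG Wei, lecturer and postdoctoral researcher. His main research interests are knowledge graphs, machine behavior, computational social science, and algorithmic ethics. In recent years he has participated in 3 national-level projects, including a Key Program of the National Natural Science Foundation of China and the National Key Research and Development Program. He has applied for more than 10 national invention patents, published more than 30 academic papers, and published 1 academic monograph. XING Yuhan, master's student. Her main research interests are fair machine learning and data science. WANG Guoyin, professor and doctoral supervisor, vice president of Chongqing University of Posts and Telecommunications, dean of its Graduate School and of its School of Artificial Intelligence, and vice chairman of the Chinese Association for Artificial Intelligence. His main research interests are rough sets, granular computing, and cognitive computing. In recent years he has led several projects under the National Key Research and Development Program and Key Programs of the National Natural Science Foundation of China. He has been selected as a "Changjiang Scholar" Distinguished Professor of the Ministry of Education and as a leading talent of the "Ten Thousand Talents Program". He has published more than 300 academic papers and more than 10 monographs.
Corresponding author: WANG Guoyin. E-mail: wanggy@cqupt.edu.cn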
Last Update: 1900-01-01