[1] WANG Ding, MEN Changqian, WANG Wenjian. A kernel contextual bandit recommendation algorithm[J]. CAAI Transactions on Intelligent Systems, 2022, 17(3): 625-633. [doi:10.11992/tis.202105039]
CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 17
Issue: No. 3, 2022
Pages: 625-633
Column: Artificial Intelligence Deans' Forum
Publication date: 2022-05-05
- Title:
-
A kernel contextual bandit recommendation algorithm
- Author(s):
-
WANG Ding1, MEN Changqian1, WANG Wenjian1,2
-
1. College of Computer and Information Technology, Shanxi University, Taiyuan 030006, China;
2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
-
- Keywords:
-
personalized recommendation; changing scenarios; multi-armed bandits; linear contextual bandits; kernel method; click-through rate; nonlinear; exploration–exploitation dilemma
- CLC number:
-
TP181
- DOI:
-
10.11992/tis.202105039
- Abstract:
-
Personalized recommendation services are increasingly important in the Internet era, but conventional recommendation algorithms adapt poorly to highly changing scenarios. Applying the linear contextual bandit algorithm LinUCB (linear upper confidence bound) to personalized recommendation mitigates these limitations, but its recommendation accuracy remains unsatisfactory. To address this, the paper proposes an improved algorithm, K-UCB (kernel upper confidence bound). K-UCB drops the unrealistic linearity assumption of LinUCB and uses the kernel method to fit the nonlinear relation between the expected reward and the context. This yields a new way of computing the upper confidence bound of the estimated reward on nonlinear data, which is used to resolve the exploration–exploitation dilemma in the recommendation process. Experiments show that K-UCB achieves a higher click-through rate (CTR) than other recommendation algorithms based on multi-armed bandits and better meets the needs of personalized recommendation in changing scenarios.
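The abstract does not give the K-UCB formulas, so the following is only a minimal sketch of the general idea of a kernelized upper confidence bound, not the authors' algorithm. It assumes a Gaussian (RBF) kernel, a kernel ridge regression estimate of the reward, and a Gaussian-process-style variance as the confidence width. The names `rbf_kernel`, `kernel_ucb_scores`, `lam`, `alpha`, and `gamma` are illustrative and do not come from the paper.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def kernel_ucb_scores(X_hist, y_hist, X_cand, lam=1.0, alpha=1.0, gamma=1.0):
    """Return (score, mean, width) for each candidate context.

    Assumed model (not taken from the paper):
      mean  = k(x)^T (K + lam*I)^{-1} y
      width = sqrt(k(x, x) - k(x)^T (K + lam*I)^{-1} k(x))
      score = mean + alpha * width
    """
    K = rbf_kernel(X_hist, X_hist, gamma)          # (n, n) history Gram matrix
    k_star = rbf_kernel(X_hist, X_cand, gamma)     # (n, m) history vs. candidates
    A = K + lam * np.eye(len(X_hist))              # regularized Gram matrix
    mean = k_star.T @ np.linalg.solve(A, y_hist)   # predicted reward per candidate
    # For the RBF kernel, k(x, x) = 1 for every x.
    var = 1.0 - np.sum(k_star * np.linalg.solve(A, k_star), axis=0)
    width = np.sqrt(np.maximum(var, 0.0))          # exploration bonus
    return mean + alpha * width, mean, width


# Hypothetical toy data: two observed contexts with click (1) / no click (0).
X_hist = np.array([[0.0, 0.0], [1.0, 0.0]])
y_hist = np.array([1.0, 0.0])
# Two candidate items: one already seen, one far from all history.
X_cand = np.array([[0.0, 0.0], [5.0, 5.0]])

scores, mean, width = kernel_ucb_scores(X_hist, y_hist, X_cand)
chosen = int(np.argmax(scores))  # index of the item to recommend this round
```

In this sketch, a candidate far from all observed contexts gets a predicted reward near zero but a large width, so `alpha` controls how strongly unexplored items are favored over items with a known reward.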
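Memo
Received: 2021-05-26.
Funding: National Natural Science Foundation of China (62076154, U1805263); Central Government Guided Local Science and Technology Development Fund (YDZX20201400001224); Natural Science Foundation of Shanxi Province (201901D111030); Key R&D Program for International Science and Technology Cooperation of Shanxi Province (201903D421050).
About the authors: WANG Ding, master's student; research interest: machine learning. MEN Changqian, lecturer; research interests: support vector machines, machine learning theory, and kernel methods. WANG Wenjian, professor and doctoral supervisor, dean of the College of Computer and Information Technology, Shanxi University; research interests: computational intelligence, machine learning, and data mining. She has led 4 National Natural Science Foundation of China projects and published more than 150 academic papers.
Corresponding author: WANG Wenjian. E-mail: wjwang@sxu.edu.cn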
Last Update:
1900-01-01