[1] WEN Jiayan, WANG Yibo, XIN Huajian, et al. Intelligent connected vehicle path planning based on optimized deep Q-network[J]. CAAI Transactions on Intelligent Systems, 2026, 21(1): 226-235. [doi:10.11992/tis.202502010]
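CAAI Transactions on Intelligent Systems [ISSN 1673-4785/CN 23-1538/TP]
Volume: 21
Issue: 2026, No. 1
Pages: 226-235
Column: Wu Wenjun Artificial Intelligence Science and Technology Award Forum
Publication date: 2026-03-05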
- Title:
-
Intelligent connected vehicle path planning based on optimized deep Q-network
- Author(s):
-
WEN Jiayan 1,2, WANG Yibo 1,2, XIN Huajian 3, XIE Guangming 4
-
1. School of Automation, Guangxi University of Science and Technology, Liuzhou 545616, China;
2. The Research Center for Intelligent Cooperation and Cross-application, Guangxi University of Science and Technology, Liuzhou 545616, China;
3. Guangxi Vocational and Technical College of Industry, Nanning 530001, China;
4. College of Engineering, Peking University, Beijing 100871, China
-
- Keywords:
-
intelligent connected vehicles; path planning; unstructured environment; attention mechanism; experience replay; obstacle avoidance; deep Q-network; deep reinforcement learning
- CLC number:
-
TP183;TP2
- DOI:
-
10.11992/tis.202502010
- Abstract:
-
Conventional deep Q-network (DQN) algorithms applied to path planning for intelligent connected vehicles in unstructured environments suffer from low planning efficiency, slow convergence, and poor generalization. This paper proposes a DQN planning method that combines an attention mechanism with experience classification. The experience replay pool is designed around the attention mechanism, and dynamic weight allocation resolves conflicts in multi-objective optimization, which improves experience utilization in similar environments, reduces planning time, and accelerates convergence. Non-sparse reward constraints are constructed, and the state space is optimized according to the characteristics of the traffic environment, so that the method adapts to multi-objective scenarios and generalizes across scenarios. Simulations show that the optimized algorithm increases the average planning speed by 28.6% and shortens the travel distance by 25.2% compared with the algorithm before optimization. In addition, when training data are loaded in different scenarios, the time to the first successful plan is shortened by 32.8%.
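The abstract names two ingredients, an attention-based experience replay pool and a non-sparse reward, without giving their formulas. The following is only a minimal sketch of how such components might look, not the authors' implementation. It assumes that "attention" means sampling stored transitions with probability given by a softmax over scaled dot-product similarity between the current state and each stored state, and that the non-sparse reward pays for progress toward the goal at every step. The names `AttentionReplayBuffer`, `dense_reward`, `step_cost`, and `progress_gain` are hypothetical, and the paper's experience classification and dynamic weight allocation are not modeled here.

```python
import math
import random
from collections import deque


class AttentionReplayBuffer:
    """Hypothetical replay pool (illustrative assumption, not the paper's design).

    Transitions are sampled with probability softmax(query . key / sqrt(d)),
    where the query is the current state and each key is a stored state.
    """

    def __init__(self, capacity=1000, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((tuple(state), action, reward, tuple(next_state), done))

    def attention_weights(self, query):
        d = len(query)
        # Scaled dot-product score between the query and every stored state.
        scores = [sum(q * k for q, k in zip(query, t[0])) / math.sqrt(d)
                  for t in self.buffer]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def sample(self, query, batch_size):
        # Sampling with replacement, biased toward states similar to the query.
        weights = self.attention_weights(query)
        return self.rng.choices(list(self.buffer), weights=weights, k=batch_size)


def dense_reward(pos, next_pos, goal, collided, reached,
                 step_cost=0.01, progress_gain=1.0):
    """Hypothetical non-sparse reward: a signal at every step, not only at the goal."""
    if collided:
        return -1.0
    if reached:
        return 1.0
    # Positive when the vehicle moved closer to the goal, negative otherwise.
    progress = math.dist(pos, goal) - math.dist(next_pos, goal)
    return progress_gain * progress - step_cost
```

In a training loop, `sample` would be called with the vehicle's current state as the query, so that transitions collected in similar situations are replayed more often. Raw dot products favor states with large norms, so a real implementation would likely normalize the state vectors or use learned projections.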
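Memo
Received: 2025-02-24.
Funding: National Natural Science Foundation of China (62541306, 619630060); Guangxi Science and Technology Major Project (Guike AA24206054).
About the authors:
WEN Jiayan, professor, doctoral supervisor, member of the Youth Working Committee of the Chinese Association of Automation. Main research interests: cooperative control of multi-agent systems and platoon control of intelligent connected vehicles. Currently leads 8 projects funded by the National Natural Science Foundation of China and provincial or ministerial funds, holds 10 granted patents, and has published 35 academic papers. E-mail: wenjiayan2012@126.com.
XIN Huajian, associate professor, member of the Robotics Technical Committee of the China Simulation Federation. Has led and completed 1 key project of Guangxi vocational education teaching reform, 1 key project of the Guangxi Education Science Planning program, and 2 Guangxi research projects for young and middle-aged teachers. Has published more than 20 academic papers and served as chief editor of 2 textbooks. E-mail: 13659619535@163.com.
XIE Guangming, professor, doctoral supervisor. Main research interests: intelligent bionic robots, complex systems and multi-robot control, and special underwater robot technology. As core principal investigator, has led multiple national research projects, including key and general projects of the National Natural Science Foundation of China. Holds more than 10 granted invention patents; has received the Second Prize of the National Natural Science Award, the First Prize of the Ministry of Education Natural Science Award, and the Second Prize of the Wu Wenjun Artificial Intelligence Science and Technology Innovation Award; has published more than 200 academic papers. E-mail: xiegming@pku.edu.cn.
Corresponding author: XIN Huajian. E-mail: 13659619535@163.com
Last Update: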
2026-01-05