[1]陶鑫钰,王艳,纪志成.基于深度强化学习的节能工艺路线发现方法[J].智能系统学报,2023,18(1):23-35.[doi:10.11992/tis.202112030]
 TAO Xinyu,WANG Yan,JI Zhicheng.Energy-saving process route discovery method based on deep reinforcement learning[J].CAAI Transactions on Intelligent Systems,2023,18(1):23-35.[doi:10.11992/tis.202112030]

Energy-saving process route discovery method based on deep reinforcement learning

参考文献/References:
[1] HALIM A H, ISMAIL I. Combinatorial optimization: comparison of heuristic algorithms in travelling salesman problem[J]. Archives of computational methods in engineering, 2019, 26(2): 367–380.
[2] REZOUG A, BADER-EL-DEN M, BOUGHACI D. Guided genetic algorithm for the multidimensional knapsack problem[J]. Memetic computing, 2018, 10(1): 29–42.
[3] KIEFFER E, DANOY G, BRUST M R, et al. Tackling large-scale and combinatorial bi-level problems with a genetic programming hyper-heuristic[J]. IEEE transactions on evolutionary computation, 2020, 24(1): 44–56.
[4] 陈科胜, 鲜思东, 郭鹏. 求解旅行商问题的自适应升温模拟退火算法[J]. 控制理论与应用, 2021, 38(2): 245–254.
CHEN Kesheng, XIAN Sidong, GUO Peng. Adaptive heating simulated annealing algorithm for solving the traveling salesman problem[J]. Control theory & applications, 2021, 38(2): 245–254.
[5] 何庆, 吴意乐, 徐同伟. 改进遗传模拟退火算法在TSP优化中的应用[J]. 控制与决策, 2018, 33(2): 219–225.
HE Qing, WU Yile, XU Tongwei. Application of improved genetic simulated annealing algorithm in TSP optimization[J]. Control and decision, 2018, 33(2): 219–225.
[6] JOY J, RAJEEV S, ABRAHAM E C. Particle swarm optimization for multi resource constrained project scheduling problem with varying resource levels[J]. Materials today: proceedings, 2021, 47: 5125–5129.
[7] PETROVIĆ M, VUKOVIĆ N, MITIĆ M, et al. Integration of process planning and scheduling using chaotic particle swarm optimization algorithm[J]. Expert systems with applications, 2016, 64: 569–588.
[8] VAFADAR A, HAYWARD K, TOLOUEI-RAD M. Drilling reconfigurable machine tool selection and process parameters optimization as a function of product demand[J]. Journal of manufacturing systems, 2017, 45: 58–69.
[9] WU Xiuli, LI Jing. Two layered approaches integrating harmony search with genetic algorithm for the integrated process planning and scheduling problem[J]. Computers & industrial engineering, 2021, 155: 107194.
[10] MA G H, ZHANG Y F, NEE A Y C. A simulated annealing-based optimization algorithm for process planning[J]. International journal of production research, 2000, 38(12): 2671–2687.
[11] 施伟, 冯旸赫, 程光权, 等. 基于深度强化学习的多机协同空战方法研究[J]. 自动化学报, 2021, 47(7): 1610–1623.
SHI Wei, FENG Yanghe, CHENG Guangquan, et al. Research on multi-aircraft collaborative air combat method based on deep reinforcement learning[J]. Acta automatica sinica, 2021, 47(7): 1610–1623.
[12] ZHOU Wenhong, LIU Zhihong, LI Jie, et al. Multi-target tracking for unmanned aerial vehicle swarms using deep reinforcement learning[J]. Neurocomputing, 2021, 466: 285–297.
[13] 王云鹏, 郭戈. 基于深度强化学习的有轨电车信号优先控制[J]. 自动化学报, 2019, 45(12): 2366–2377.
WANG Yunpeng, GUO Ge. Tram signal priority control based on deep reinforcement learning[J]. Acta automatica sinica, 2019, 45(12): 2366–2377.
[14] GUO Ge, WANG Yunpeng. An integrated MPC and deep reinforcement learning approach to trams-priority active signal control[J]. Control engineering practice, 2021, 110(5): 104758.
[15] PENG Bile, KESKIN M F, KULCSÁR B, et al. Connected autonomous vehicles for improving mixed traffic efficiency in unsignalized intersections with deep reinforcement learning[J]. Communications in transportation research, 2021, 1: 100017.
[16] 吴晓光, 刘绍维, 杨磊, 等. 基于深度强化学习的双足机器人斜坡步态控制方法[J]. 自动化学报, 2021, 47(8): 1976–1987.
WU Xiaoguang, LIU Shaowei, YANG Lei, et al. A slope gait control method for bipedal robots based on deep reinforcement learning[J]. Acta automatica sinica, 2021, 47(8): 1976–1987.
[17] JIANG Rong, WANG Zhipeng, HE Bin, et al. A data-efficient goal-directed deep reinforcement learning method for robot visuomotor skill[J]. Neurocomputing, 2021, 462: 389–401.
[18] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529: 484–489.
[19] VINYALS O, BABUSCHKIN I, CZARNECKI W M, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning[J]. Nature, 2019, 575: 350–354.
[20] BERNER C, et al. Dota 2 with large scale deep reinforcement learning[EB/OL]. (2019-10-1)[2021-12-14]. https://arxiv.org/abs/1912.06680.
[21] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with deep reinforcement learning[EB/OL]. (2013-12-19)[2021-12-14]. https://arxiv.org/abs/1312.5602.
[22] LUO Shu, ZHANG Linxuan, FAN Yushun. Dynamic multi-objective scheduling for flexible job shop by deep reinforcement learning[J]. Computers & industrial engineering, 2021, 159: 107489.
[23] SCHAUL T, QUAN J, ANTONOGLOU I, et al. Prioritized experience replay[EB/OL]. (2016-2-25)[2021-12-14]. https://arxiv.org/abs/1511.05952.
[24] VAN HASSELT H, GUEZ A, SILVER D. Deep reinforcement learning with double Q-learning[EB/OL]. (2015-12-8)[2021-12-14]. https://arxiv.org/abs/1509.06461.
[25] LIU Xiaojun, YI Hong, NI Zhonghua. Application of ant colony optimization algorithm in process planning optimization[J]. Journal of intelligent manufacturing, 2013, 24(1): 1–13.
相似文献/Similar References:
[1]周文吉,俞扬.分层强化学习综述[J].智能系统学报,2017,12(5):590.[doi:10.11992/tis.201706031]
 ZHOU Wenji,YU Yang.Summarize of hierarchical reinforcement learning[J].CAAI Transactions on Intelligent Systems,2017,12(5):590.[doi:10.11992/tis.201706031]
[2]王作为,徐征,张汝波,等.记忆神经网络在机器人导航领域的应用与研究进展[J].智能系统学报,2020,15(5):835.[doi:10.11992/tis.202002020]
 WANG Zuowei,XU Zheng,ZHANG Rubo,et al.Research progress and application of memory neural network in robot navigation[J].CAAI Transactions on Intelligent Systems,2020,15(5):835.[doi:10.11992/tis.202002020]
[3]杨瑞,严江鹏,李秀.强化学习稀疏奖励算法研究——理论与实验[J].智能系统学报,2020,15(5):888.[doi:10.11992/tis.202003031]
 YANG Rui,YAN Jiangpeng,LI Xiu.Survey of sparse reward algorithms in reinforcement learning — theory and experiment[J].CAAI Transactions on Intelligent Systems,2020,15(5):888.[doi:10.11992/tis.202003031]
[4]赵玉新,杜登辉,成小会,等.基于强化学习的海洋移动观测网络观测路径规划方法[J].智能系统学报,2022,17(1):192.[doi:10.11992/tis.202106004]
 ZHAO Yuxin,DU Denghui,CHENG Xiaohui,et al.Path planning for mobile ocean observation network based on reinforcement learning[J].CAAI Transactions on Intelligent Systems,2022,17(1):192.[doi:10.11992/tis.202106004]
[5]欧阳勇平,魏长赟,蔡帛良.动态环境下分布式异构多机器人避障方法研究[J].智能系统学报,2022,17(4):752.[doi:10.11992/tis.202106044]
 OUYANG Yongping,WEI Changyun,CAI Boliang.Collision avoidance approach for distributed heterogeneous multirobot systems in dynamic environments[J].CAAI Transactions on Intelligent Systems,2022,17(4):752.[doi:10.11992/tis.202106044]
[6]王竣禾,姜勇.基于深度强化学习的动态装配算法[J].智能系统学报,2023,18(1):2.[doi:10.11992/tis.202201006]
 WANG Junhe,JIANG Yong.Dynamic assembly algorithm based on deep reinforcement learning[J].CAAI Transactions on Intelligent Systems,2023,18(1):2.[doi:10.11992/tis.202201006]
[7]张钰欣,赵恩娇,赵玉新.规则耦合下的多异构子网络MADDPG博弈对抗算法[J].智能系统学报,2024,19(1):190.[doi:10.11992/tis.202303037]
 ZHANG Yuxin,ZHAO Enjiao,ZHAO Yuxin.MADDPG game confrontation algorithm for multiple heterogeneous subnetworks under rule coupling[J].CAAI Transactions on Intelligent Systems,2024,19(1):190.[doi:10.11992/tis.202303037]

备注/Memo

Received: 2021-12-14.
Foundation item: National Key Research and Development Program of China (2018YFB1701903).
About the authors: TAO Xinyu, master's student; his main research interest is the application of deep reinforcement learning to process routing. WANG Yan, professor and doctoral supervisor, technical leader in the integrated application of industrial Internet of Things technology; her main research interest is collaborative optimization of discrete-manufacturing energy-consumption networks based on big-data knowledge automation. She has undertaken two National Natural Science Foundation of China projects, one China Postdoctoral Science Foundation special grant project, one Jiangsu Provincial Natural Science Foundation project, and one Ministry of Education humanities and social sciences planning fund project, and has published nearly 100 academic papers. JI Zhicheng, professor and doctoral supervisor; his main research interest is the integration and optimization of the manufacturing Internet of Things. He has filed and been granted more than 40 invention patents, registered more than 100 software copyrights, published more than 200 academic papers, and authored one academic monograph.
Corresponding author: WANG Yan. E-mail: wangyan88@jiangnan.edu.cn.

Copyright © Editorial Office of CAAI Transactions on Intelligent Systems (《智能系统学报》)
Address: Building 145-1, Nantong Street, Nangang District, Harbin 150001, Heilongjiang Province. Tel: 0451-82534001, 82518134. E-mail: tis@vip.sina.com