Differential Variable Speed Limit Control Strategy Based on Reinforcement Learning
Abstract: Merging vehicles affect the mainline lanes of a freeway merging area to different degrees. To account for these lane-by-lane differences, this paper studies a lane-level differential variable speed limit (DVSL) control strategy based on reinforcement learning (RL). Because the DVSL control problem has a high-dimensional action space that is difficult to solve, the action space is reduced by using speed-limit change values as actions, and the state space and a multi-factor reward function are defined accordingly. Prioritized experience replay (PER) is used to improve training efficiency and model performance, and an inter-lane safety detection mechanism assists the training of the PER-DDQN agent so that the lane-level variable speed limit model remains implementable. The strategy is tested in the SUMO simulator. Compared with the scenario without variable speed limit control, the proposed DVSL strategy reduces overall travel time by 41.88% and raises average speed by 5.65%; within the merging area, travel time falls by 66.91% and average speed rises by 43.42%. Under DVSL control, congestion time in each lane of the merging area is markedly shorter and speed changes are smoother. The effect of the connected-automated vehicle (CAV) penetration rate on the strategy is also tested. When penetration is below 60%, DVSL control clearly outperforms the no-control scenario: at penetration rates of 20%, 40%, and 60%, average overall travel time is reduced by 41.88%, 13.38%, and 7.46%, and average speed is increased by 6.08%, 2.36%, and 1.61%, respectively. At penetration rates of 80% or higher, CAVs themselves substantially improve traffic flow, so DVSL control yields no significant additional improvement.
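The abstract names prioritized experience replay but gives no implementation details. As a rough illustration only, the sketch below shows the common proportional variant of PER using plain Python lists. The class name, the priority exponent `priority_exp`, the importance-sampling exponent `is_exp`, and the small constant `eps` are assumptions made for this example and are not taken from the paper; only the capacity of 10 000 follows Table 1.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized experience replay (list-based sketch)."""

    def __init__(self, capacity=10_000, priority_exp=0.6, eps=1e-3):
        self.capacity = capacity          # 10 000 follows the buffer size in Table 1
        self.priority_exp = priority_exp  # assumed value, not from the paper
        self.eps = eps                    # assumed; keeps every priority positive
        self.data = []
        self.priorities = []
        self.pos = 0                      # next slot to overwrite once full

    def add(self, transition):
        # New transitions receive the current maximum priority,
        # which makes them likely to be replayed soon.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, is_exp=0.4, rng=random):
        # Sampling probability is proportional to priority ** priority_exp.
        scaled = [p ** self.priority_exp for p in self.priorities]
        total = sum(scaled)
        n = len(self.data)
        idx = rng.choices(range(n), weights=scaled, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling;
        # they are normalized by the largest weight in the batch.
        weights = [(n * scaled[i] / total) ** (-is_exp) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update(self, indices, td_errors):
        # After a learning step, priorities track the absolute TD error.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a DDQN-style training loop, one would typically call `sample` to draw a batch, scale each sample's loss by its returned weight, and then pass the new TD errors to `update`. A production implementation would usually replace the lists with a sum tree so that sampling does not scan the whole buffer.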
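Table 1. Parameters of PER-DDQN

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Learning rate | 0.01 | Maximum number of simulation episodes N | 200 |
| Discount factor | 0.9 | Simulation steps per episode T | 150 |
| Batch size | 128 | Soft update rate τ | 0.01 |
| Replay buffer size | 10000 | Per-episode ε decay rate k0 | 0.98 |
| Maximum exploration steps per episode | 1 | Minimum ε value εmin | 0.05 |
| Number of hidden layers | 2 | α1 | -0.00125 |
| Neurons per hidden layer | 64 | α2 | 0.0001 |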
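Table 1 lists an ε decay rate, a minimum ε, and a soft update rate without stating the update rules. The following minimal sketch shows one common reading of these parameters: a multiplicative per-episode ε decay with a floor, and a soft (Polyak) update of the target network weights. The function names, the starting value `eps_start=1.0`, and the multiplicative form of the decay are assumptions for illustration; the defaults `k0=0.98`, `eps_min=0.05`, and `tau=0.01` are the Table 1 values.

```python
def epsilon_at(episode, k0=0.98, eps_min=0.05, eps_start=1.0):
    """Exploration rate after `episode` completed episodes.

    Assumes multiplicative decay: eps = max(eps_min, eps_start * k0 ** episode).
    """
    return max(eps_min, eps_start * k0 ** episode)


def soft_update(target, online, tau=0.01):
    """Blend online weights into target weights.

    Each weight becomes tau * online + (1 - tau) * target.
    Weights are plain lists of floats here to keep the sketch dependency-free.
    """
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


if __name__ == "__main__":
    for ep in (0, 50, 100, 150, 200):
        print(ep, round(epsilon_at(ep), 4))
    print(soft_update([0.0, 1.0], [1.0, 1.0]))
```

Under these assumptions, ε would reach its floor of 0.05 after roughly 150 of the 200 episodes, so the last quarter of training would run with near-greedy action selection.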
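Table 2. Comparison of evaluation metrics under four control strategy scenarios

| Evaluation metric | Control strategy | Mean | Change relative to no control / % |
|---|---|---|---|
| Overall travel time / s | No variable speed limit | 234.96 | |
| | PER-DDQN-VSL | 141.28 | -39.87 |
| | DDPG-DVSL | 138.47 | -41.06 |
| | PER-DDQN-DVSL | 136.55 | -41.88 |
| Overall speed / (m/s) | No variable speed limit | 20.35 | |
| | PER-DDQN-VSL | 21.17 | +4.03 |
| | DDPG-DVSL | 21.31 | +4.72 |
| | PER-DDQN-DVSL | 22.00 | +5.65 |
| Merging-area travel time / s | No variable speed limit | 29.92 | |
| | PER-DDQN-VSL | 11.23 | -62.47 |
| | DDPG-DVSL | 10.61 | -64.54 |
| | PER-DDQN-DVSL | 9.90 | -66.91 |
| Merging-area speed / (m/s) | No variable speed limit | 15.50 | |
| | PER-DDQN-VSL | 19.77 | +27.55 |
| | DDPG-DVSL | 21.04 | +35.74 |
| | PER-DDQN-DVSL | 22.23 | +43.42 |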
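The last column of Table 2 is described as the change relative to the no-control case. Assuming it is the usual relative change, (value - baseline) / baseline × 100, the short sketch below recomputes a few entries from the reported means. The function name `relative_change` is introduced here for illustration only.

```python
def relative_change(value, baseline):
    """Percentage change of `value` relative to `baseline`, rounded to 2 decimals."""
    return round((value - baseline) / baseline * 100.0, 2)


if __name__ == "__main__":
    # Means taken from Table 2 (PER-DDQN-DVSL versus no variable speed limit).
    print(relative_change(136.55, 234.96))  # overall travel time / s
    print(relative_change(9.90, 29.92))     # merging-area travel time / s
    print(relative_change(22.23, 15.50))    # merging-area speed / (m/s)
```

Negative values indicate a reduction (desirable for travel time) and positive values an increase (desirable for speed).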