Differential Variable Speed Limit Control Strategy Based on Reinforcement Learning

BAI Ruyu; JIAO Pengpeng; CHEN Yue; ZHANG Yao

Beijing Key Laboratory of General Aviation Technology, Beijing University of Civil Engineering and Architecture, Beijing 100044, China

doi: 10.3963/j.jssn.1674-4861.2024.01.012

Funding:
National Natural Science Foundation of China, 52172301
National Social Science Fund of China, 21ZDA029
Beijing Social Science Fund, 21GLA010

About the author:
BAI Ruyu (born 2000), master's student. Research interests: intelligent transportation, autonomous driving. E-mail: bairuyu2021@163.com

Corresponding author:
JIAO Pengpeng (born 1980), Ph.D., professor. Research interests: intelligent transportation, traffic management, transportation planning and management, traffic safety. E-mail: jiaopengpeng@bucea.edu.cn

CLC number: U491.4

Publication history
- Received: 2023-08-01
- Published online: 2024-05-31

Abstract

Abstract: Merging vehicles affect each mainline lane of a freeway merging area differently. To account for these lane-level differences, this paper studies a differential variable speed limit (DVSL) control strategy based on reinforcement learning (RL). Because the DVSL control problem has a high-dimensional action space that is difficult to solve, the action space is reduced by using speed limit change values as actions, and the state space and a multi-factor reward function are defined accordingly. In the solution process, prioritized experience replay (PER) is used to improve training efficiency and model performance, and an inter-lane safety detection mechanism assists the training of the PER-DDQN so that the lane-level speed limits remain implementable. The strategy is tested in the SUMO simulator. Compared with the no-control scenario, the proposed strategy reduces overall travel time by 41.88% and increases average speed by 5.65%; in the merging area, travel time falls by 66.91% and average speed rises by 43.42%. Under DVSL control, the congestion time in each lane of the merging area is markedly shorter and speed varies more smoothly. The effect of the penetration rate of connected-automated vehicles (CAVs) on the strategy is also tested. When the penetration rate is below 60%, DVSL control clearly outperforms the no-control scenario: at penetration rates of 20%, 40%, and 60%, the average overall travel time is reduced by 41.88%, 13.38%, and 7.46%, and the average speed is increased by 6.08%, 2.36%, and 1.61%, respectively. At penetration rates of 80% or higher, the CAVs themselves markedly improve traffic flow, so the DVSL control strategy yields no significant further improvement.
- intelligent traffic
- differential variable speed limit control
- control strategy
- reinforcement learning
- highway merging area
- mixed traffic flow
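
The abstract names prioritized experience replay (PER) as the technique used to speed up training. As a rough illustration only, the sketch below shows the standard proportional variant of PER, in which a transition with TD error δ_i gets priority p_i = |δ_i| + ε, is sampled with probability P(i) = p_i^α / Σ_k p_k^α, and is weighted by w_i = (N · P(i))^(-β) normalized by the largest weight. The values of `alpha`, `beta`, and `eps`, the function names, and the toy TD errors are assumptions for illustration; the page does not state which PER variant or settings the authors used.

```python
import random


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha.

    alpha and eps are illustrative defaults, not values from the paper.
    """
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def importance_weights(probs, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i))^(-beta), scaled so max(w) == 1."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    w_max = max(raw)
    return [w / w_max for w in raw]


def sample_batch(td_errors, batch_size, rng, alpha=0.6, beta=0.4):
    """Draw transition indices according to priority and return their weights."""
    probs = per_probabilities(td_errors, alpha)
    all_weights = importance_weights(probs, beta)
    indices = rng.choices(range(len(td_errors)), weights=probs, k=batch_size)
    return indices, [all_weights[i] for i in indices]


# Toy TD errors for four stored transitions (hypothetical numbers).
td_errors = [0.1, 2.0, 0.5, 0.05]
probs = per_probabilities(td_errors)
indices, weights = sample_batch(td_errors, batch_size=3, rng=random.Random(0))
```

Transitions with large TD errors are replayed more often, and the importance weights scale down their gradient contribution to offset the sampling bias. Setting `alpha=0.0` recovers uniform replay.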

Figure 1. Process of variable speed limit control
Figure 2. Location of the detectors
Figure 3. Designed action space
Figure 4. Process of PER-DDQN-DVSL control
Figure 5. Traffic demand used for traffic scenario
Figure 6. Average speed of merging area
Figure 7. Cumulative reward value
Figure 8. Speed limits
Figure 9. Variation of speed values for four control strategies in each lane
Figure 10. Average overall travel time
Figure 11. Average overall travel speed

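Table 1. The parameters of PER-DDQN

Parameter	Value	Parameter	Value
Learning rate	0.01	Maximum number of simulation episodes N	200
Discount factor	0.9	Simulation steps per episode T	150
Batch size	128	Soft update rate τ	0.01
Replay buffer size	10000	Per-episode ε decay rate k₀	0.98
Maximum exploration steps per episode	1	Minimum ε value ε_min	0.05
Number of hidden layers	2	α₁	-0.00125
Number of neurons per hidden layer	64	α₂	0.0001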
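
To make two of the Table 1 parameters concrete, the sketch below shows one common way an ε-greedy exploration rate and a soft target-network update are computed. The constants `K0`, `EPS_MIN`, `TAU`, and `N_EPISODES` come from Table 1. The multiplicative decay form, the starting value `eps_start=1.0`, and the flat-list representation of network weights are assumptions for illustration, since the page does not give the authors' exact update rules.

```python
K0 = 0.98         # per-episode epsilon decay rate k0 (Table 1)
EPS_MIN = 0.05    # minimum epsilon (Table 1)
TAU = 0.01        # soft update rate tau (Table 1)
N_EPISODES = 200  # maximum number of simulation episodes N (Table 1)


def epsilon_schedule(n_episodes, eps_start=1.0, k0=K0, eps_min=EPS_MIN):
    """Return the epsilon used in each episode.

    Assumed rule: eps <- max(eps_min, k0 * eps) after every episode.
    eps_start is a hypothetical initial value, not taken from the paper.
    """
    schedule = []
    eps = eps_start
    for _ in range(n_episodes):
        schedule.append(eps)
        eps = max(eps_min, eps * k0)
    return schedule


def soft_update(target_params, online_params, tau=TAU):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


schedule = epsilon_schedule(N_EPISODES)
new_target = soft_update([0.0, 1.0], [1.0, 1.0])
```

Under these assumptions the exploration rate falls geometrically and is clamped at ε_min well before the 200th episode, while each soft update moves the target network only 1% of the way toward the online network.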

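Table 2. Comparison of evaluation factors under four control strategy scenarios

Evaluation factor	Control strategy	Mean	Change relative to no control/%
Overall travel time/s	No variable speed limit	234.96	(baseline)
Overall travel time/s	PER-DDQN-VSL	141.28	-39.87
Overall travel time/s	DDPG-DVSL	138.47	-41.06
Overall travel time/s	PER-DDQN-DVSL	136.55	-41.88
Overall speed/(m/s)	No variable speed limit	20.35	(baseline)
Overall speed/(m/s)	PER-DDQN-VSL	21.17	+4.03
Overall speed/(m/s)	DDPG-DVSL	21.31	+4.72
Overall speed/(m/s)	PER-DDQN-DVSL	22.00	+5.65
Merging area travel time/s	No variable speed limit	29.92	(baseline)
Merging area travel time/s	PER-DDQN-VSL	11.23	-62.47
Merging area travel time/s	DDPG-DVSL	10.61	-64.54
Merging area travel time/s	PER-DDQN-DVSL	9.90	-66.91
Merging area speed/(m/s)	No variable speed limit	15.50	(baseline)
Merging area speed/(m/s)	PER-DDQN-VSL	19.77	+27.55
Merging area speed/(m/s)	DDPG-DVSL	21.04	+35.74
Merging area speed/(m/s)	PER-DDQN-DVSL	22.23	+43.42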
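
The change-rate column in Table 2 appears to be the relative change of each mean against the no-control mean, (value - baseline) / baseline × 100. The sketch below recomputes it for the merging-area rows, using the means from the table; the function and variable names are illustrative.

```python
def relative_change(baseline, value):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# Merging-area means from Table 2.
merging_travel_time_s = {
    "No variable speed limit": 29.92,
    "PER-DDQN-VSL": 11.23,
    "DDPG-DVSL": 10.61,
    "PER-DDQN-DVSL": 9.90,
}
merging_speed_mps = {
    "No variable speed limit": 15.50,
    "PER-DDQN-VSL": 19.77,
    "DDPG-DVSL": 21.04,
    "PER-DDQN-DVSL": 22.23,
}


def changes_vs_baseline(means, baseline_key="No variable speed limit"):
    """Relative change of every controlled scenario against the baseline row."""
    base = means[baseline_key]
    return {
        name: relative_change(base, value)
        for name, value in means.items()
        if name != baseline_key
    }


travel_time_changes = changes_vs_baseline(merging_travel_time_s)
speed_changes = changes_vs_baseline(merging_speed_mps)
```

For these rows the recomputed values agree with the reported percentages to within rounding, for example about -66.91% for merging-area travel time and +43.42% for merging-area speed under PER-DDQN-DVSL.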

