A Multi-objective Traffic Control Method for Connected and Automated Vehicle at Signalized Intersection Based on Reinforcement Learning
-
Abstract: To address the high energy consumption and low efficiency of connected and autonomous vehicles (CAVs) in dynamic traffic environments under traditional control methods, a reinforcement learning-based control approach for CAVs is proposed, aiming to reduce energy consumption, improve travel efficiency, and enhance driving comfort. Taking into account the information exchange between CAVs and the intersection signal control system, as well as the physical environment, signal phase and timing (SPaT) data and the speed and position of the preceding vehicle are collected to build the state space of the reinforcement learning framework. An energy consumption model for the CAV is established with the upper limit of battery energy recovery as a boundary condition, and a multi-objective weighted reward function is designed from key driving indicators: electrical energy consumption per unit time, travel distance, and rate of change of acceleration (jerk). The weight of each indicator is determined with the analytic hierarchy process; the model is then trained with the deep deterministic policy gradient (DDPG) algorithm, with the algorithm parameters adjusted and updated by gradient descent. Simulation experiments were carried out on the SUMO platform. The results show that the CAV controlled by the proposed algorithm achieves the most balanced driving performance: compared with the DQN algorithm, electrical energy consumption and the mean rate of change of acceleration are reduced by 9.22% and 18.77%, respectively, and compared with the Krauss car-following model, travel time is shortened by 8.39%. These results support the feasibility and effectiveness of the proposed CAV control method in reducing energy consumption, improving travel efficiency, and enhancing driving comfort.
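The multi-objective weighted reward described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' formulation: the names `RewardWeights` and `step_reward`, the default weights, and the normalization scales are all hypothetical placeholders (the paper derives its weights with the analytic hierarchy process).

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    # Hypothetical placeholder weights; the paper obtains its weights via AHP.
    energy: float = 0.4
    distance: float = 0.3
    jerk: float = 0.3


def step_reward(energy_wh: float, distance_m: float, jerk_ms3: float,
                w: RewardWeights = RewardWeights(),
                energy_scale: float = 1.0,
                distance_scale: float = 10.0,
                jerk_scale: float = 5.0) -> float:
    """Weighted sum that rewards distance covered and penalizes energy and jerk.

    The *_scale arguments are assumed normalization constants that bring the
    three terms to comparable magnitudes; they are not taken from the paper.
    """
    energy_term = -energy_wh / energy_scale      # more energy -> lower reward
    distance_term = distance_m / distance_scale  # more progress -> higher reward
    jerk_term = -abs(jerk_ms3) / jerk_scale      # harsher jerk -> lower reward
    return (w.energy * energy_term
            + w.distance * distance_term
            + w.jerk * jerk_term)
```

With this shape, an agent that covers the same distance with less energy or smoother acceleration always receives a higher per-step reward, which is the trade-off the weighted design is meant to capture.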
-
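Table 1. Parameters and description of state space

| Parameter | Description |
|---|---|
| Vehicle speed v(t) | Relates to the vehicle's energy consumption and efficiency |
| Travel distance d(t) | Relates to the vehicle's energy consumption and efficiency |
| Vehicle acceleration a_t | Relates to ride comfort |
| Speed difference between preceding and following vehicles Δv_t | Relates to safety |
| Gap between preceding and following vehicles Δx_t | Relates to safety |
| Remaining green time of the current phase at the intersection σ(t) | Relates to efficiency and safety. If the remaining time is less than the time the vehicle needs to pass the intersection at the maximum allowed speed, the vehicle should decelerate gently to a stop; otherwise it may accelerate moderately to clear the intersection sooner |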
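One way to turn the six quantities in Table 1 into a state vector is sketched below. It is an assumption-laden illustration: the value ranges are taken from the simulation settings in Table 3, while min-max scaling, the names `Observation` and `can_clear_on_green`, and the `dist_to_stop_line_m` input are hypothetical choices made here, not details given in the paper.

```python
from dataclasses import dataclass
from typing import List

# Value ranges assumed from the simulation settings in Table 3.
RANGES = {
    "speed_kmh": (0.0, 30.0),
    "distance_m": (0.0, 2200.0),
    "accel_ms2": (-4.5, 4.5),
    "rel_speed_kmh": (0.0, 30.0),
    "gap_m": (0.0, 300.0),
    "green_left_s": (0.0, 40.0),
}


@dataclass
class Observation:
    speed_kmh: float      # v(t)
    distance_m: float     # d(t)
    accel_ms2: float      # a_t
    rel_speed_kmh: float  # delta v_t
    gap_m: float          # delta x_t
    green_left_s: float   # sigma(t)

    def to_state(self) -> List[float]:
        """Clip each feature to its range and scale it to [0, 1]."""
        state = []
        for name, (low, high) in RANGES.items():
            value = min(max(getattr(self, name), low), high)
            state.append((value - low) / (high - low))
        return state


def can_clear_on_green(dist_to_stop_line_m: float, green_left_s: float,
                       v_max_kmh: float = 30.0) -> bool:
    """Green-time check from Table 1: can the vehicle pass at maximum speed?"""
    time_needed_s = dist_to_stop_line_m / (v_max_kmh / 3.6)
    return green_left_s >= time_needed_s
```

For example, `can_clear_on_green(100.0, 40.0)` compares the 12 s needed to cover 100 m at 30 km/h with the 40 s of green remaining.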
Table 2. Relative importance coefficient of each index
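| Index | Energy consumption | Traffic efficiency | Driving comfort | Safety |
|---|---|---|---|---|
| Energy consumption | 1 | 3 | 2 | 1/3 |
| Traffic efficiency | 1/3 | 1 | 1/2 | 1/3 |
| Driving comfort | 1/2 | 2 | 1 | 1/3 |
| Safety | 3 | 3 | 3 | 1 |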
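The pairwise comparisons in Table 2 can be converted into weights with the standard analytic hierarchy process. The sketch below uses power iteration to approximate the principal eigenvector and then computes Saaty's consistency ratio; the function names are hypothetical, the random index of 0.90 for a 4 x 4 matrix is the commonly tabulated value, and the paper does not state which AHP variant it used.

```python
CRITERIA = ["energy", "efficiency", "comfort", "safety"]

# Pairwise comparison matrix from Table 2 (row judged against column).
MATRIX = [
    [1.0, 3.0, 2.0, 1 / 3],
    [1 / 3, 1.0, 1 / 2, 1 / 3],
    [1 / 2, 2.0, 1.0, 1 / 3],
    [3.0, 3.0, 3.0, 1.0],
]


def ahp_weights(matrix, iterations=100):
    """Return (weights, lambda_max) via power iteration on the matrix."""
    n = len(matrix)
    w = [1.0 / n] * n
    lam = float(n)
    for _ in range(iterations):
        aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        # With sum(w) == 1, sum(A w) converges to the principal eigenvalue.
        lam = sum(aw)
        w = [x / lam for x in aw]
    return w, lam


def consistency_ratio(lam, n, random_index=0.90):
    """Saaty's CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    ci = (lam - n) / (n - 1)
    return ci / random_index


weights, lam_max = ahp_weights(MATRIX)
cr = consistency_ratio(lam_max, len(MATRIX))
for name, weight in zip(CRITERIA, weights):
    print(f"{name}: {weight:.3f}")
print(f"lambda_max = {lam_max:.3f}, CR = {cr:.3f}")
```

Because the safety row dominates every other criterion in Table 2, safety receives the largest weight; a consistency ratio below 0.1 is the usual threshold for accepting the judgments.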
Table 3. Simulation parameter settings
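| Parameter | Value |
|---|---|
| Total road length L/m | 2200 |
| Spacing between adjacent intersections D/m | 800 |
| HV design hourly traffic volume q/(veh/h) | 1600 |
| HV body length l_HV/m | 5 |
| HV mass m_HV/kg | 2000 |
| HV driver proficiency sigma | 0.5 |
| HV driver reaction time tau/s | 1 |
| CAV body length l_CAV/m | 5 |
| CAV mass m_CAV/kg | 2000 |
| CAV frontal area S_CAV/m² | 2.600 |
| Vehicle speed v(t)/(km/h) | (0, 30) |
| Travel distance d(t)/m | (0, 2200) |
| Vehicle acceleration a_t/(m/s²) | (-4.500, 4.500) |
| Relative speed between preceding and following vehicles Δv_t/(km/h) | (0, 30) |
| Gap between preceding and following vehicles Δx_t/m | (0, 300) |
| Remaining green time of the current phase σ(t)/s | (0, 40) |
| Air resistance coefficient c_d | 0.250 |
| Rolling resistance coefficient c_r | 0.005 |
| Curve resistance coefficient c_c | 0.300 |
| Energy recovery factor μ | 0.350 |
| Gravitational acceleration g/(m/s²) | 9.800 |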
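As a rough illustration of how the vehicle and resistance parameters in Table 3 might enter an energy model with capped energy recovery, consider the sketch below. It uses a generic longitudinal force balance rather than the paper's model. The air density `RHO_AIR`, the function name `step_energy_wh`, the omission of the curve-resistance term (a straight road is assumed), and the reading of μ as the fraction of braking power returned to the battery are all assumptions made here.

```python
# Parameters from Table 3.
MASS_KG = 2000.0         # CAV mass m_CAV
FRONTAL_AREA_M2 = 2.6    # CAV frontal area S_CAV
C_D = 0.25               # air resistance coefficient
C_R = 0.005              # rolling resistance coefficient
MU_RECOVERY = 0.35       # energy recovery factor
G = 9.8                  # gravitational acceleration, m/s^2

# Assumption: approximate air density at sea level, not given in the paper.
RHO_AIR = 1.2            # kg/m^3


def step_energy_wh(v_ms: float, a_ms2: float, dt_s: float = 1.0) -> float:
    """Electrical energy used over one step, in Wh (negative = recovered)."""
    inertial = MASS_KG * a_ms2
    aero = 0.5 * RHO_AIR * C_D * FRONTAL_AREA_M2 * v_ms ** 2
    rolling = C_R * MASS_KG * G
    power_w = (inertial + aero + rolling) * v_ms
    if power_w < 0.0:
        # Only a fraction of the braking power is assumed to be recoverable.
        power_w *= MU_RECOVERY
    return power_w * dt_s / 3600.0
```

Under these assumptions, cruising at 8 m/s draws a small positive amount of energy per second, while braking at the same speed returns only 35% of the braking energy to the battery.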
Table 4. Simulation data under different car-following modes
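| Car-following mode | Total energy consumption/Wh | Travel time/s | Average speed/(km/h) | Mean rate of change of acceleration/(m/s³) |
|---|---|---|---|---|
| Krauss | 211.326 | 441 | 17.959 | 1.249 |
| DDPG | 217.627 | 404 | 19.639 | 1.199 |
| DQN | 239.185 | 442 | 17.918 | 1.476 |
| A2C | 316.511 | 478 | 16.565 | 2.917 |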
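The relative improvements quoted in the abstract can be checked against Table 4 with a short calculation. The helper `reduction_pct` and the dictionary layout are hypothetical; the numbers are copied from the table.

```python
# Values copied from Table 4.
RESULTS = {
    "Krauss": {"energy_wh": 211.326, "time_s": 441, "jerk_ms3": 1.249},
    "DDPG": {"energy_wh": 217.627, "time_s": 404, "jerk_ms3": 1.199},
    "DQN": {"energy_wh": 239.185, "time_s": 442, "jerk_ms3": 1.476},
    "A2C": {"energy_wh": 316.511, "time_s": 478, "jerk_ms3": 2.917},
}


def reduction_pct(baseline: float, value: float) -> float:
    """Percentage by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline * 100.0


ddpg = RESULTS["DDPG"]
jerk_vs_dqn = reduction_pct(RESULTS["DQN"]["jerk_ms3"], ddpg["jerk_ms3"])
time_vs_krauss = reduction_pct(RESULTS["Krauss"]["time_s"], ddpg["time_s"])
energy_vs_dqn = reduction_pct(RESULTS["DQN"]["energy_wh"], ddpg["energy_wh"])

print(f"jerk vs DQN: {jerk_vs_dqn:.2f}%")
print(f"travel time vs Krauss: {time_vs_krauss:.2f}%")
print(f"energy vs DQN: {energy_vs_dqn:.2f}%")
```

The jerk and travel-time reductions computed this way match the 18.77% and 8.39% reported in the abstract. The total-energy reduction relative to DQN comes out at about 9.0% from the table totals, so the reported 9.22% presumably rests on an aggregation not shown in this excerpt.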