An Improved YOLOv7 Algorithm for Workers Detection in Port Terminals

Zhang Xiaojie; Zhang Yanwei; Zou Ying; Yin Xuecheng; Cheng Qiwen; Shen Ruchao

doi:10.3963/j.jssn.1674-4861.2024.02.007

Funding:

National Science and Technology Major Project (2022ZD0119303)

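Author biography:
Zhang Xiaojie (1999-), master's student. Research interests: port security, computer vision, etc. E-mail: zhangxiaojie0220@163.com

Corresponding author:
Zhang Yanwei (1977-), Ph.D., associate professor. Research interests: smart ports, intelligent decision-making and algorithms, etc. E-mail: zywtg@whut.edu.cn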

CLC number: TP391.4; U698.5
Publication history
- Received: 2023-11-05
- Published online: 2024-09-14

An Improved YOLOv7 Algorithm for Workers Detection in Port Terminals

1.
School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, China
2.
Shanghai International Port (Group) Co., Ltd., Shanghai 200080, China
3.
Lianyungang Xinwei Port Terminal Co., Ltd., Lianyungang 222248, Jiangsu, China

Abstract: Accurate detection of workers in wide-angle surveillance images is important for intelligent security in port terminals. However, the baseline YOLOv7 algorithm extracts small-target features poorly and detects workers with low accuracy in such images. To address these problems, a terminal worker detection algorithm based on an improved YOLOv7 is proposed. First, to improve the performance and robustness of multi-scale feature detection, a task-specific context decoupling (TSCODE) structure that balances the classification and localization tasks is designed and combined with the gather-and-distribute (GD) mechanism, which strengthens multi-scale feature fusion. Second, to strengthen feature extraction for small targets such as workers, a vision transformer with bi-level routing attention (BRA-ViT) is introduced at the end of the backbone network to capture the position, direction, and cross-channel information of small objects. Third, a slim-neck design is used to lighten the neck of the network, reducing the number of parameters and the computational cost so that detection speed improves while detection accuracy is maintained. Fourth, a loss function based on minimum-point-distance intersection over union (MPDIoU) is used to compute the bounding-box coordinate prediction loss, which reduces false negatives and false positives. To validate the proposed algorithm, wide-angle surveillance images of different port areas (quay front, yard, gate, and other locations) at different times (day and night) are collected and annotated to build a dataset, and ablation and comparison experiments are conducted. The results show that the average precision (AP) of the proposed algorithm is 90.6% and its average detection speed is 39 frames per second (fps). Compared with Faster R-CNN, SSD, YOLOv3, YOLOv5, YOLOv7, and YOLOv8, the AP of the proposed algorithm is higher by 13.8, 15.8, 8.5, 5.2, 2.7, and 3.5 percentage points, respectively, and its detection speed is comparable to that of the baseline YOLOv7. In summary, the proposed algorithm achieves higher AP than the compared algorithms at an acceptable detection speed, which makes it suitable for real-time safety and security surveillance in port terminals.
Keywords: transport safety; wide-angle surveillance images; terminal worker detection and localization; YOLOv7
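
The abstract names MPDIoU as the bounding-box regression loss but does not spell out the formula. The sketch below follows the commonly cited MPDIoU definition (IoU minus the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image). The function name `mpdiou_loss`, the `(x1, y1, x2, y2)` box format, and the image size used in the demo are assumptions for illustration, and this is not necessarily how the authors implemented it.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Illustrative MPDIoU loss for two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
    img_w and img_h are the width and height of the input image.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


if __name__ == "__main__":
    # Hypothetical boxes inside a 100 x 100 image.
    print(mpdiou_loss((10, 10, 30, 30), (20, 20, 40, 40), 100, 100))
```

Unlike plain IoU, the corner-distance terms keep changing when the boxes do not overlap, so a prediction that is far from a small target is penalized more than one that is near it.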

Figure 1. Network structure of improved YOLOv7

Figure 2. Structure of TSCODE

Figure 3. Deployment of GD

Figure 4. BiFormer module

Figure 5. SPPCSPC_BRA module

Figure 6. Structure of GSConv and VoV-GSCSP

Figure 7. Structure of slim-neck

Figure 8. Partial images of the terminal workers dataset

Figure 9. Curve diagram of model training

Figure 10. Comparison of YOLOv7 and improved YOLOv7 detection

Table 1. Statistical data of the terminal workers dataset

Category	Subcategory	Count
Scene	Quay front	978
Scene	Yard	639
Scene	Warehouse	43
Scene	Gate	452
Time	Day	1711
Time	Night	401
Target type	Large target	923
Target type	Medium target	288
Target type	Small target	3930
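
As a quick consistency check on Table 1, the sketch below sums each category and computes the share of each subcategory. It assumes that the scene and time rows count images while the target-type rows count annotated objects; the table does not state this explicitly. The dictionary and function names are illustrative.

```python
# Counts copied from Table 1 (English labels are translations).
scene_counts = {"quay front": 978, "yard": 639, "warehouse": 43, "gate": 452}
time_counts = {"day": 1711, "night": 401}
target_counts = {"large": 923, "medium": 288, "small": 3930}


def shares(counts):
    """Return each subcategory's fraction of the category total."""
    total = sum(counts.values())
    return {name: value / total for name, value in counts.items()}


if __name__ == "__main__":
    print("scene total:", sum(scene_counts.values()))
    print("time total:", sum(time_counts.values()))
    print("target total:", sum(target_counts.values()))
    for name, frac in shares(target_counts).items():
        print(f"{name}: {frac:.1%}")
```

The scene and time totals agree, which is consistent with both describing the same image set, and small targets make up roughly three quarters of all annotated targets, which matches the paper's emphasis on small-target detection.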


Table 2. Ablation experimental results of the improved YOLOv7 algorithm

Group	TSCODE	BiFormer	GD	slim-neck	MPDIoU	Params/M	FLOPs/G	AP/%
1						37.2	105.1	87.9
2	√					55.5	121.3	89.6
3		√				38.3	105.1	88.6
4			√			40.7	109.0	89.3
5				√		25.9	42.9	88.9
6					√	37.2	105.1	88.5
7	√	√				56.6	121.3	89.9
8	√	√	√			60.1	125.3	90.3
9	√	√	√	√		55.1	113.3	90.2
10	√	√	√	√	√	55.1	113.3	90.6
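
To make the trade-offs in Table 2 easier to read, the sketch below computes each group's change relative to the baseline (group 1): the AP gain in percentage points and the relative change in parameters and FLOPs. The numbers are copied from the table; the data structure and the function name `delta_vs_baseline` are illustrative assumptions.

```python
from typing import NamedTuple


class Run(NamedTuple):
    modules: tuple     # modules enabled in this ablation group
    params_m: float    # parameters, in millions
    flops_g: float     # FLOPs, in billions
    ap: float          # average precision, in percent


ABLATION = {
    1: Run((), 37.2, 105.1, 87.9),
    2: Run(("TSCODE",), 55.5, 121.3, 89.6),
    3: Run(("BiFormer",), 38.3, 105.1, 88.6),
    4: Run(("GD",), 40.7, 109.0, 89.3),
    5: Run(("slim-neck",), 25.9, 42.9, 88.9),
    6: Run(("MPDIoU",), 37.2, 105.1, 88.5),
    7: Run(("TSCODE", "BiFormer"), 56.6, 121.3, 89.9),
    8: Run(("TSCODE", "BiFormer", "GD"), 60.1, 125.3, 90.3),
    9: Run(("TSCODE", "BiFormer", "GD", "slim-neck"), 55.1, 113.3, 90.2),
    10: Run(("TSCODE", "BiFormer", "GD", "slim-neck", "MPDIoU"), 55.1, 113.3, 90.6),
}


def delta_vs_baseline(group, baseline=1):
    """AP gain (percentage points) and relative cost change (%) vs the baseline."""
    run, base = ABLATION[group], ABLATION[baseline]
    return {
        "ap_gain": run.ap - base.ap,
        "params_pct": 100.0 * (run.params_m - base.params_m) / base.params_m,
        "flops_pct": 100.0 * (run.flops_g - base.flops_g) / base.flops_g,
    }


if __name__ == "__main__":
    for group in sorted(ABLATION):
        d = delta_vs_baseline(group)
        print(f"group {group}: AP {d['ap_gain']:+.1f} pt, "
              f"params {d['params_pct']:+.1f}%, FLOPs {d['flops_pct']:+.1f}%")
```

Read this way, slim-neck on its own is the only module that lowers both parameters and FLOPs, while the full model (group 10) gains 2.7 points of AP at the cost of more parameters than the baseline.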


Table 3. Comparison of experimental results of different object detection algorithms

Algorithm	AP/%	FPS/(frames/s)
Faster R-CNN	76.8	5
SSD	74.8	32
YOLOv3	82.1	38
YOLOv5	85.4	43
YOLOv7	87.9	41
YOLOv8	87.1	44
Proposed algorithm	90.6	39
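
The AP improvements quoted in the abstract can be recomputed from Table 3 as simple differences in percentage points. The sketch below does that and also converts each frame rate into an approximate per-frame time. The names `RESULTS`, `ap_gains`, and `frame_time_ms`, and the label "Proposed algorithm", are illustrative assumptions.

```python
# (AP in percent, speed in frames per second), copied from Table 3.
RESULTS = {
    "Faster R-CNN": (76.8, 5),
    "SSD": (74.8, 32),
    "YOLOv3": (82.1, 38),
    "YOLOv5": (85.4, 43),
    "YOLOv7": (87.9, 41),
    "YOLOv8": (87.1, 44),
    "Proposed algorithm": (90.6, 39),
}


def ap_gains(results, ours="Proposed algorithm"):
    """AP difference (percentage points) between `ours` and every other entry."""
    our_ap = results[ours][0]
    return {name: round(our_ap - ap, 1)
            for name, (ap, _) in results.items() if name != ours}


def frame_time_ms(fps):
    """Average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    for name, gain in ap_gains(RESULTS).items():
        print(f"vs {name}: +{gain} pt AP")
    for name, (_, fps) in RESULTS.items():
        print(f"{name}: {frame_time_ms(fps):.1f} ms per frame")
```

The differences are absolute percentage points of AP, not relative improvements, which is how the figures in the abstract should be read.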

