A Ship Detection Algorithm for Infrared Images under Hazy Environment based on an Improved YOLOv5 Algorithm

MA Haowei; ZHANG Di; LI Yuli; FAN Liang

doi:10.3963/j.jssn.1674-4861.2023.01.010

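MA Haowei^{1, 2, 3},
ZHANG Di^{1, 3, 4, 5},
LI Yuli^{1, 2, 3},
FAN Liang^{1, 2, 5}

1. National Engineering Research Center for Water Transport Safety, Wuhan University of Technology, Wuhan 430063, China
2. Intelligent Transportation Systems Research Center, Wuhan University of Technology, Wuhan 430063, China
3. School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, China
4. State Key Laboratory of Waterway Traffic Control and Safety, Wuhan University of Technology, Wuhan 430063, China
5. Guangdong Inland Port and Shipping Industry Research Co., Ltd, Shaoguan 512100, Guangdong, China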

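Funding:

National Key Research and Development Program of China (2017YFC0804904)

Hubei Province Science and Technology Innovation Talent and Service Special Project, International Science and Technology Cooperation Program (2021EHB007)

Shaoguan Innovation and Entrepreneurship Team Introduction Project (201212176230928)

About the author:
MA Haowei (b. 1996), master's student. Research interest: ship behavior recognition. E-mail: hwma@whut.edu.cn

Corresponding author:
FAN Liang (b. 1990), Ph.D. Research interest: maritime traffic situation awareness. E-mail: fanliang@whut.edu.cn

CLC number: U676.1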
Publication history
- Received: 2022-09-26
- Published online: 2023-05-13

Abstract: Accurately detecting ships in surveillance images is crucial for the intelligent supervision of ship traffic in port waters. Under hazy conditions, the conventional YOLOv5 object detection algorithm shows low detection accuracy on infrared ship images and weak feature extraction for small targets. To address these problems, an improved YOLOv5 algorithm for infrared ship image detection based on Swin Transformer is proposed. To increase the diversity of the original dataset, the improved algorithm takes into account that infrared ship images resist cloud and fog interference well but have blurred contours and low contrast, and augments the dataset using an atmospheric scattering model. To strengthen attention to global features during feature extraction, the backbone network uses Swin Transformer to extract infrared ship image features and enlarges the field of view of each window through shifted-window multi-head self-attention. To improve the extraction of spatial features of dense small targets, the multi-scale feature fusion network, the Path Aggregation Network (PANet), is modified by adding a low-level feature sampling module and a coordinate attention (CA) mechanism, which captures the position, direction, and cross-channel information of small ships so that they can be located precisely. To reduce missed detections and false detections, the complete intersection over union (CIoU) loss is used to compute the coordinate prediction loss of the bounding boxes, and non-maximum suppression (NMS) is applied iteratively to evaluate and filter candidate boxes, which improves the reliability of the detection results. Experimental results show that, under a given haze concentration, the improved algorithm achieves an average recognition accuracy (mAP) of 93.73%, an average recall of 98.10%, and an average detection speed of 38.6 frames per second. Compared with RetinaNet, Faster R-CNN, YOLOv3 SPP, YOLOv4, YOLOv5, and YOLOv6-N, its average recognition accuracy is higher by 13.90, 11.53, 8.41, 7.21, 6.20, and 3.44 percentage points, respectively, and its average recall is higher by 11.81, 9.67, 6.29, 5.53, 4.87, and 2.39 percentage points, respectively. The proposed Swin-YOLOv5s algorithm generalizes well to ship targets of different sizes and achieves high detection accuracy, which helps improve the supervision of ships in port waters.
Keywords: transport safety; infrared images; ship target detection; YOLOv5; Swin Transformer; coordinate attention
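
The abstract pairs the CIoU loss for box regression with NMS for filtering candidate boxes. The following is a minimal pure-Python sketch of both, using the commonly published CIoU definition (1 - IoU + centre-distance term + aspect-ratio term) and plain greedy NMS. The box format `(x1, y1, x2, y2)`, the function names, and the 0.5 suppression threshold are assumptions for illustration; the paper's exact settings are not given on this page.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ciou_loss(pred, target, eps=1e-9):
    """CIoU loss: 1 - IoU + rho^2 / c^2 + alpha * v."""
    overlap = iou(pred, target)
    # Squared distance between the two box centres.
    rho2 = (((pred[0] + pred[2]) - (target[0] + target[2])) ** 2
            + ((pred[1] + pred[3]) - (target[1] + target[3])) ** 2) / 4.0
    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(pred[2], target[2]) - min(pred[0], target[0])
    ch = max(pred[3], target[3]) - min(pred[1], target[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan((target[2] - target[0]) / (target[3] - target[1] + eps))
        - math.atan((pred[2] - pred[0]) / (pred[3] - pred[1] + eps))
    ) ** 2
    alpha = v / (1.0 - overlap + v + eps)
    return 1.0 - overlap + rho2 / c2 + alpha * v


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it, repeat."""
    order = sorted(range(len(boxes)), key=lambda k: scores[k], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [k for k in order if iou(boxes[best], boxes[k]) <= iou_threshold]
    return keep
```

Unlike plain IoU loss, the centre-distance term still gives a useful signal when the predicted and ground-truth boxes do not overlap at all, which is one reason CIoU is often preferred for small targets.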

Figure 1. Swin-YOLOv5s framework

Figure 2. Synthetic haze images (i is the haze concentration factor)
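
For context on how synthetic haze images of this kind are typically produced, the standard atmospheric scattering model is I(x) = J(x) t(x) + A (1 - t(x)) with transmission t(x) = exp(-beta d(x)), where J is the clear image, A the atmospheric light, d the scene depth, and beta the scattering coefficient. The sketch below applies that model to a grayscale image stored as nested lists. The function name, the caller-supplied depth map, and the default `airlight=0.8` are assumptions for illustration; how the paper's concentration factor i maps onto beta is not stated on this page.

```python
import math


def add_haze(image, depth, beta, airlight=0.8):
    """Apply the atmospheric scattering model to a grayscale image.

    image: 2-D list of clear-scene intensities J in [0, 1]
    depth: 2-D list of scene depths d, same shape as image (assumed given)
    beta: scattering coefficient; larger values mean denser haze
    airlight: global atmospheric light A (assumed constant)
    """
    hazy = []
    for image_row, depth_row in zip(image, depth):
        row = []
        for j, d in zip(image_row, depth_row):
            t = math.exp(-beta * d)  # transmission along the line of sight
            row.append(j * t + airlight * (1.0 - t))
        hazy.append(row)
    return hazy
```

With `beta = 0` the image is returned unchanged, and as `beta` grows every pixel converges to the atmospheric light, so contrast between pixels at the same depth shrinks by the factor t.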

Figure 3. Swin-YOLOv5s backbone
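
The backbone relies on Swin Transformer's shifted-window self-attention. As a rough illustration of why shifting enlarges each window's view, the sketch below labels every token of a small grid with the index of the attention window it falls into, first for the regular partition and then after a cyclic shift of half a window. The function name and the 4 x 4 grid are assumptions for illustration, and the attention mask that real Swin implementations apply to tokens wrapped around by the cyclic shift is omitted.

```python
def window_ids(height, width, window, shift=0):
    """Return a grid holding the attention-window index of every token.

    shift > 0 emulates rolling the feature map by -shift along both axes
    before the regular partition, as in shifted-window attention.
    """
    windows_per_row = width // window
    ids = []
    for r in range(height):
        rolled_r = (r - shift) % height  # row position after the cyclic shift
        row = []
        for c in range(width):
            rolled_c = (c - shift) % width  # column position after the shift
            row.append((rolled_r // window) * windows_per_row + rolled_c // window)
        ids.append(row)
    return ids


regular = window_ids(4, 4, window=2)
shifted = window_ids(4, 4, window=2, shift=1)
```

In `regular`, tokens (0, 1) and (0, 2) sit in different windows and cannot attend to each other; in `shifted` they share a window, so alternating the two partitions across layers lets information cross window borders.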

Figure 4. Path aggregation network

Figure 5. Coordinate attention (CA) module
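
To make the CA module more concrete, here is a simplified pure-Python sketch of the coordinate attention data flow: average pooling along each spatial direction, a shared channel-mixing step, and two direction-specific gates that rescale the input. The function names and the weight arguments are hypothetical, the 1x1 convolutions are written as plain channel-mixing matrices, and ReLU stands in for the batch normalization and non-linearity used in the published module, so this is an illustration rather than the authors' implementation.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def mix_channels(weights, feats):
    """1x1 convolution on a strip: weights is K x C, feats is C x L."""
    length = len(feats[0])
    return [[sum(w * feats[c][p] for c, w in enumerate(row)) for p in range(length)]
            for row in weights]


def coordinate_attention(x, w_shared, w_h, w_w):
    """x is a C x H x W nested list; returns the reweighted feature map."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    # Direction-aware pooling: average over width, then over height.
    z_h = [[sum(x[c][h]) / W for h in range(H)] for c in range(C)]
    z_w = [[sum(x[c][h][w] for h in range(H)) / H for w in range(W)] for c in range(C)]
    # Concatenate both strips and apply the shared transform plus ReLU.
    z = [z_h[c] + z_w[c] for c in range(C)]
    f = [[max(0.0, v) for v in row] for row in mix_channels(w_shared, z)]
    f_h = [row[:H] for row in f]
    f_w = [row[H:] for row in f]
    # Separate gates for the height and width directions.
    a_h = [[sigmoid(v) for v in row] for row in mix_channels(w_h, f_h)]
    a_w = [[sigmoid(v) for v in row] for row in mix_channels(w_w, f_w)]
    return [[[x[c][h][w] * a_h[c][h] * a_w[c][w] for w in range(W)]
             for h in range(H)] for c in range(C)]
```

Because the two gates are indexed by row and by column separately, each output value is scaled by where it sits in the feature map, which is how the mechanism retains positional information that global pooling would discard.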

Figure 6. Dataset distribution characteristics

Figure 7. Loss-Epoch variation curves

Figure 8. Comparison of YOLOv5s and Swin-YOLOv5s detection

Figure 9. Comparison of infrared ship image detection results between YOLOv5s and Swin-YOLOv5s at different haze concentrations

Table 1. Experimental training parameters

Parameter	Value
Learning rate	0.01
Optimizer	Adam
Batch size	8
Epochs	300
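
The training parameters in Table 1 could be captured in a small configuration object such as the one below. The class and method names are hypothetical, and the dataset size passed to the helpers is a placeholder, since the number of training images is not given on this page.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters from Table 1."""
    learning_rate: float = 0.01
    optimizer: str = "Adam"
    batch_size: int = 8
    epochs: int = 300

    def iterations_per_epoch(self, num_images: int) -> int:
        # Assumes the final partial batch is kept rather than dropped.
        return math.ceil(num_images / self.batch_size)

    def total_iterations(self, num_images: int) -> int:
        return self.epochs * self.iterations_per_epoch(num_images)


config = TrainConfig()
```

For a hypothetical training set of 1000 images, `config.iterations_per_epoch(1000)` gives 125 weight updates per epoch.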


Table 2. Ablation experiment results

Experiment No.	ST	CA	CIoU	Parameters/×10⁶	mAP/%	FPS/(frames/s)
1				5.7	87.53	42.1
2	√			6.2	90.34	38.9
3		√		6.0	89.27	41.9
4			√	5.7	88.38	42.0
5	√	√		6.6	92.89	38.8
6	√		√	6.2	91.16	38.7
7		√	√	6.0	90.12	41.8
8	√	√	√	6.6	93.73	38.6
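
The contribution of each module can be read off Table 2 by subtracting the baseline row (experiment 1, plain YOLOv5s) from the others. The short script below does that arithmetic on the table's values; the variable and function names are illustrative.

```python
# (ST, CA, CIoU, parameters in millions, mAP in %, FPS), rows 1 to 8 of Table 2
ABLATION = [
    (False, False, False, 5.7, 87.53, 42.1),
    (True, False, False, 6.2, 90.34, 38.9),
    (False, True, False, 6.0, 89.27, 41.9),
    (False, False, True, 5.7, 88.38, 42.0),
    (True, True, False, 6.6, 92.89, 38.8),
    (True, False, True, 6.2, 91.16, 38.7),
    (False, True, True, 6.0, 90.12, 41.8),
    (True, True, True, 6.6, 93.73, 38.6),
]


def gains_over_baseline(rows):
    """Map each module combination to (mAP gain in points, FPS change)."""
    base_map, base_fps = rows[0][4], rows[0][5]
    gains = {}
    for st, ca, ciou, _params, m_ap, fps in rows[1:]:
        name = "+".join(n for n, on in (("ST", st), ("CA", ca), ("CIoU", ciou)) if on)
        gains[name] = (round(m_ap - base_map, 2), round(fps - base_fps, 1))
    return gains


GAINS = gains_over_baseline(ABLATION)
```

Read this way, the Swin Transformer backbone alone adds 2.81 mAP points at a cost of 3.2 frames/s, CA adds 1.74 points, CIoU adds 0.85 points, and all three together add 6.20 points at a cost of 3.5 frames/s.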


Table 3. Comparison of mAP results for mainstream detection algorithms

Algorithm	Sailboat AP/%	Canoe AP/%	Liner AP/%	Warship AP/%	Bulk carrier AP/%	Container ship AP/%	Fishing boat AP/%	mAP/%	Recall/%	FPS/(frames/s)
RetinaNet	76.21	66.45	76.73	86.87	87.68	89.21	75.69	79.83	86.29	18.4
Faster R-CNN	78.95	68.86	78.68	89.77	88.86	90.32	79.94	82.20	88.43	11.3
YOLOv3 SPP	85.44	73.72	82.86	90.77	91.73	92.52	80.21	85.32	91.81	22.7
YOLOv4	86.12	78.21	83.28	92.34	91.41	92.86	81.39	86.52	92.57	21.6
YOLOv5s	87.07	79.32	83.84	93.57	92.42	93.09	83.42	87.53	93.23	42.1
YOLOv6-N	89.24	81.33	89.44	96.28	94.14	96.94	84.69	90.39	95.71	49.2
Swin-YOLOv5s	91.53	89.81	90.38	98.83	97.37	98.39	89.83	93.73	98.10	38.6
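
Since mAP is the mean of the per-class AP values, the mAP column of Table 3 can be cross-checked against the seven class columns. The script below recomputes it from the values as transcribed on this page; the names are illustrative.

```python
# Per-class AP (%) for the seven ship classes, in the column order of Table 3.
AP = {
    "RetinaNet": [76.21, 66.45, 76.73, 86.87, 87.68, 89.21, 75.69],
    "Faster R-CNN": [78.95, 68.86, 78.68, 89.77, 88.86, 90.32, 79.94],
    "YOLOv3 SPP": [85.44, 73.72, 82.86, 90.77, 91.73, 92.52, 80.21],
    "YOLOv4": [86.12, 78.21, 83.28, 92.34, 91.41, 92.86, 81.39],
    "YOLOv5s": [87.07, 79.32, 83.84, 93.57, 92.42, 93.09, 83.42],
    "YOLOv6-N": [89.24, 81.33, 89.44, 96.28, 94.14, 96.94, 84.69],
    "Swin-YOLOv5s": [91.53, 89.81, 90.38, 98.83, 97.37, 98.39, 89.83],
}
REPORTED_MAP = {
    "RetinaNet": 79.83, "Faster R-CNN": 82.20, "YOLOv3 SPP": 85.32,
    "YOLOv4": 86.52, "YOLOv5s": 87.53, "YOLOv6-N": 90.39, "Swin-YOLOv5s": 93.73,
}


def mean_ap(values):
    """Mean of the per-class AP values, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


# Algorithms whose recomputed mAP differs from the reported column.
MISMATCHES = {
    name: (mean_ap(values), REPORTED_MAP[name])
    for name, values in AP.items()
    if mean_ap(values) != REPORTED_MAP[name]
}
```

Six of the seven rows reproduce the reported mAP exactly. For YOLOv6-N the per-class values average to 90.29 rather than the listed 90.39; the 3.44-point gain quoted in the abstract matches 93.73 - 90.29, so the table entry may contain a transcription slip, though this page alone cannot confirm which figure is intended.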

