Abstract
Recently, significant progress has been made in unmanned aerial vehicle (UAV) object detection through deep learning, and the proliferation of UAVs has greatly eased the acquisition of corresponding data. However, UAV data sets contain many rotated objects in arbitrary orientations, which poses a challenge for traditional horizontal-box object detection methods: these approaches struggle to locate rotated objects precisely. Algorithms for rotated bounding-box object detection have therefore been proposed, but some existing methods suffer from angle periodicity and edge exchangeability. To address these problems, we propose an object detection network that combines a keypoint representation with a rotated distance-IoU loss. It consists mainly of a keypoint representation module and the rotated distance-IoU loss. The keypoint representation encodes the angle parameter of the rotated bounding box indirectly, as the angle between the horizontal line and the line connecting the center point of the box to the midpoint of one of its edges. The coordinates of the anchor's center point and of that edge midpoint then yield the height of the rotated bounding box, and the width is introduced as a separate dimension, so the rotated bounding box is fully represented by two points and a width. In addition, because the traditional rotated IoU loss does not incorporate the distance between the center points of the predicted box and the ground truth during regression, we propose a rotated distance-IoU loss to replace it, which speeds up the convergence of the network.
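The keypoint parameterization described above can be illustrated with a minimal sketch. Here we assume the network predicts the box center, the midpoint of one edge, and the box width; the function name and the exact angle/height conventions (angle from the center-to-midpoint line, height as twice the center-to-midpoint distance) are our own illustrative assumptions, not the paper's exact formulation.

```python
import math

def box_from_keypoints(center, edge_mid, width):
    """Recover a rotated box (cx, cy, w, h, theta) from two keypoints
    and a width: the box center, the midpoint of one edge, and the
    box width. The angle theta is taken between the horizontal axis
    and the line from the center to the edge midpoint; the height is
    twice the center-to-midpoint distance."""
    cx, cy = center
    ex, ey = edge_mid
    dx, dy = ex - cx, ey - cy
    theta = math.atan2(dy, dx)          # angle w.r.t. the horizontal line
    height = 2.0 * math.hypot(dx, dy)   # midpoint lies on the box boundary
    return cx, cy, width, height, theta
```

Note that the angle never has to be regressed directly, which is how this representation sidesteps the periodicity problem: the two keypoints determine it unambiguously.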
We have conducted extensive experiments on the DOTA and DroneVehicle data sets, which demonstrate the effectiveness of the proposed method.
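The rotated distance-IoU loss mentioned in the abstract follows the distance-IoU idea of adding a normalized center-distance penalty to the IoU term. The sketch below is a simplified illustration under two assumptions of ours: the rotated IoU itself is computed elsewhere (e.g. by polygon intersection) and passed in, and the normalizing diagonal is taken from the axis-aligned box enclosing both sets of corners rather than the paper's exact enclosing region.

```python
import math

def rotated_diou_loss(pred, gt, iou):
    """Distance-IoU style loss for rotated boxes (illustrative sketch).
    pred and gt are (cx, cy, w, h, theta); iou is their rotated IoU,
    assumed precomputed. Loss = 1 - IoU + rho^2 / c^2, where rho is the
    center distance and c the diagonal of the smallest axis-aligned box
    enclosing all corners of both boxes."""
    def corners(box):
        cx, cy, w, h, t = box
        cos_t, sin_t = math.cos(t), math.sin(t)
        return [(cx + sx * w / 2 * cos_t - sy * h / 2 * sin_t,
                 cy + sx * w / 2 * sin_t + sy * h / 2 * cos_t)
                for sx, sy in ((1, 1), (1, -1), (-1, 1), (-1, -1))]

    pts = corners(pred) + corners(gt)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    c2 = (max(xs) - min(xs)) ** 2 + (max(ys) - min(ys)) ** 2  # enclosing diagonal^2
    rho2 = (pred[0] - gt[0]) ** 2 + (pred[1] - gt[1]) ** 2    # center distance^2
    return 1.0 - iou + rho2 / c2
```

The extra penalty term keeps a gradient on the center coordinates even when the two boxes barely overlap, which is what accelerates convergence relative to a plain rotated IoU loss.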
Data availability statement
These data were derived from the following resources available in the public domain: DOTA v1.0 (https://captain-whu.github.io/DOTA/dataset.html) and DroneVehicle (https://github.com/VisDrone/DroneVehicle).
Acknowledgements
This study was funded by Guangdong Basic and Applied Basic Research Foundation (No. 2021A1515011576), Guangdong Science and Technology Planning Project (No. 2021A0505030080, No. 2021A0505060011), Guangdong Higher Education Innovation and Strengthening School Project (No. 2020ZDZX3031, No. 2022ZDZX1032, No. 2023ZDZX1029), Wuyi University Hong Kong and Macao Joint Research and Development Fund (No. 2022WGALH19), Guangdong Jiangmen Science and Technology Research Project (No. 2220002000246, No. 2023760300070008390), Guangdong Science and Technology Innovation Strategy Special Fund (pdjh2022b0528, pdjh2024a374).
Author information
Authors and Affiliations
Contributions
Methodology, H.Z. and Y.H.; investigation, Y.Z. and H.Z.; data curation, J.Z. and F.D.; validation, Y.X.; writing-original draft preparation, Y.H. and J.Z.; writing-review and editing, Y.Z. and Y.X. All authors have read and agreed to the published version of the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhu, H., Huang, Y., Xu, Y. et al. Unmanned aerial vehicle (UAV) object detection algorithm based on keypoints representation and rotated distance-IoU loss. J Real-Time Image Proc 21, 58 (2024). https://doi.org/10.1007/s11554-024-01444-6