Abstract
Feature extraction from infrared pedestrian images is difficult, so traditional object detection methods rely on hand-crafted pedestrian features and suffer from low accuracy. Deep learning has since been applied to object detection with good results. However, deep convolutional neural networks have drawbacks of their own, such as long training time and slow convergence. To address these, this paper proposes a MobileNet V2 (1.4) + SSD algorithm for pedestrian detection in infrared images, which uses transfer learning and the Adam optimization algorithm to accelerate network convergence. For the experiments, we augmented the OSU thermal infrared pedestrian dataset, and our method achieves an mAP of 94.8% on the test set. The experimental results show that the proposed method converges quickly, detects pedestrians accurately, and has a short detection time.
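The Adam optimizer mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' training code: it applies the standard Adam update rule to a one-dimensional toy objective, and the function name `adam_minimize`, the toy objective, and the hyperparameter values are assumptions chosen for illustration only.

```python
import math


def adam_minimize(grad, x0, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a one-variable function with Adam, given its gradient `grad`.

    Hypothetical helper for illustration; the defaults are the commonly used
    Adam settings, not values reported in the paper.
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3); its minimum is at x = 3.
def toy_grad(x):
    return 2.0 * (x - 3.0)


x_star = adam_minimize(toy_grad, x0=0.0, lr=0.1, steps=500)
print(x_star)
```

Because each step is scaled by the running estimate of the squared gradient, the first update has a magnitude of roughly `lr` regardless of how large the raw gradient is. This per-parameter scaling is one reason Adam is often chosen when fast early convergence matters.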
Acknowledgments
This research was funded by the National Natural Science Foundation of China, grant numbers 6192007, 61462008, 61751213, 61866004; the Key Projects of Guangxi Natural Science Foundation, grant numbers 2018GXNSFDA294001, 2018GXNSFDA281009; the Natural Science Foundation of Guangxi, grant numbers 2018GXNSFAA294050, 2017GXNSFAA198365; the 2015 Innovation Team Project of Guangxi University of Science and Technology, grant number gxkjdx201504; the Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security, grant number MIMS19-04; the Natural Science School-level Project of Software Engineering Institute of Guangzhou, grant number ky202108; the Guangxi Postgraduate Education Innovation Project, grant numbers GKYC202106, GKYC202104, YCSW2021320; and the College Students’ Innovation and Entrepreneurship Project, grant numbers 202110594133, 202110594134.
Author information
Contributions
Conceptualization, J.F. and Z.w.W.; methodology, J.F.; software, Y.h.W.; validation, J.F. and Z.w.W.; formal analysis, J.F.; investigation, Y.f.Z.; data curation, Y.f.Z.; writing (original draft preparation), J.F.; writing (review and editing), Z.w.W.; visualization, J.F.; supervision, Y.f.Z.; project administration, Z.w.W.; funding acquisition, Z.w.W. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
About this article
Cite this article
Wang, Z., Feng, J. & Zhang, Y. Pedestrian detection in infrared image based on depth transfer learning. Multimed Tools Appl 81, 39655–39674 (2022). https://doi.org/10.1007/s11042-022-13058-w
DOI: https://doi.org/10.1007/s11042-022-13058-w