Abstract
Recently, public security surveillance has posed a new AI challenge, Visible-X-ray baggage Re-Identification (VX-ReID), which aims to re-identify and retrieve baggage across the visible and X-ray imaging modalities. Compared with cross-modality person re-identification, VX-ReID faces two distinctive bottlenecks: shape deformation and feature entanglement. For the former, the shape of a bag can change dramatically between captures, making the extracted features unreliable. For the latter, X-ray images reveal the contents of the baggage, which are invisible in daylight images. Both problems severely degrade the representation-learning losses (such as ID loss) commonly used in Re-ID. In this paper, we propose a cross-modality multi-scale feature correspondence model (CMMFC) for VX-ReID. Specifically, we compute feature correspondences between the two modalities on feature maps at multiple scales, which are designed to overcome the deformation problem. We further introduce a novel feature restriction mechanism (FRM) to alleviate feature entanglement: it imposes different constraints on features at different scales and drives the network toward discriminative, modality-irrelevant features. Extensive experiments on our RX01 dataset show that CMMFC achieves state-of-the-art performance.
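The cross-modality multi-scale correspondence idea can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes feature pyramids are given as NumPy arrays of shape (C, H, W), uses cosine similarity with a softmax to build a soft correspondence between spatial locations of the two modalities, and averages an alignment error over scales. The function names (`correspondence`, `multi_scale_alignment_loss`) are hypothetical.

```python
import numpy as np

def correspondence(feat_v, feat_x):
    """Soft correspondence between visible and X-ray feature maps of shape (C, H, W).

    Each visible-modality location attends over all X-ray locations via
    cosine similarity, yielding X-ray features warped into the visible layout.
    """
    C, H, W = feat_v.shape
    v = feat_v.reshape(C, -1)  # (C, HW)
    x = feat_x.reshape(C, -1)  # (C, HW)
    # L2-normalize channel vectors so the dot product is a cosine similarity
    v = v / (np.linalg.norm(v, axis=0, keepdims=True) + 1e-8)
    x = x / (np.linalg.norm(x, axis=0, keepdims=True) + 1e-8)
    sim = v.T @ x  # (HW, HW) pairwise cosine similarities
    # Softmax over X-ray locations (numerically stabilized)
    attn = np.exp(sim - sim.max(axis=1, keepdims=True))
    attn = attn / attn.sum(axis=1, keepdims=True)
    aligned = x @ attn.T  # (C, HW): X-ray features aligned to visible locations
    return aligned.reshape(C, H, W)

def multi_scale_alignment_loss(pyramid_v, pyramid_x):
    """Mean squared alignment error, averaged over all pyramid scales."""
    losses = []
    for fv, fx in zip(pyramid_v, pyramid_x):
        aligned = correspondence(fv, fx)
        losses.append(np.mean((fv - aligned) ** 2))
    return float(np.mean(losses))
```

Computing the correspondence at several scales is what lets coarse levels absorb large deformations while fine levels keep local detail; the actual model additionally applies the scale-dependent constraints of the FRM, which this sketch omits.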
Availability of data and materials
No new data were generated in this study. The proposed method was evaluated on the available RX01 dataset (Chan, S., Cui, J., Wu, Y., Wang, H., Bai, C.: "Visible-Xray Cross-Modality Package Re-Identification," 2023 IEEE International Conference on Multimedia and Expo (ICME), Brisbane, Australia, 2023, pp. 2579–2584, doi: 10.1109/ICME55011.2023.00439).
References
Brown, A., Xie, W., Kalogeiton, V., et al.: Smooth-ap: smoothing the path towards large-scale image retrieval. In: European Conference on Computer Vision, Springer, pp 677–694 (2020)
Brox, T., Malik, J.: Large displacement optical flow: descriptor matching in variational motion estimation. IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 500–513 (2010)
Chan, S., Cui, J., Wu, Y., et al.: Visible-xray cross-modality package re-identification. In: IEEE International Conference on Multimedia and Expo, ICME 2023, Brisbane, Australia, July 10–14, 2023. IEEE, pp 2579–2584 (2023)
Chen, Y., Wan, L., Li, Z., et al.: Neural feature search for rgb-infrared person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 587–597 (2021)
Choi, S., Lee, S., Kim, Y., et al.: Hi-cmd: hierarchical cross-modality disentanglement for visible-infrared person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 10257–10266 (2020)
Dosovitskiy, A., Fischer, P., Ilg, E., et al.: Flownet: learning optical flow with convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2758–2766 (2015)
Fan, X., Jiang, W., Luo, H., et al.: Modality-transfer generative adversarial network and dual-level unified latent representation for visible thermal person re-identification. Vis. Comput. 38, 279–294 (2020)
Gao, Y., Liang, T., Jin, Y., et al.: Mso: multi-feature space joint optimization network for rgb-infrared person re-identification. In: Proceedings of the 29th ACM International Conference on Multimedia, pp 5257–5265 (2021)
He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778 (2016)
Hermans, A., Beyer, L., Leibe, B.: In defense of the triplet loss for person re-identification. arXiv preprint arXiv:1703.07737 (2017)
Li, D., Wei, X., Hong, X., et al.: Infrared-visible cross-modal person re-identification with an x modality. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 4610–4617 (2020)
Lin, T.Y., Dollár, P., Girshick, R., et al.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2117–2125 (2017)
Liu, C., Yuen, J., Torralba, A.: Sift flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 978–994 (2010)
Liu, H., Tan, X., Zhou, X.: Parameter sharing exploration and hetero-center triplet loss for visible-thermal person re-identification. IEEE Trans. Multimed. 23, 4414–4425 (2020)
Lu, H., Zou, X., Zhang, P.: Learning progressive modality-shared transformers for effective visible-infrared person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 1835–1843 (2023)
Mery, D., Riffo, V., Zscherpel, U., et al.: Gdxray: the database of x-ray images for nondestructive testing. J. Nondestr. Eval. 34(4), 42 (2015)
Miao, C., Xie, L., Wan, F., et al.: Sixray: a large-scale security inspection x-ray benchmark for prohibited item discovery in overlapping images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 2119–2128 (2019)
Park, H., Lee, S., Lee, J., et al.: Learning by aligning: visible-infrared person re-identification using cross-modal correspondences. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 12046–12055 (2021)
Pu, N., Chen, W., Liu, Y., et al.: Dual gaussian-based variational subspace disentanglement for visible-infrared person re-identification. In: Proceedings of the 28th ACM International Conference on Multimedia, pp 2149–2158 (2020)
Sun, H., Liu, J., Zhang, Z., et al.: Not all pixels are matched: dense contrastive learning for cross-modality person re-identification. In: Proceedings of the 30th ACM International Conference on Multimedia, pp 5333–5341 (2022)
Sun, Y., Zheng, L., Yang, Y., et al.: Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In: Proceedings of the European Conference on Computer Vision (ECCV), pp 480–496 (2018)
Szeliski, R.: Image alignment and stitching: a tutorial. Found Trends Comput Graph Vis 2(1), 1–104 (2006)
Tao, R., Wei, Y., Jiang, X., et al.: Towards real-world x-ray security inspection: a high-quality benchmark and lateral inhibition module for prohibited items detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 10923–10932 (2021)
Wang, G., Yuan, Y., Chen, X., et al.: Learning discriminative features with multiple granularities for person re-identification. In: Proceedings of the 26th ACM International Conference on Multimedia, pp 274–282 (2018)
Wang, G., Zhang, T., Cheng, J., et al.: Rgb-infrared cross-modality person re-identification via joint pixel and feature alignment. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 3623–3632 (2019)
Wang, G.A., Zhang, T., Yang, Y., et al.: Cross-modality paired-images generation for rgb-infrared person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp 12144–12151 (2020)
Wei, X., Li, D., Hong, X., et al.: Co-attentive lifting for infrared-visible person re-identification. In: Proceedings of the 28th ACM International Conference on Multimedia, pp 1028–1037 (2020)
Wei, Y., Tao, R., Wu, Z., et al.: Occluded prohibited items detection: An x-ray security inspection benchmark and de-occlusion attention module. In: Proceedings of the 28th ACM International Conference on Multimedia, pp 138–146 (2020)
Wu, A., Zheng, W.S., Yu, H.X., et al.: Rgb-infrared cross-modality person re-identification. In: Proceedings of the IEEE International Conference on Computer Vision, pp 5380–5389 (2017)
Wu, Q., Dai, P., Chen, J., et al.: Discover cross-modality nuances for visible-infrared person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 4330–4339 (2021)
Yang, M., Huang, Z., Hu, P., et al.: Learning with twin noisy labels for visible-infrared person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 14308–14317 (2022)
Ye, H., Liu, H., Meng, F., et al.: Bi-directional exponential angular triplet loss for rgb-infrared person re-identification. IEEE Trans. Image Process. 30, 1583–1595 (2020)
Ye, M., Lan, X., Li, J., et al.: Hierarchical discriminative learning for visible thermal person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence (2018)
Ye, M., Shen, J., Crandall, D.J., et al.: Dynamic dual-attentive aggregation learning for visible-infrared person re-identification. In: European Conference on Computer Vision, Springer, pp 229–247 (2020)
Ye, M., Shen, J., Lin, G., et al.: Deep learning for person re-identification: a survey and outlook. IEEE Trans. Pattern Anal. Mach. Intell. 44(6), 2872–2893 (2022)
Zhang, Y., Wang, H.: Diverse embedding expansion network and low-light cross-modality benchmark for visible-infrared person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 2153–2162 (2023)
Zhou, B., Khosla, A., Lapedriza, A., et al.: Learning deep features for discriminative localization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2921–2929 (2016)
Zhu, Y., Yang, Z., Wang, L., et al.: Hetero-center loss for cross-modality person re-identification. Neurocomputing 386, 97–109 (2020)
Acknowledgements
This work is partially supported by the Zhejiang Provincial Natural Science Foundation of China (No. LY23F020023), the Anhui Key Laboratory of Bionic Sensing and Advanced Robot Technology Project (AHFS2024KF04), and the National Natural Science Foundation of China under Grants U20A20196 and 61906168.
Author information
Authors and Affiliations
Contributions
All authors reviewed the manuscript. Conceptualization: Sixian Chan, Jiaao Cui and Hongqiang Wang. Investigation: Sixian Chan, Jiaao Cui, Yonggan Wu, Hongqiang Wang and Cong Bai. Software and validation: Jiaao Cui, Yonggan Wu and Sixian Chan. Writing—original draft preparation: Sixian Chan, Jiaao Cui and Hongqiang Wang. Formal analysis: Sixian Chan, Jiaao Cui, Yonggan Wu, Cong Bai and Hongqiang Wang. Funding acquisition: Cong Bai and Sixian Chan. Prepared figures: Sixian Chan, Jiaao Cui and Yonggan Wu. Interpretation of data: Sixian Chan, Jiaao Cui, Yonggan Wu and Hongqiang Wang.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Communicated by Bing-kun Bao.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chan, S., Cui, J., Wu, Y. et al. Multi-scale feature correspondence and restriction mechanism for visible X-ray baggage re-identification. Multimedia Systems 30, 315 (2024). https://doi.org/10.1007/s00530-024-01513-7