Abstract
To obtain high-quality positron emission tomography (PET) scans while reducing the radiation hazard to patients, various generative adversarial network (GAN)-based methods have been developed to reconstruct high-quality standard-dose PET (SPET) images from low-dose PET (LPET) images. However, due to the intrinsic locality of the convolution operator, these methods fail to exploit the global context of the entire 3D PET image. In this paper, we propose a novel 3D convolutional vision transformer GAN framework, named 3D CVT-GAN, for SPET reconstruction from LPET images. Specifically, we design a generator with a hierarchical structure that uses multiple 3D CVT blocks as the encoder for feature extraction and multiple 3D transposed CVT (TCVT) blocks as the decoder for SPET restoration, capturing both local spatial features and global context at different network layers. Unlike the vanilla 2D vision transformer, which uses linear embedding and projection, our 3D CVT and TCVT blocks employ 3D convolutional embedding and projection, allowing the model to overcome the semantic ambiguity caused by the attention mechanism and better preserve spatial details. In addition, residual learning and a patch-based discriminator embedded with 3D CVT blocks are added inside and after the generator, facilitating training while mining more discriminative feature representations. Validation on a clinical PET dataset shows that the proposed 3D CVT-GAN outperforms state-of-the-art methods both qualitatively and quantitatively with minimal parameters.
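The paper's 3D CVT blocks are not reproduced here, but their core idea, replacing the vision transformer's linear patch embedding with a strided 3D convolution and then applying self-attention over the resulting spatial tokens, can be sketched in plain NumPy. All shapes, kernel sizes, and randomly initialized weights below are illustrative assumptions for a toy volume, not the paper's actual settings (which use learned weights, multi-head attention, and convolutional Q/K/V projections):

```python
import numpy as np

def conv3d_embed(x, w, stride):
    """Strided 3D convolutional token embedding (valid padding).
    x: volume (D, H, W, Cin); w: kernel (k, k, k, Cin, Cout)."""
    k = w.shape[0]
    D, H, W, _ = x.shape
    cout = w.shape[-1]
    od, oh, ow = [(s - k) // stride + 1 for s in (D, H, W)]
    out = np.zeros((od, oh, ow, cout))
    for i in range(od):
        for j in range(oh):
            for l in range(ow):
                patch = x[i*stride:i*stride+k,
                          j*stride:j*stride+k,
                          l*stride:l*stride+k]
                # contract the (k, k, k, Cin) patch against the kernel
                out[i, j, l] = np.tensordot(patch, w, axes=4)
    return out

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over flattened tokens."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # row-wise softmax
    return attn @ v

rng = np.random.default_rng(0)
vol = rng.normal(size=(8, 8, 8, 1))                 # toy 8^3 single-channel volume
w_embed = 0.1 * rng.normal(size=(2, 2, 2, 1, 16))   # 2^3 patches -> 16-dim tokens
tok = conv3d_embed(vol, w_embed, stride=2)          # (4, 4, 4, 16)
tokens = tok.reshape(-1, 16)                        # 64 spatial tokens
wq, wk, wv = (0.1 * rng.normal(size=(16, 16)) for _ in range(3))
out = self_attention(tokens, wq, wk, wv)
print(out.shape)  # (64, 16): every token now attends to the whole volume
```

Because the tokens are produced by a convolution, each one already summarizes a local 3D neighborhood, while the attention step mixes information across all 64 token positions, which is the local-plus-global combination the abstract describes.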
Acknowledgement
This work is supported by the National Natural Science Foundation of China (NSFC 62071314).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zeng, P. et al. (2022). 3D CVT-GAN: A 3D Convolutional Vision Transformer-GAN for PET Reconstruction. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13436. Springer, Cham. https://doi.org/10.1007/978-3-031-16446-0_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16445-3
Online ISBN: 978-3-031-16446-0
eBook Packages: Computer Science, Computer Science (R0)