Abstract
To obtain high-quality positron emission tomography (PET) scans while reducing the radiation hazard to patients, various generative adversarial network (GAN)-based methods have been developed to reconstruct high-quality standard-dose PET (SPET) images from low-dose PET (LPET) images. However, due to the intrinsic locality of the convolution operator, these methods fail to exploit the global context of the entire 3D PET image. In this paper, we propose a novel 3D convolutional vision transformer GAN framework, named 3D CVT-GAN, for SPET reconstruction from LPET images. Specifically, we design a generator with a hierarchical structure that uses multiple 3D CVT blocks as the encoder for feature extraction and multiple 3D transposed CVT (TCVT) blocks as the decoder for SPET restoration, capturing both local spatial features and global context at different network layers. Unlike the vanilla 2D vision transformer, which uses linear embedding and projection, our 3D CVT and TCVT blocks employ 3D convolutional embedding and projection, allowing the model to overcome the semantic ambiguity caused by the attention mechanism and better preserve spatial details. In addition, residual learning and a patch-based discriminator embedded with 3D CVT blocks are added inside and after the generator, facilitating training while mining more discriminative feature representations. Validation on a clinical PET dataset shows that the proposed 3D CVT-GAN outperforms state-of-the-art methods both qualitatively and quantitatively with minimal parameters.
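The paper's 3D CVT blocks are not reproduced here, but their core idea, replacing the vision transformer's linear patch embedding with a strided 3D convolution and then applying self-attention over the resulting spatial tokens, can be sketched in plain NumPy. All shapes, kernel sizes, and randomly initialized weights below are illustrative assumptions for a toy volume, not the paper's actual settings (which use learned weights, multi-head attention, and convolutional Q/K/V projections):

```python
import numpy as np

def conv3d_embed(x, w, stride):
    """Strided 3D convolutional token embedding (valid padding).
    x: volume (D, H, W, Cin); w: kernel (k, k, k, Cin, Cout)."""
    k = w.shape[0]
    D, H, W, _ = x.shape
    cout = w.shape[-1]
    od, oh, ow = [(s - k) // stride + 1 for s in (D, H, W)]
    out = np.zeros((od, oh, ow, cout))
    for i in range(od):
        for j in range(oh):
            for l in range(ow):
                patch = x[i*stride:i*stride+k,
                          j*stride:j*stride+k,
                          l*stride:l*stride+k]
                # contract the (k, k, k, Cin) patch against the kernel
                out[i, j, l] = np.tensordot(patch, w, axes=4)
    return out

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention over flattened tokens."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # row-wise softmax
    return attn @ v

rng = np.random.default_rng(0)
vol = rng.normal(size=(8, 8, 8, 1))                 # toy 8^3 single-channel volume
w_embed = 0.1 * rng.normal(size=(2, 2, 2, 1, 16))   # 2^3 patches -> 16-dim tokens
tok = conv3d_embed(vol, w_embed, stride=2)          # (4, 4, 4, 16)
tokens = tok.reshape(-1, 16)                        # 64 spatial tokens
wq, wk, wv = (0.1 * rng.normal(size=(16, 16)) for _ in range(3))
out = self_attention(tokens, wq, wk, wv)
print(out.shape)  # (64, 16): every token now attends to the whole volume
```

Because the tokens are produced by a convolution, each one already summarizes a local 3D neighborhood, while the attention step mixes information across all 64 token positions, which is the local-plus-global combination the abstract describes.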
Acknowledgement
This work is supported by the National Natural Science Foundation of China (NSFC 62071314).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zeng, P. et al. (2022). 3D CVT-GAN: A 3D Convolutional Vision Transformer-GAN for PET Reconstruction. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13436. Springer, Cham. https://doi.org/10.1007/978-3-031-16446-0_49
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16445-3
Online ISBN: 978-3-031-16446-0
eBook Packages: Computer Science, Computer Science (R0)