



3D CVT-GAN: A 3D Convolutional Vision Transformer-GAN for PET Reconstruction

  • Conference paper
  • First Online:
Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (MICCAI 2022)

Abstract

To obtain high-quality positron emission tomography (PET) scans while reducing the radiation hazard to patients, various generative adversarial network (GAN)-based methods have been developed to reconstruct high-quality standard-dose PET (SPET) images from low-dose PET (LPET) images. However, due to the intrinsic locality of the convolution operator, these methods fail to explore the global context of the entire 3D PET image. In this paper, we propose a novel 3D convolutional vision transformer GAN framework, named 3D CVT-GAN, for SPET reconstruction from LPET images. Specifically, we design a generator with a hierarchical structure that uses multiple 3D CVT blocks as the encoder for feature extraction and multiple 3D transposed CVT (TCVT) blocks as the decoder for SPET restoration, capturing both local spatial features and global context at different network layers. Unlike the vanilla 2D vision transformer, which uses linear embedding and projection, our 3D CVT and TCVT blocks employ 3D convolutional embedding and projection, allowing the model to overcome the semantic ambiguity caused by the attention mechanism and to better preserve spatial detail. In addition, residual learning and a patch-based discriminator embedded with 3D CVT blocks are added inside and after the generator, facilitating the training process while mining more discriminative feature representations. Validation on a clinical PET dataset shows that the proposed 3D CVT-GAN outperforms state-of-the-art methods both qualitatively and quantitatively, with minimal parameters.
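The core idea the abstract describes, replacing the vision transformer's linear patch embedding with a 3D convolutional one before applying self-attention, can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy, not the authors' code: a strided 3D convolution whose kernel size equals its stride is mathematically equivalent to cutting the volume into non-overlapping patches and applying one linear map, which is what `conv3d_embed` below implements, followed by a single-head self-attention step that gives every 3D patch token a global receptive field.

```python
# Toy sketch (NOT the paper's implementation) of a 3D convolutional
# embedding followed by global self-attention over the resulting tokens.
import numpy as np

def conv3d_embed(vol, patch=4, dim=16, rng=None):
    """Non-overlapping 3D convolutional embedding: a strided conv with
    kernel_size == stride equals patch reshape + one linear projection."""
    rng = rng or np.random.default_rng(0)
    D, H, W = vol.shape
    p = patch
    # Cut the volume into (p, p, p) patches and flatten each into a vector.
    patches = (vol.reshape(D // p, p, H // p, p, W // p, p)
                  .transpose(0, 2, 4, 1, 3, 5)
                  .reshape(-1, p ** 3))
    W_emb = rng.standard_normal((p ** 3, dim)) / np.sqrt(p ** 3)
    return patches @ W_emb                     # (num_tokens, dim)

def self_attention(tokens):
    """Single-head scaled dot-product self-attention (queries, keys and
    values share one projection here for brevity); each output token mixes
    information from every patch in the volume, i.e. a global context."""
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # rows sum to 1
    return attn @ tokens

vol = np.random.default_rng(1).standard_normal((8, 8, 8))
tokens = conv3d_embed(vol, patch=4, dim=16)      # 2*2*2 = 8 tokens
out = self_attention(tokens)
print(tokens.shape, out.shape)                   # (8, 16) (8, 16)
```

In the paper's actual blocks the convolutional projections are learned, overlapping, and stacked hierarchically; the sketch only shows why a conv-based embedding preserves local 3D structure while attention supplies the global context that plain convolutions lack.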



Acknowledgement

This work is supported by the National Natural Science Foundation of China (NSFC 62071314).

Author information


Correspondence to Dinggang Shen or Yan Wang.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zeng, P. et al. (2022). 3D CVT-GAN: A 3D Convolutional Vision Transformer-GAN for PET Reconstruction. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13436. Springer, Cham. https://doi.org/10.1007/978-3-031-16446-0_49

  • DOI: https://doi.org/10.1007/978-3-031-16446-0_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16445-3

  • Online ISBN: 978-3-031-16446-0

  • eBook Packages: Computer Science, Computer Science (R0)
