SAT-Morph: Unsupervised Deformable Medical Image Registration Using Vision Foundation Models with Anatomically Aware Text Prompt

  • Conference paper
  • First Online:
Foundation Models for General Medical AI (MedAGI 2024)

Abstract

Current unsupervised deformable medical image registration methods rely on image similarity measures. However, these methods are inherently limited by the difficulty of integrating important anatomical knowledge into the registration process. Vision foundation models such as the Segment Anything Model (SAM) have attracted attention for their strong image segmentation capabilities, and medical adaptations of SAM align medical text knowledge with visual knowledge, enabling precise segmentation of organs. In this study, we propose a novel approach that leverages a vision foundation model to enhance medical image registration by integrating its anatomical understanding into the registration model. Specifically, we propose an unsupervised deformable medical image registration framework, called SAT-Morph, which comprises a Segment Anything with Text prompts (SAT) module and a mask registration module. In the SAT module, the medical vision foundation model segments anatomical regions within both the moving and fixed images according to our designed text prompts. In the mask registration module, these segmentation masks, rather than the traditionally used image pair, serve as the input to the registration model. Compared with using image pairs as input, using segmentation mask pairs incorporates anatomical knowledge and improves registration performance. Experiments demonstrate that SAT-Morph significantly outperforms existing state-of-the-art methods on both the Abdomen CT and ACDC cardiac MRI datasets. These results illustrate the effectiveness of integrating vision foundation models into medical image registration and point toward more accurate, anatomically aware registration. Our code is available at https://github.com/HaoXu0507/SAT-Morph/.
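The framework described in the abstract reduces to a two-stage pipeline: text-prompted segmentation of both images, followed by deformable registration driven by the resulting mask pair. The sketch below illustrates that flow in PyTorch. It is a minimal illustration under stated assumptions, not the authors' released implementation (see the GitHub repository for that); `sat_segment`, `reg_net`, and `ANATOMY_PROMPTS` are hypothetical placeholders for the text-promptable segmentation foundation model, a VoxelMorph-style registration network, and the designed text prompts.

```python
import torch
import torch.nn.functional as F

# Example text prompts naming anatomical structures to segment;
# the prompts actually used by the paper may differ.
ANATOMY_PROMPTS = ["liver", "spleen", "left kidney", "right kidney"]


def sat_segment(image: torch.Tensor, prompts: list[str]) -> torch.Tensor:
    """Placeholder for the SAT module: a text-promptable segmentation
    foundation model returning a one-hot mask volume of shape
    (1, len(prompts), D, H, W) for a (1, 1, D, H, W) input image."""
    raise NotImplementedError


def register_with_masks(moving_img, fixed_img, reg_net):
    # 1) SAT module: segment the same anatomical structures in both the
    #    moving and fixed images using the shared text prompts.
    moving_mask = sat_segment(moving_img, ANATOMY_PROMPTS)
    fixed_mask = sat_segment(fixed_img, ANATOMY_PROMPTS)

    # 2) Mask registration module: a VoxelMorph-style network predicts a
    #    dense displacement field from the concatenated mask pair instead
    #    of the raw image pair, so anatomy drives the deformation.
    flow = reg_net(torch.cat([moving_mask, fixed_mask], dim=1))  # (1, 3, D, H, W)

    # 3) Spatial transformer: warp the moving image with the predicted field.
    #    Displacements are assumed to be in voxels, channel order (x, y, z).
    d, h, w = moving_img.shape[2:]
    identity = F.affine_grid(torch.eye(3, 4).unsqueeze(0),
                             list(moving_img.shape),
                             align_corners=True)           # (1, D, H, W, 3)
    scale = torch.tensor([2.0 / (w - 1), 2.0 / (h - 1), 2.0 / (d - 1)])
    grid = identity + flow.permute(0, 2, 3, 4, 1) * scale   # to [-1, 1] coords
    warped = F.grid_sample(moving_img, grid, align_corners=True)
    return warped, flow
```

In such a mask-driven setup, the registration network would typically be trained without labels on the raw images, for example with an overlap term (e.g., Dice) between the warped moving mask and the fixed mask plus a smoothness penalty on the flow; the exact objective here is the paper's choice and is not reproduced in this sketch.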

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Weidong Cai.

Editor information

Editors and Affiliations

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 146 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Xu, H. et al. (2025). SAT-Morph: Unsupervised Deformable Medical Image Registration Using Vision Foundation Models with Anatomically Aware Text Prompt. In: Deng, Z., et al. Foundation Models for General Medical AI. MedAGI 2024. Lecture Notes in Computer Science, vol 15184. Springer, Cham. https://doi.org/10.1007/978-3-031-73471-7_8


  • DOI: https://doi.org/10.1007/978-3-031-73471-7_8

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-73470-0

  • Online ISBN: 978-3-031-73471-7

  • eBook Packages: Computer Science, Computer Science (R0)
