Abstract
Semantic segmentation of tongue images is a key problem in the modernization of TCM (Traditional Chinese Medicine), and a large body of research is dedicated to it. Although deep learning has improved tongue segmentation performance, generalizing a trained model to diverse test domains remains a major challenge. It is well known that the less consistent the data distributions of the source and target domains are, the lower the model's performance in the target domain. Existing supervised semantic segmentation methods struggle in this setting when the target-domain tongue images on which the model generalizes poorly cannot be re-labeled. To address this problem, we design an adversarial training framework that regularizes entropy on the target domain, aiming to enforce high certainty in the model's predictions on the target domain while the domains are being aligned. Specifically, we pre-train the tongue image segmentation model on the source domain with a deep supervised method. In addition to the segmentation task, the segmentation model must regularize the entropy of its output on the target domain and maximally confuse the discriminator. The discriminator tries to distinguish whether an output of the segmentation model comes from the source domain or the target domain. In this study, two datasets are constructed, and five-fold cross-validation experiments are performed on them. Experimental results show that tongue image segmentation performance in the open environment improved by 21.5 percentage points of mIOU (59.2% → 80.7%) after domain adaptation. Compared with pseudo-label learning with thresholds of 0.6 and 0.9, the mIOU of the proposed method is higher by 17% and 16.1%, respectively. Moreover, compared with MinEnt, the mIOU is higher by 6%.
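The two competing objectives described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes per-pixel softmax outputs, a discriminator that returns the probability that a prediction map came from the source domain, and loss weights `lambda_ent` and `lambda_adv` whose values are placeholders. All function names are invented for the example.

```python
import math

EPS = 1e-12  # keeps log() finite in the binary cross-entropy

def pixel_entropy(probs):
    """Shannon entropy of one pixel's class distribution, divided by log(C)."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def entropy_loss(pred):
    """Mean normalised entropy over a prediction map.

    `pred` is a list of per-pixel class-probability lists.
    """
    return sum(pixel_entropy(p) for p in pred) / len(pred)

def bce(prob, label):
    """Binary cross-entropy for a single discriminator output."""
    return -(label * math.log(prob + EPS)
             + (1 - label) * math.log(1 - prob + EPS))

def segmenter_target_loss(pred_target, d_out_target,
                          lambda_ent=0.01, lambda_adv=0.001):
    """Loss the segmentation model minimises on an unlabeled target image.

    d_out_target is the discriminator's probability that the target
    prediction came from the source domain. The segmenter pushes it
    towards 1 (label 1 = "source") to confuse the discriminator.
    Both weights are assumed values, not taken from the paper.
    """
    return (lambda_ent * entropy_loss(pred_target)
            + lambda_adv * bce(d_out_target, 1))

def discriminator_loss(d_out_source, d_out_target):
    """Loss the discriminator minimises: source -> 1, target -> 0."""
    return bce(d_out_source, 1) + bce(d_out_target, 0)

confident = [[0.99, 0.01], [0.02, 0.98]]   # low-entropy prediction
uncertain = [[0.5, 0.5], [0.6, 0.4]]       # high-entropy prediction
print(entropy_loss(confident) < entropy_loss(uncertain))  # True
```

In a real training loop the two losses would be minimised alternately, alongside the supervised segmentation loss on labeled source images, and the discriminator would be a network rather than a scalar input.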
The cross-domain tongue image segmentation method proposed in this paper significantly improves segmentation accuracy in the unlabeled target domain by reducing the influence of the cross-domain discrepancy and enhancing the certainty of the model's output in the target domain.
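For readers unfamiliar with the metric, mIOU (mean intersection over union) averages the per-class overlap between predicted and ground-truth masks. The sketch below is an assumption-laden illustration: it treats the task as two classes (0 = background, 1 = tongue) over flattened label lists, and the helper name `mean_iou` is hypothetical. The abstract does not state how the paper averages over classes.

```python
def mean_iou(pred, truth, num_classes=2):
    """Mean intersection-over-union across classes present in pred or truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

truth = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
print(mean_iou(pred, truth))  # 0.5
```

Here each class has 2 pixels of overlap out of a union of 4, so both per-class IoUs are 0.5 and so is their mean.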
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, L., Zhang, S., Zhao, X. (2023). Cross-domain Tongue Image Segmentation Based on Deep Adversarial Networks and Entropy Minimization. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14359. Springer, Cham. https://doi.org/10.1007/978-3-031-46317-4_11
DOI: https://doi.org/10.1007/978-3-031-46317-4_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46316-7
Online ISBN: 978-3-031-46317-4
eBook Packages: Computer Science, Computer Science (R0)