Abstract
The Multi-view Convolutional Neural Network (MVCNN) has achieved considerable success in 3D shape recognition. However, 3D shape recognition using view-images from random viewpoints has not yet been explored in depth. In addition, 3D shape recognition using a small number of view-images remains difficult. To tackle these challenges, we developed a novel Multi-view Convolutional Neural Network, “Latent-MVCNN” (LMVCNN), that recognizes 3D shapes using multiple view-images from pre-defined or random viewpoints. The LMVCNN consists of three types of sub-networks, each a Convolutional Neural Network (CNN). For each view-image, the first type of CNN outputs multiple category probability distributions, and the second type of CNN outputs a latent vector that helps the first type of CNN choose the most suitable distribution. The third type of CNN outputs the transition probabilities from the category probability distributions of one view to the category probability distributions of another view, which further helps the LMVCNN find the most suitable category probability distributions for each pair of view-images. The three CNNs cooperate with each other to obtain satisfactory classification scores. Our experimental results show that the LMVCNN achieves competitive performance in 3D shape recognition on ModelNet10 and ModelNet40 for both pre-defined and random viewpoints, and exhibits promising performance when the number of view-images is quite small.
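The interplay between the first two sub-networks can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the latent vector selects a distribution or how views are fused, so the softmax-weighted mixture, the sum of log-probabilities across views, and all names (`softmax`, `view_distribution`, `classify`) are assumptions made for illustration. The third (transition) sub-network is omitted entirely, and the numbers are made-up stand-ins for CNN outputs.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def view_distribution(candidates, latent):
    """Mix one view's candidate category distributions.

    candidates: list of M distributions, each a list of C probabilities
                (stand-in for the first CNN's output).
    latent:     list of M scores (stand-in for the second CNN's output).
    Assumption: the latent scores act as soft selection weights.
    """
    weights = softmax(latent)
    n_cat = len(candidates[0])
    return [
        sum(w * cand[c] for w, cand in zip(weights, candidates))
        for c in range(n_cat)
    ]


def classify(views):
    """Fuse all views by summing log-probabilities (an assumed fusion rule)."""
    n_cat = len(views[0][0][0])
    log_scores = [0.0] * n_cat
    for candidates, latent in views:
        p = view_distribution(candidates, latent)
        for c in range(n_cat):
            log_scores[c] += math.log(p[c] + 1e-12)
    label = max(range(n_cat), key=lambda c: log_scores[c])
    return label, log_scores


# Two views, two candidate distributions per view, three categories.
views = [
    ([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]], [2.0, 0.0]),
    ([[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]], [1.0, -1.0]),
]
label, scores = classify(views)
print(label, scores)
```

In this toy input both latent vectors favour the first candidate distribution of their view, so the fused scores favour category 0. A hard selection (taking only the highest-weighted candidate) would be an equally plausible reading of the abstract.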
Acknowledgements
We would like to thank the anonymous reviewers for their helpful suggestions. This work was supported by the Natural Science Fund for Colleges and Universities in Jiangsu Province (Grant No. 18KJB520013), the Dual Creative Doctors of Jiangsu Province, the National Natural Science Foundation of China (Grant Nos. 61902159 and 61771146), the Zhejiang Provincial Natural Science Foundation of China (Grant No. LQ19F020003), and the Qing Lan Project of Jiangsu Province.
Cite this article
Yu, Q., Yang, C., Fan, H. et al. Latent-MVCNN: 3D Shape Recognition Using Multiple Views from Pre-defined or Random Viewpoints. Neural Process Lett 52, 581–602 (2020). https://doi.org/10.1007/s11063-020-10268-x