Link to original content: https://api.crossref.org/works/10.3390/RS11050487
{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,23]],"date-time":"2024-07-23T06:36:27Z","timestamp":1721716587394},"reference-count":32,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2019,2,27]],"date-time":"2019-02-27T00:00:00Z","timestamp":1551225600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"Refining raw disparity maps from different algorithms to exploit their complementary advantages is still challenging. Uncertainty estimation and complex disparity relationships among pixels limit the accuracy and robustness of existing methods and there is no standard method for fusion of different kinds of depth data. In this paper, we introduce a new method to fuse disparity maps from different sources, while incorporating supplementary information (intensity, gradient, etc.) into a refiner network to better refine raw disparity inputs. A discriminator network classifies disparities at different receptive fields and scales. Assuming a Markov Random Field for the refined disparity map produces better estimates of the true disparity distribution. Both fully supervised and semi-supervised versions of the algorithm are proposed. The approach includes a more robust loss function to inpaint invalid disparity values and requires much less labeled data to train in the semi-supervised learning mode. The algorithm can be generalized to fuse depths from different kinds of depth sources. Experiments explored different fusion opportunities: stereo-monocular fusion, stereo-ToF fusion and stereo-stereo fusion. 
The experiments show the superiority of the proposed algorithm compared with the most recent algorithms on public synthetic datasets (Scene Flow, SYNTH3, our synthetic garden dataset) and real datasets (Kitti2015 dataset and Trimbot2020 Garden dataset).","DOI":"10.3390\/rs11050487","type":"journal-article","created":{"date-parts":[[2019,2,27]],"date-time":"2019-02-27T16:41:03Z","timestamp":1551285663000},"page":"487","source":"Crossref","is-referenced-by-count":6,"title":["SDF-MAN: Semi-Supervised Disparity Fusion with Multi-Scale Adversarial Networks"],"prefix":"10.3390","volume":"11","author":[{"given":"Can","family":"Pu","sequence":"first","affiliation":[{"name":"School of Informatics, University of Edinburgh, Edinburgh EH8 9BT, UK"}]},{"given":"Runzi","family":"Song","sequence":"additional","affiliation":[{"name":"Department of Mechanical Engineering, Tsinghua University, Beijing 100084, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-6020-1141","authenticated-orcid":false,"given":"Radim","family":"Tylecek","sequence":"additional","affiliation":[{"name":"School of Informatics, University of Edinburgh, Edinburgh EH8 9BT, UK"}]},{"given":"Nanbo","family":"Li","sequence":"additional","affiliation":[{"name":"School of Informatics, University of Edinburgh, Edinburgh EH8 9BT, UK"}]},{"ORCID":"http:\/\/orcid.org\/0000-0001-6860-9371","authenticated-orcid":false,"given":"Robert B.","family":"Fisher","sequence":"additional","affiliation":[{"name":"School of Informatics, University of Edinburgh, Edinburgh EH8 9BT, UK"}]}],"member":"1968","published-online":{"date-parts":[[2019,2,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zanuttigh, P., Marin, G., Dal Mutto, C., Dominio, F., Minto, L., and Cortelazzo, G.M. (2016). Time-Of-Flight and Structured Light Depth Cameras, Springer.","DOI":"10.1007\/978-3-319-30973-6"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Chang, J.R., and Chen, Y.S. (2018, January 18\u201322). 
Pyramid Stereo Matching Network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00567"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Honegger, D., Sattler, T., and Pollefeys, M. (June, January 29). Embedded real-time multi-baseline stereo. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989615"},{"key":"ref_4","unstructured":"Mayer, N., Ilg, E., Hausser, P., Fischer, P., Cremers, D., Dosovitskiy, A., and Brox, T. (July, January 26). A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Horna, L., and Fisher, R.B. (2017, January 27). 3D Plane Labeling Stereo Matching with Content Aware Adaptive Windows. Proceedings of the VISIGRAPP (6: VISAPP), Porto, Portugal.","DOI":"10.5220\/0006105401620171"},{"key":"ref_6","unstructured":"Hirschmuller, H. (2005, January 20\u201326). Accurate and efficient stereo processing by semi-global matching and mutual information. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), San Diego, CA, USA."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Godard, C., Mac Aodha, O., and Brostow, G.J. (2017, January 21\u201326). Unsupervised monocular depth estimation with left-right consistency. Proceedings of the CVPR, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.699"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Nair, R., Ruhl, K., Lenzen, F., Meister, S., Sch\u00e4fer, H., Garbe, C.S., Eisemann, M., Magnor, M., and Kondermann, D. (2013). A survey on time-of-flight stereo fusion. Time-of-Flight and Depth Imaging. 
Sensors, Algorithms, and Applications, Springer.","DOI":"10.1007\/978-3-642-44964-2_6"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2260","DOI":"10.1109\/TPAMI.2015.2408361","article-title":"Probabilistic tof and stereo data fusion based on mixed pixels measurement models","volume":"37","author":"Zanuttigh","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Marin, G., Zanuttigh, P., and Mattoccia, S. (2016, January 8\u201316). Reliable fusion of tof and stereo depth driven by confidence measures. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46478-7_24"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Agresti, G., Minto, L., Marin, G., and Zanuttigh, P. (2017, January 22\u201329). Deep Learning for Confidence Information in Stereo and ToF Data Fusion. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Venice, Italy.","DOI":"10.1109\/ICCVW.2017.88"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1994","DOI":"10.1109\/LRA.2017.2715400","article-title":"Single-View and Multi-View Depth Fusion","volume":"2","author":"Concha","year":"2017","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Poggi, M., and Mattoccia, S. (2016, January 25\u201328). Deep stereo fusion: combining multiple disparity hypotheses with deep-learning. Proceedings of the 2016 Fourth International Conference on 3D Vision (3DV), Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.22"},{"key":"ref_14","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, MIT Press."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. 
(2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., van der Maaten, L., and Weinberger, K.Q. (2017, January 22\u201329). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Venice, Italy.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_17","unstructured":"Arjovsky, M., Chintala, S., and Bottou, L. (2017, January 6\u201311). Wasserstein generative adversarial networks. Proceedings of the International Conference on Machine Learning, Sydney, Australia."},{"key":"ref_18","unstructured":"Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A.C. (pp. 5769\u20135779). Improved training of wasserstein gans. Advances in Neural Information Processing Systems, MIT Press."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Seki, A., and Pollefeys, M. (2016, January 19\u201322). Patch Based Confidence Prediction for Dense Disparity Map. Proceedings of the BMVC, York, UK.","DOI":"10.5244\/C.30.23"},{"key":"ref_20","unstructured":"Mirza, M., and Osindero, S. (arXiv, 2014). Conditional generative adversarial nets, arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1109\/MSP.2017.2765202","article-title":"Generative Adversarial Networks: An Overview","volume":"35","author":"Creswell","year":"2018","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Isola, P., Zhu, J.Y., Zhou, T., and Efros, A.A. (2017, January 21\u201326). Image-to-Image Translation with Conditional Adversarial Networks. 
Proceedings of the CVPR, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.632"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_24","unstructured":"Radford, A., Metz, L., and Chintala, S. (arXiv, 2015). Unsupervised representation learning with deep convolutional generative adversarial networks, arXiv."},{"key":"ref_25","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., and Isard, M. (2016, January 2\u20134). TensorFlow: A System for Large-Scale Machine Learning. Proceedings of the OSDI, Savannah, GA, USA."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Menze, M., Heipke, C., and Geiger, A. (2015, January 1\u20132). Joint 3D Estimation of Vehicles and Scene Flow. Proceedings of the ISPRS Workshop on Image Sequence Analysis (ISA), Berlin, Germany.","DOI":"10.5194\/isprsannals-II-3-W5-427-2015"},{"key":"ref_27","unstructured":"Kingma, D.P., and Ba, J. (arXiv, 2014). Adam: A method for stochastic optimization, arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"9153","DOI":"10.1007\/s11042-017-4654-5","article-title":"3D preservation of buildings\u2013reconstructing the past","volume":"77","author":"Fritsch","year":"2018","journal-title":"Multimed. Tools Appl."},{"key":"ref_29","unstructured":"Sattler, T., Brox, T., Pollefeys, M., Fisher, R.B., and Tylecek, R. (2017). 3D Reconstruction Meets Semantics\u2014Reconstruction Challenge, ICCV Workshops. Technical Report."},{"key":"ref_30","unstructured":"Sch\u00f6nberger, J.L., and Frahm, J.M. (July, January 26). Structure-From-Motion Revisited. 
Proceedings of the CVPR, Las Vegas, NV, USA."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Sch\u00f6ps, T., Sch\u00f6nberger, J.L., Galliani, S., Sattler, T., Schindler, K., Pollefeys, M., and Geiger, A. (2017, January 21\u201326). A Multi-View Stereo Benchmark with High-Resolution Images and Multi-Camera Videos. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.272"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/5\/487\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,16]],"date-time":"2024-06-16T06:26:39Z","timestamp":1718519199000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/5\/487"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,2,27]]},"references-count":32,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2019,3]]}},"alternative-id":["rs11050487"],"URL":"http:\/\/dx.doi.org\/10.3390\/rs11050487","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,2,27]]}}}
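The record above follows the standard Crossref REST API "work" message schema (fields such as message.DOI, message.title, message.author, message.published). As a minimal sketch of how such a response can be consumed, the snippet below parses a two-author excerpt of this record and builds a short citation string; the helper name format_citation is hypothetical, not part of any Crossref library.

```python
import json

# A two-author excerpt mirroring the shape of the Crossref "work" message above.
sample = json.loads("""
{
  "status": "ok",
  "message-type": "work",
  "message": {
    "DOI": "10.3390/rs11050487",
    "title": ["SDF-MAN: Semi-Supervised Disparity Fusion with Multi-Scale Adversarial Networks"],
    "container-title": ["Remote Sensing"],
    "volume": "11",
    "issue": "5",
    "page": "487",
    "published": {"date-parts": [[2019, 2, 27]]},
    "author": [
      {"given": "Can", "family": "Pu", "sequence": "first"},
      {"given": "Robert B.", "family": "Fisher", "sequence": "additional"}
    ]
  }
}
""")

def format_citation(work):
    """Build a short citation string from a Crossref 'work' response."""
    msg = work["message"]
    # Authors are listed in citation order under message.author.
    authors = ", ".join(f"{a['given']} {a['family']}" for a in msg["author"])
    # date-parts is a list of [year, month, day] lists; take the year.
    year = msg["published"]["date-parts"][0][0]
    # Crossref stores title and container-title as one-element lists.
    return (f"{authors} ({year}). {msg['title'][0]}. "
            f"{msg['container-title'][0]} {msg['volume']}({msg['issue']}), {msg['page']}.")

print(format_citation(sample))
# → Can Pu, Robert B. Fisher (2019). SDF-MAN: Semi-Supervised Disparity Fusion
#   with Multi-Scale Adversarial Networks. Remote Sensing 11(5), 487.
```

The same function would work on the full record fetched live (e.g. from the api.crossref.org link above), since only the schema shape is assumed, not the number of authors.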