iBet uBet web content aggregator. Adding the entire web to your favor.

Link to original content: https://api.crossref.org/works/10.1145/3627818
{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,10,31]],"date-time":"2024-10-31T04:34:09Z","timestamp":1730349249911,"version":"3.28.0"},"reference-count":67,"publisher":"Association for Computing Machinery (ACM)","issue":"1","funder":[{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["U22B2034, U21A20515, 62172416, 62172415, 62102418"],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004739","name":"Youth Innovation Promotion Association of the Chinese Academy of Sciences","doi-asserted-by":"crossref","award":["2022131"],"id":[{"id":"10.13039\/501100004739","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2024,2,29]]},"abstract":"Single image rectification of document deformation is a challenging task. Although some recent deep learning-based methods have attempted to solve this problem, they cannot achieve satisfactory results when dealing with document images with complex deformations. In this article, we propose a new efficient framework for document flattening. Our main insight is that most layout primitives in a document have rectangular outline shapes, making unwarping local layout primitives essentially homogeneous with unwarping the entire document. The former task is clearly more straightforward to solve than the latter due to the more consistent texture and relatively smooth deformation. On this basis, we propose a layout-aware deep model working in a divide-and-conquer manner. First, we employ a transformer-based segmentation module to obtain the layout information of the input document. Then a new regression module is applied to predict the global and local UV maps. 
Finally, we design an effective merging algorithm to correct the global prediction with local details. Both quantitative and qualitative experimental results demonstrate that our framework achieves favorable performance against state-of-the-art methods. In addition, the current publicly available document flattening datasets have limited 3D paper shapes without layout annotation and also lack a general geometric correction metric. Therefore, we build a new large-scale synthetic dataset by utilizing a fully automatic rendering method to generate deformed documents with diverse shapes and exact layout segmentation labels. We also propose a new geometric correction metric based on our paired document UV maps. Code and dataset will be released at https:\/\/github.com\/BunnySoCrazy\/LA-DocFlatten.","DOI":"10.1145\/3627818","type":"journal-article","created":{"date-parts":[[2023,10,13]],"date-time":"2023-10-13T15:26:59Z","timestamp":1697210819000},"page":"1-17","update-policy":"http:\/\/dx.doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Layout-aware Single-image Document Flattening"],"prefix":"10.1145","volume":"43","author":[{"ORCID":"http:\/\/orcid.org\/0009-0007-1060-689X","authenticated-orcid":false,"given":"Pu","family":"Li","sequence":"first","affiliation":[{"name":"MAIS, Institute of Automation, CAS and School of Artificial Intelligence, UCAS, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-0892-581X","authenticated-orcid":false,"given":"Weize","family":"Quan","sequence":"additional","affiliation":[{"name":"MAIS, Institute of Automation, CAS and School of Artificial Intelligence, UCAS, China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0002-3376-1725","authenticated-orcid":false,"given":"Jianwei","family":"Guo","sequence":"additional","affiliation":[{"name":"MAIS, Institute of Automation, CAS and School of Artificial Intelligence, UCAS, 
China"}]},{"ORCID":"http:\/\/orcid.org\/0000-0003-2209-2404","authenticated-orcid":false,"given":"Dong-Ming","family":"Yan","sequence":"additional","affiliation":[{"name":"MAIS, Institute of Automation, CAS and School of Artificial Intelligence, UCAS, China"}]}],"member":"320","published-online":{"date-parts":[[2023,11,2]]},"reference":[{"key":"e_1_3_3_2_1","first-page":"3751","volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201917)","author":"Islam Md Amirul","year":"2017","unstructured":"Md Amirul Islam, Mrigank Rochan, Neil D. B. Bruce, and Yang Wang. 2017. Gated feedback refinement network for dense image labeling. In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201917). 3751\u20133759."},{"issue":"6","key":"e_1_3_3_3_1","article-title":"Document layout analysis: A comprehensive survey","volume":"52","author":"Binmakhashen Galal M.","year":"2019","unstructured":"Galal M. Binmakhashen and Sabri A. Mahmoud. 2019. Document layout analysis: A comprehensive survey. ACM Comput. Surv. 52, 6 (2019).","journal-title":"ACM Comput. Surv."},{"key":"e_1_3_3_4_1","first-page":"1173","volume-title":"International Conference on Computer Vision Workshop.","author":"Oliveira D\u00e1rio Augusto Borges","year":"2017","unstructured":"D\u00e1rio Augusto Borges Oliveira and Matheus Palhares Viana. 2017. Fast CNN-based document layout analysis. In International Conference on Computer Vision Workshop. 1173\u20131180."},{"key":"e_1_3_3_5_1","first-page":"367","volume-title":"IEEE International Conference on Computer Vision (ICCV\u201901)","volume":"2","author":"Brown Michael S.","year":"2001","unstructured":"Michael S. Brown and W. Brent Seales. 2001. Document restoration using 3D shape: A general deskewing algorithm for arbitrarily warped documents. In IEEE International Conference on Computer Vision (ICCV\u201901), Vol. 2. 
367\u2013374."},{"key":"e_1_3_3_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2007.1118"},{"key":"e_1_3_3_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2006.871082"},{"key":"e_1_3_3_8_1","first-page":"33","volume-title":"IEEE Conference on Computer Robot Vision.","author":"Burden Alexander","year":"2019","unstructured":"Alexander Burden, Melissa Cote, and Alexandra Branzan Albu. 2019. Rectification of camera-captured document images with mixed contents and varied layouts. In IEEE Conference on Computer Robot Vision. 33\u201340."},{"key":"e_1_3_3_9_1","first-page":"228","volume-title":"IEEE International Conference on Computer Vision (ICCV\u201903)","volume":"1","author":"Cao Huaigu","year":"2003","unstructured":"Huaigu Cao, Xiaoqing Ding, and Changsong Liu. 2003. A cylindrical surface model to rectify the bound document image. In IEEE International Conference on Computer Vision (ICCV\u201903), Vol. 1. 228\u2013233."},{"issue":"1","key":"e_1_3_3_10_1","first-page":"1","article-title":"Fused behavior recognition model based on attention mechanism","volume":"3","author":"Chen Lei","year":"2020","unstructured":"Lei Chen, Rui Liu, Dongsheng Zhou, Xin Yang, and Qiang Zhang. 2020. Fused behavior recognition model based on attention mechanism. Vis. Comput. Industr., Biomed. Art 3, 1 (2020), 1\u201310.","journal-title":"Vis. Comput. Industr., Biomed. Art"},{"key":"e_1_3_3_11_1","first-page":"301","article-title":"Shape from shading for the digitization of curved documents","volume":"18","author":"Courteille Fr\u00e9d\u00e9ric","year":"2007","unstructured":"Fr\u00e9d\u00e9ric Courteille, Alain Crouzil, Jean-Denis Durou, and Pierre Gurdjos. 2007. Shape from shading for the digitization of curved documents. Pattern Recog. 
18 (2007), 301\u2013316.","journal-title":"Pattern Recog."},{"key":"e_1_3_3_12_1","first-page":"131","volume-title":"IEEE International Conference on Computer Vision (ICCV\u201919)","author":"Das Sagnik","year":"2019","unstructured":"Sagnik Das, Ke Ma, Zhixin Shu, Dimitris Samaras, and Roy Shilkrot. 2019. DewarpNet: Single-image document unwarping with stacked 3D and 2D regression networks. In IEEE International Conference on Computer Vision (ICCV\u201919). 131\u2013140."},{"key":"e_1_3_3_13_1","first-page":"125","volume-title":"ACM Symposium on Document Engineering.","author":"Das Sagnik","year":"2017","unstructured":"Sagnik Das, Gaurav Mishra, Akshay Sudharshana, and Roy Shilkrot. 2017. The common fold: Utilizing the four-fold to dewarp printed documents from a single image. In ACM Symposium on Document Engineering. 125\u2013128."},{"key":"e_1_3_3_14_1","first-page":"4268","volume-title":"IEEE\/CVF International Conference on Computer Vision","author":"Das Sagnik","year":"2021","unstructured":"Sagnik Das, Kunwar Yashraj Singh, Jon Wu, Erhan Bas, Vijay Mahadevan, Rahul Bhotika, and Dimitris Samaras. 2021. End-to-end piece-wise unwarping of document images. In IEEE\/CVF International Conference on Computer Vision. 4268\u20134277."},{"key":"e_1_3_3_15_1","article-title":"Multistage curvilinear coordinate transform based document image dewarping using a novel quality estimator","volume":"2003","author":"Dasgupta Tanmoy","year":"2020","unstructured":"Tanmoy Dasgupta, Nibaran Das, and Mita Nasipuri. 2020. Multistage curvilinear coordinate transform based document image dewarping using a novel quality estimator. CoRR abs\/2003.06872 (2020).","journal-title":"CoRR"},{"key":"e_1_3_3_16_1","first-page":"5936","volume-title":"International Conference on Pattern Recognition.","author":"Davoudi Homa","year":"2021","unstructured":"Homa Davoudi, Marco Fiorucci, and Arianna Traviglia. 2021. Ancient document layout analysis: Autoencoders meet sparse coding. 
In International Conference on Pattern Recognition. 5936\u20135942."},{"key":"e_1_3_3_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2019.2962685"},{"key":"e_1_3_3_18_1","article-title":"An image is worth 16x16 words: Transformers for image recognition at scale","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations.","journal-title":"International Conference on Learning Representations."},{"key":"e_1_3_3_19_1","first-page":"1226","volume-title":"IAPR International Conference on Document Analysis and Recognition.","author":"Fawzi Mohamed","year":"2015","unstructured":"Mohamed Fawzi, Mohsen. A. Rashwan, Hany Ahmed, Shaimaa Samir, Sherif M. Abdou, Hassanin M. Al-Barhamtoshy, and Kamal M. Jambi. 2015. Rectification of camera captured document images for camera-based OCR technology. In IAPR International Conference on Document Analysis and Recognition. 1226\u20131230."},{"key":"e_1_3_3_20_1","doi-asserted-by":"crossref","unstructured":"Hao Feng Shaokai Liu Jiajun Deng Wengang Zhou and Houqiang Li. 2023. Deep unrestricted document image rectification. arXiv preprint arXiv:2304.08796.","DOI":"10.1109\/TMM.2023.3347094"},{"key":"e_1_3_3_21_1","first-page":"273","volume-title":"ACM International Conference on Multimedia","author":"Feng Hao","year":"2021","unstructured":"Hao Feng, Yuechen Wang, Wengang Zhou, Jiajun Deng, and Houqiang Li. 2021a. DocTr: Document image transformer for geometric unwarping and illumination correction. In ACM International Conference on Multimedia. 
273\u2013281."},{"key":"e_1_3_3_22_1","article-title":"DocScanner: Robust document image rectification with progressive learning","volume":"2110","author":"Feng Hao","year":"2021","unstructured":"Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, and Houqiang Li. 2021b. DocScanner: Robust document image rectification with progressive learning. CoRR abs\/2110.14968 (2021).","journal-title":"CoRR"},{"volume-title":"European Conference on Computer Vision","year":"2022","author":"Feng Hao","key":"e_1_3_3_23_1","unstructured":"Hao Feng, Wengang Zhou, Jiajun Deng, Yuechen Wang, and Houqiang Li. 2022. Geometric representation learning for document image rectification. In European Conference on Computer Vision."},{"issue":"6","key":"e_1_3_3_24_1","article-title":"Learning to predict indoor illumination from a single image","volume":"36","author":"Gardner Marc-Andr\u00e9","year":"2017","unstructured":"Marc-Andr\u00e9 Gardner, Kalyan Sunkavalli, Ersin Yumer, Xiaohui Shen, Emiliano Gambaretto, Christian Gagn\u00e9, and Jean-Fran\u00e7ois Lalonde. 2017. Learning to predict indoor illumination from a single image. ACM Trans. Graph. 36, 6 (2017).","journal-title":"ACM Trans. Graph."},{"key":"e_1_3_3_25_1","first-page":"254","volume-title":"IAPR International Conference on Document Analysis and Recognition","volume":"01","author":"He Dafang","year":"2017","unstructured":"Dafang He, Scott Cohen, Brian Price, Daniel Kifer, and C. Lee Giles. 2017. Multi-scale multi-task FCN for semantic page segmentation and table detection. In IAPR International Conference on Document Analysis and Recognition, Vol. 01. 254\u2013261."},{"key":"e_1_3_3_26_1","first-page":"403","volume-title":"IAPR International Conference on Document Analysis and Recognition.","author":"He Yuan","year":"2013","unstructured":"Yuan He, Pan Pan, Shufu Xie, Jun Sun, and Satoshi Naoi. 2013. A book dewarping system by boundary-based 3D surface reconstruction. 
In IAPR International Conference on Document Analysis and Recognition. 403\u2013407."},{"key":"e_1_3_3_27_1","first-page":"4543","volume-title":"IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Jiang Xiangwei","year":"2022","unstructured":"Xiangwei Jiang, Rujiao Long, Nan Xue, Zhibo Yang, Cong Yao, and Gui-Song Xia. 2022. Revisiting document image dewarping by grid regularization. In IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4543\u20134552."},{"key":"e_1_3_3_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2015.04.026"},{"key":"e_1_3_3_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1360612.1360649"},{"volume-title":"International Conference on Learning Representations.","year":"2015","author":"Kingma Diederik P.","key":"e_1_3_3_30_1","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In International Conference on Learning Representations."},{"key":"e_1_3_3_31_1","first-page":"421","volume-title":"European Conference on Computer Vision (ECCV\u201910)","author":"Koo Hyung Il","year":"2010","unstructured":"Hyung Il Koo and Nam Ik Cho. 2010. State estimation in a document image and its application in text block identification and text line extraction. In European Conference on Computer Vision (ECCV\u201910). 421\u2013434."},{"key":"e_1_3_3_32_1","first-page":"748","volume-title":"IEEE International Conference on Image Processing","volume":"3","author":"Lavialle Olivier","year":"2001","unstructured":"Olivier Lavialle, X. Molines, Franck Angella, and Pierre Baylou. 2001. Active contours network to straighten distorted text lines. In IEEE International Conference on Image Processing, Vol. 3. 748\u2013751."},{"issue":"6","key":"e_1_3_3_33_1","article-title":"Document rectification and illumination correction using a patch-based CNN","volume":"38","author":"Li Xiaoyu","year":"2019","unstructured":"Xiaoyu Li, Bo Zhang, Jing Liao, and Pedro V. Sander. 2019. 
Document rectification and illumination correction using a patch-based CNN. ACM Trans. Graph. 38, 6 (2019).","journal-title":"ACM Trans. Graph."},{"key":"e_1_3_3_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2007.70724"},{"key":"e_1_3_3_35_1","first-page":"1925","volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201917)","author":"Lin Guosheng","year":"2017","unstructured":"Guosheng Lin, Anton Milan, Chunhua Shen, and Ian Reid. 2017b. RefineNet: Multi-path refinement networks for high-resolution semantic segmentation. In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201917). 1925\u20131934."},{"key":"e_1_3_3_36_1","first-page":"2117","volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201917)","author":"Lin Tsung-Yi","year":"2017","unstructured":"Tsung-Yi Lin, Piotr Doll\u00e1r, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. 2017a. Feature pyramid networks for object detection. In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201917). 2117\u20132125."},{"issue":"5","key":"e_1_3_3_37_1","doi-asserted-by":"crossref","first-page":"978","DOI":"10.1109\/TPAMI.2010.147","article-title":"Sift flow: Dense correspondence across scenes and its applications","volume":"33","author":"Liu Ce","year":"2010","unstructured":"Ce Liu, Jenny Yuen, and Antonio Torralba. 2010. Sift flow: Dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33, 5 (2010), 978\u2013994.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_3_3_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2020.107576"},{"key":"e_1_3_3_39_1","first-page":"3431","volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201915)","author":"Long Jonathan","year":"2015","unstructured":"Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. 
In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201915). 3431\u20133440."},{"key":"e_1_3_3_40_1","first-page":"1","volume-title":"ACM SIGGRAPH Conference Proceedings","author":"Ma Ke","year":"2022","unstructured":"Ke Ma, Sagnik Das, Zhixin Shu, and Dimitris Samaras. 2022. Learning from documents in the wild to improve document unwarping. In ACM SIGGRAPH Conference Proceedings. 1\u20139."},{"volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201918)","year":"2018","author":"Ma Ke","key":"e_1_3_3_41_1","unstructured":"Ke Ma, Zhixin Shu, Xue Bai, Jue Wang, and Dimitris Samaras. 2018. DocUNet: Document image unwarping via a stacked u-net. In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201918)."},{"key":"e_1_3_3_42_1","first-page":"208","volume-title":"European Conference on Computer Vision (ECCV\u201920)","author":"Markovitz Amir","year":"2020","unstructured":"Amir Markovitz, Inbal Lavi, Or Perel, Shai Mazor, and Roee Litman. 2020. Can you read me now? Content aware rectification using angle supervision. In European Conference on Computer Vision (ECCV\u201920). 208\u2013223."},{"key":"e_1_3_3_43_1","first-page":"180","volume-title":"European Conference on Computer Vision (ECCV\u201918)","author":"Meng Gaofeng","year":"2018","unstructured":"Gaofeng Meng, Yuanqi Su, Ying Wu, Shiming Xiang, and Chunhong Pan. 2018. Exploiting vector fields for geometric rectification of distorted document images. In European Conference on Computer Vision (ECCV\u201918). 180\u2013195."},{"key":"e_1_3_3_44_1","doi-asserted-by":"crossref","first-page":"3890","DOI":"10.1109\/CVPR.2014.497","volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201914)","author":"Meng Gaofeng","year":"2014","unstructured":"Gaofeng Meng, Ying Wang, Shenquan Qu, Shiming Xiang, and Chunhong Pan. 2014. Active flattening of curved document images via two structured beams. 
In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201914). 3890\u20133897."},{"key":"e_1_3_3_45_1","first-page":"1068","volume-title":"International Conference on Image Analysis Processing.","author":"Mischke Lothar","year":"2005","unstructured":"Lothar Mischke and Wolfram Luther. 2005. Document image de-warping based on detection of distorted text lines. In International Conference on Image Analysis Processing. 1068\u20131075."},{"issue":"6","key":"e_1_3_3_46_1","article-title":"Scalable fluid simulation using anisotropic turbulence particles","volume":"29","author":"Pfaff Tobias","year":"2010","unstructured":"Tobias Pfaff, Nils Thuerey, Jonathan Cohen, Sarah Tariq, and Markus Gross. 2010. Scalable fluid simulation using anisotropic turbulence particles. ACM Trans. Graph. 29, 6 (2010).","journal-title":"ACM Trans. Graph."},{"key":"e_1_3_3_47_1","first-page":"12179","volume-title":"IEEE International Conference on Computer Vision (ICCV\u201921)","author":"Ranftl Ren\u00e9","year":"2021","unstructured":"Ren\u00e9 Ranftl, Alexey Bochkovskiy, and Vladlen Koltun. 2021. Vision transformers for dense prediction. In IEEE International Conference on Computer Vision (ICCV\u201921). 12179\u201312188."},{"key":"e_1_3_3_48_1","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1109\/WACV.2015.106","volume-title":"IEEE Winter Conference on on Applications of Computer Vision.","author":"Salvi Dhaval","year":"2015","unstructured":"Dhaval Salvi, Kang Zheng, Youjie Zhou, and Song Wang. 2015. Distance transform based active contour approach for document image rectification. In IEEE Winter Conference on on Applications of Computer Vision. 
757\u2013764."},{"key":"e_1_3_3_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2010.2080280"},{"key":"e_1_3_3_50_1","first-page":"1117","volume-title":"IEEE International Conference on Computer Vision (ICCV\u201905)","volume":"2","author":"Sun Mingxuan","year":"2005","unstructured":"Mingxuan Sun, Ruigang Yang, Lin Yun, G. Landon, W. Brent Seales, and Michael S. Brown. 2005. Geometric and photometric restoration of distorted documents. In IEEE International Conference on Computer Vision (ICCV\u201905), Vol. 2. 1117\u20131123."},{"key":"e_1_3_3_51_1","first-page":"27","volume-title":"IAPR International Conference on Document Analysis and Recognition","volume":"06","author":"Takezawa Yusuke","year":"2017","unstructured":"Yusuke Takezawa, Makoto Hasegawa, and Salvatore Tabbone. 2017. Robust perspective rectification of camera-captured document images. In IAPR International Conference on Document Analysis and Recognition, Vol. 06. 27\u201332."},{"key":"e_1_3_3_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2006.40"},{"key":"e_1_3_3_53_1","first-page":"377","volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201911)","author":"Tian Yuandong","year":"2011","unstructured":"Yuandong Tian and Srinivasa G. Narasimhan. 2011. Rectification and 3D reconstruction of curved document images. In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201911). 377\u2013384."},{"key":"e_1_3_3_54_1","first-page":"1","volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201907)","author":"Tsoi Yau-Chat","year":"2007","unstructured":"Yau-Chat Tsoi and Michael S. Brown. 2007. Multi-view document rectification using boundary. In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201907). 
1\u20138."},{"key":"e_1_3_3_55_1","first-page":"1001\u2013 1005","volume-title":"IAPR International Conference on Document Analysis and Recognition","volume":"2","author":"Ulges Adrian","year":"2005","unstructured":"Adrian Ulges, Christoph H. Lampert, and Thomas M. Breuel. 2005. Document image dewarping using robust estimation of curled text lines. In IAPR International Conference on Document Analysis and Recognition, Vol. 2. 1001\u2013 1005."},{"key":"e_1_3_3_56_1","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan N. Gomez \u0141ukasz Kaiser and Illia Polosukhin. 2017. Attention is All You Need. In International Conference on Neural Information Processing Systems (NIPS\u201917) Long Beach California 6000\u20136010."},{"key":"e_1_3_3_57_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007906904009"},{"key":"e_1_3_3_58_1","first-page":"1398","volume-title":"37th Asilomar Conference on Signals, Systems & Computers","volume":"2","author":"Wang Zhou","year":"2003","unstructured":"Zhou Wang, Eero P. Simoncelli, and Alan C. Bovik. 2003. Multiscale structural similarity for image quality assessment. In 37th Asilomar Conference on Signals, Systems & Computers, Vol. 2. IEEE, 1398\u20131402."},{"key":"e_1_3_3_59_1","first-page":"1","volume-title":"International Conference on Multimedia and Expo","author":"Wu Xingjiao","year":"2021","unstructured":"Xingjiao Wu, Ziling Hu, Xiangcheng Du, Jing Yang, and Liang He. 2021. Document layout analysis via dynamic residual feature fusion. In International Conference on Multimedia and Expo. 1\u20136."},{"key":"e_1_3_3_60_1","first-page":"311","volume-title":"IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201918)","author":"Xian Ke","year":"2018","unstructured":"Ke Xian, Chunhua Shen, Zhiguo Cao, Hao Lu, Yang Xiao, Ruibo Li, and Zhenbo Luo. 2018. Monocular relative depth perception with web stereo data supervision. 
In IEEE Computer Vision and Pattern Recognition Conference (CVPR\u201918). 311\u2013320."},{"key":"e_1_3_3_61_1","article-title":"Dewarping document image by displacement flow estimation with fully convolutional network","volume":"2104","author":"Xie Guo-Wang","year":"2021","unstructured":"Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, and Cheng-Lin Liu. 2021a. Dewarping document image by displacement flow estimation with fully convolutional network. CoRR abs\/2104.06815 (2021).","journal-title":"CoRR"},{"key":"e_1_3_3_62_1","first-page":"466","volume-title":"International Conference on Document Analysis and Recognition","author":"Xie Guo-Wang","year":"2021","unstructured":"Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, and Cheng-Lin Liu. 2021b. Document dewarping with control points. In International Conference on Document Analysis and Recognition. Springer, 466\u2013480."},{"key":"e_1_3_3_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2675980"},{"key":"e_1_3_3_64_1","first-page":"129","volume-title":"International Conference on Image Analysis and Processing.","author":"Zandifar Ali","year":"2007","unstructured":"Ali Zandifar. 2007. Unwarping scanned image of Japanese\/English documents. In International Conference on Image Analysis and Processing. 129\u2013136."},{"key":"e_1_3_3_65_1","article-title":"Marior: Margin removal and iterative content rectification for document dewarping in the wild","author":"Zhang Jiaxin","year":"2022","unstructured":"Jiaxin Zhang, Canjie Luo, Lianwen Jin, Fengjun Guo, and Kai Ding. 2022. Marior: Margin removal and iterative content rectification for document dewarping in the wild. arXiv preprint arXiv:2207.11515 (2022).","journal-title":"arXiv preprint arXiv:2207.11515"},{"key":"e_1_3_3_66_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2009.03.025"},{"key":"e_1_3_3_67_1","doi-asserted-by":"crossref","unstructured":"Li Zhang Yu Zhang and Chew Tan. 2008. 
An improved physically-based method for geometric restoration of distorted document images. IEEE Trans. Anal. Mach. Intell. 30 4 (2008) 728\u2013734.","DOI":"10.1109\/TPAMI.2007.70831"},{"key":"e_1_3_3_68_1","first-page":"1015","volume-title":"IAPR International Conference on Document Analysis and Recognition.","author":"Zhong Xu","year":"2019","unstructured":"Xu Zhong, Jianbin Tang, and Antonio Jimeno Yepes. 2019. PubLayNet: Largest dataset ever for document layout analysis. In IAPR International Conference on Document Analysis and Recognition. 1015\u20131022."}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3627818","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,30]],"date-time":"2024-10-30T20:22:11Z","timestamp":1730319731000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3627818"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,2]]},"references-count":67,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,2,29]]}},"alternative-id":["10.1145\/3627818"],"URL":"http:\/\/dx.doi.org\/10.1145\/3627818","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"type":"print","value":"0730-0301"},{"type":"electronic","value":"1557-7368"}],"subject":[],"published":{"date-parts":[[2023,11,2]]},"assertion":[{"value":"2022-09-08","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-09-25","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-11-02","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}