Coarse-to-Fine Document Image Registration for Dewarping
Document Analysis and Recognition - ICDAR 2024, Pt. IV (2024)
Abstract
Document dewarping has made great progress in recent years; however, it usually requires a huge number of document pairs with pixel-level annotation to learn a mapping function. Although photographed document images are easy to obtain, annotating the pixel-level correspondence between warped and flat images is time-consuming and almost impossible for large-scale datasets. To overcome this issue, we propose to register photographed documents with their corresponding flat counterparts, automatically obtaining pixel-level mapping labels. Because real photographed documents exhibit severe deformation, we introduce a coarse-to-fine registration pipeline that learns the global-scale transformation and the alignment of local details, respectively. In addition, the lack of registration labels motivates us to tailor a teacher-student dual branch under semi-supervised training, where the model is initialized on labeled synthetic documents. Furthermore, we contribute a large-scale dataset containing 12,500 triplets of synthetic-real-flat documents. Extensive experiments demonstrate the effectiveness of our proposed registration method. Specifically, a dewarping model trained on our registered pixel-level documents achieves performance comparable to state-of-the-art methods trained on almost 100x as many samples, showing the high quality of our registration results. Our dataset and code are available at https://github.com/hanquansanren/DIRD.
Keywords
Document Dewarping, Document Registration, Coarse-to-Fine, Semi-supervised Learning
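
To make the coarse-to-fine idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the coarse stage can be summarized as a global 2x3 affine transform and the fine stage as a per-pixel residual displacement field. The abstract specifies neither parameterization, and every function name here is hypothetical. The two stages are composed into one pixel-level mapping, which is the kind of label a dewarping model could be trained on.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def coarse_map(x: float, y: float, affine: Sequence[Sequence[float]]) -> Point:
    """Coarse stage (assumed form): apply a global 2x3 affine transform."""
    (a, b, tx), (c, d, ty) = affine
    return (a * x + b * y + tx, c * x + d * y + ty)


def coarse_to_fine_mapping(
    height: int,
    width: int,
    affine: Sequence[Sequence[float]],
    residual_flow: Sequence[Sequence[Point]],
) -> List[List[Point]]:
    """Compose the global transform with a per-pixel residual displacement.

    mapping[y][x] is the (x, y) source coordinate that output pixel (x, y)
    should be sampled from.
    """
    mapping = []
    for y in range(height):
        row = []
        for x in range(width):
            cx, cy = coarse_map(x, y, affine)   # global alignment
            dx, dy = residual_flow[y][x]        # local refinement
            row.append((cx + dx, cy + dy))
        mapping.append(row)
    return mapping


def sample_nearest(image: Sequence[Sequence[int]],
                   mapping: Sequence[Sequence[Point]]) -> List[List[int]]:
    """Backward-warp an image with nearest-neighbour lookup, clamped to bounds."""
    h, w = len(image), len(image[0])
    out = []
    for row in mapping:
        out_row = []
        for mx, my in row:
            sx = min(max(int(round(mx)), 0), w - 1)
            sy = min(max(int(round(my)), 0), h - 1)
            out_row.append(image[sy][sx])
        out.append(out_row)
    return out
```

In this sketch the residual field only has to express what the global transform cannot, which is the usual motivation for splitting registration into a coarse and a fine stage. A learned pipeline would predict both components with networks rather than take them as inputs.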