ICDAR 2024 Competition on Few-Shot and Many-Shot Layout Segmentation of Ancient Manuscripts (SAM)
Zottin S.; De Nardin A.; Foresti G. L.; Colombi E.; Piciarelli C.
2024-01-01
Abstract
Layout analysis is a critical aspect of Document Image Analysis, particularly for ancient manuscripts. It is a foundational step that streamlines subsequent tasks such as optical character recognition and automated transcription. A key challenge in this context is the lack of available ground truth, which is extremely time-consuming to produce. Despite this, many existing approaches rely heavily on a fully supervised learning paradigm, a scenario that is rare in real-world settings. For this reason, this competition proposes addressing the task with a few-shot learning approach that uses only three images for training. The competition dataset, U-DIADS-Bib, comprises four distinct ancient manuscripts with heterogeneous layout structures, levels of degradation, and languages. This diversity adds interest and complexity to the challenge. Participants were also allowed to compete with traditional many-shot learning approaches, for which the whole training set of U-DIADS-Bib was made available.