ICDAR 2024 Competition on Few-Shot and Many-Shot Layout Segmentation of Ancient Manuscripts (SAM)

Zottin S.; De Nardin A.; Foresti G. L.; Colombi E.; Piciarelli C.
2024-01-01

Abstract

Layout analysis is a critical aspect of Document Image Analysis, particularly for ancient manuscripts. It is a foundational step that streamlines subsequent tasks such as optical character recognition and automated transcription. A key challenge in this context, however, is the lack of available ground truth, which is extremely time-consuming to produce. Despite this, many approaches to the task rely heavily on a fully supervised learning paradigm, which presupposes an abundance of annotated data that is rarely available in real-world settings. For this reason, this competition poses the challenge of addressing layout segmentation with a few-shot learning approach that uses only three images for training. The competition dataset, U-DIADS-Bib, comprises four distinct ancient manuscripts with heterogeneous layout structures, levels of degradation, and languages. This diversity adds to the complexity of the challenge. In addition, we also allowed participation with traditional many-shot learning approaches, for which the whole training set of U-DIADS-Bib was made available.
Year: 2024
ISBN: 9783031705519; 9783031705526

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1292774