Over the years, the humanities community has increasingly requested the creation of artificial intelligence frameworks to help the study of cultural heritage. Document Layout segmentation, which aims at identifying the different structural components of a document page, is a particularly interesting task connected to this trend, specifically when it comes to handwritten texts. While there are many effective approaches to this problem, they all rely on large amounts of data for the training of the underlying models, which is rarely possible in a real-world scenario, as the process of producing the ground truth segmentation task with the required precision to the pixel level is a very time-consuming task and often requires a certain degree of domain knowledge regarding the documents at hand. For this reason, in this paper, we propose an effective few-shot learning framework for document layout segmentation relying on two novel components, namely a dynamic instance generation and a segmentation refinement module. This approach is able of achieving performances comparable to the current state of the art on the popular Diva-HisDB dataset, while relying on just a fraction of the available data.
Few-Shot Pixel-Precise Document Layout Segmentation via Dynamic Instance Generation and Local Thresholding
Nardin, Axel De;Zottin, Silvia;Piciarelli, Claudio;Colombi, Emanuela;Foresti, Gian Luca
2023-01-01
Abstract
Over the years, the humanities community has increasingly requested the creation of artificial intelligence frameworks to help the study of cultural heritage. Document Layout segmentation, which aims at identifying the different structural components of a document page, is a particularly interesting task connected to this trend, specifically when it comes to handwritten texts. While there are many effective approaches to this problem, they all rely on large amounts of data for the training of the underlying models, which is rarely possible in a real-world scenario, as the process of producing the ground truth segmentation task with the required precision to the pixel level is a very time-consuming task and often requires a certain degree of domain knowledge regarding the documents at hand. For this reason, in this paper, we propose an effective few-shot learning framework for document layout segmentation relying on two novel components, namely a dynamic instance generation and a segmentation refinement module. This approach is able of achieving performances comparable to the current state of the art on the popular Diva-HisDB dataset, while relying on just a fraction of the available data.File | Dimensione | Formato | |
---|---|---|---|
FewShot_layout_analyisis_IJNS revised.pdf
Open Access dal 22/07/2024
Descrizione: post-print
Tipologia:
Documento in Post-print
Licenza:
Creative commons
Dimensione
699.49 kB
Formato
Adobe PDF
|
699.49 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.