Learning Effective XML Classifiers Based on Discriminatory Structures and Nested Content

Ritacco E.
Co-first author
2013-01-01

Abstract

Supervised classification aims to learn a model (or classifier) from a collection of XML documents, each marked with one of a predefined set of class labels. The learnt classifier isolates each class by the content and structural regularities observed in its labeled XML documents and can thus predict the unknown class of unlabeled XML documents from their content and structural features. Classifying unlabeled XML documents into the predefined classes is a valuable support for more effective and efficient XML search, retrieval and filtering. We discuss an approach for learning intelligible XML classifiers. XML documents are represented as transactions in a space of Boolean features that are informative of their content and structure. From this transactional representation, learning algorithms induce compact associative classifiers with superior effectiveness. A preprocessing step helps the approach scale with the size of XML corpora. © Springer-Verlag Berlin Heidelberg 2013.
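
To make the transactional representation concrete, the following is a minimal sketch of how one XML document might be turned into a set of Boolean features. The encoding is an assumption made for illustration only: each feature is a (root-to-element tag path, term) pair, and the function name `xml_to_transaction` is hypothetical. The abstract does not specify the actual feature space used in the chapter.

```python
import xml.etree.ElementTree as ET


def xml_to_transaction(xml_text):
    """Return the set of Boolean features that hold for one XML document.

    Assumed (illustrative) encoding: a feature is a (tag path, term) pair,
    where the tag path runs from the root to the element holding the term.
    A feature is "true" for a document exactly when it is in the returned set.
    """
    root = ET.fromstring(xml_text)
    features = set()

    def visit(node, path):
        path = path + "/" + node.tag
        # Only the text directly inside the element is used; tail text
        # following child elements is ignored in this sketch.
        for term in (node.text or "").lower().split():
            features.add((path, term))
        for child in node:
            visit(child, path)

    visit(root, "")
    return features


doc = "<paper><title>XML mining</title><body><sec>tree mining</sec></body></paper>"
transaction = xml_to_transaction(doc)
```

Under this assumed encoding the same term under different tag paths yields different features, so a rule learner working on such transactions can exploit content and structure jointly.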
2013
978-3-642-37185-1
978-3-642-37186-8

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/1248967