A Data-driven Multidimensional Indexing Method for Data Mining in Astrophysical Databases

ROBERTO, Vito
2005-01-01

Abstract

Large archives and digital sky surveys with sizes of the order of 10^12 bytes already exist, and in the near future they will reach sizes of the order of 10^15 bytes. Numerical simulations are also producing comparable volumes of information. Data mining tools are needed to extract information from such large datasets. In this work we propose a multidimensional indexing method, based on a static R-tree data structure, to efficiently query and mine large astrophysical datasets. We follow a top-down construction method, called VAMSplit, which recursively splits the dataset on a near-median element along the dimension with maximum variance. The resulting index partitions the dataset into non-overlapping bounding boxes whose volumes depend on the local data density. Finally, we show an application of this method to the detection of point sources in a gamma-ray photon list.
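
The recursive split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `build_index`, `leaves`, `box_volume` and the `leaf_size` parameter are hypothetical, the cut is placed at the exact median position rather than at the near-median position VAMSplit uses, and the toy photon list is invented purely for illustration.

```python
from statistics import pvariance


def build_index(points, leaf_size=4):
    """Recursively partition `points` (a list of equal-length tuples).

    Assumes leaf_size >= 1. Returns a nested dict; each leaf stores its
    points and the corners (lo, hi) of their bounding box.
    """
    dims = range(len(points[0]))
    if len(points) <= leaf_size:
        lo = tuple(min(p[d] for p in points) for d in dims)
        hi = tuple(max(p[d] for p in points) for d in dims)
        return {"leaf": True, "points": points, "lo": lo, "hi": hi}

    # Choose the dimension along which the data have maximum variance.
    split_dim = max(dims, key=lambda d: pvariance([p[d] for p in points]))

    # Sort along that dimension and cut at the median position.
    # Simplification (assumption): VAMSplit proper uses a near-median cut.
    ordered = sorted(points, key=lambda p: p[split_dim])
    mid = len(ordered) // 2
    return {
        "leaf": False,
        "dim": split_dim,
        "left": build_index(ordered[:mid], leaf_size),
        "right": build_index(ordered[mid:], leaf_size),
    }


def leaves(node):
    """Yield every leaf of the tree, left to right."""
    if node["leaf"]:
        yield node
    else:
        yield from leaves(node["left"])
        yield from leaves(node["right"])


def box_volume(leaf):
    """Volume (area in 2-D) of a leaf's bounding box."""
    volume = 1.0
    for low, high in zip(leaf["lo"], leaf["hi"]):
        volume *= high - low
    return volume


# Invented 2-D "photon list": four scattered photons plus a tight clump
# near x = 10 that stands in for a point source.
photons = [(0.0, 0.0), (1.0, 3.0), (2.0, 1.0), (3.0, 2.0),
           (10.0, 0.5), (10.1, 0.6), (10.2, 0.5), (10.1, 0.4)]
tree = build_index(photons, leaf_size=4)

# The leaf with the smallest bounding box marks the densest region.
densest = min(leaves(tree), key=box_volume)
```

Because every leaf holds roughly the same number of points, the leaf with the smallest box is the one around the clump near x = 10. Ranking leaves by box volume is only one plausible way to flag candidate point sources; the abstract does not state which detection criterion the authors use.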

Use this identifier to cite or link to this document: https://hdl.handle.net/11390/690626