Normalized to: Roberto, V.
[1]
oai:arXiv.org:astro-ph/0503543 [pdf] - 1233514
Data Mining in Gamma Astrophysics Experiments
Submitted: 2005-03-24
Data mining techniques, including clustering and classification tasks, for
the automatic information extraction from large datasets are increasingly
demanded in several scientific fields. In particular, in the astrophysical
field, large archives and digital sky surveys with sizes of the order of 10^12
bytes currently exist, while in the near future they will reach sizes of the
order of 10^15 bytes. In this work we propose a multidimensional indexing method to
efficiently query and mine large astrophysical datasets. A novelty detection
algorithm, based on Support Vector Clustering and using density and
neighborhood information stored in the index structure, is proposed to find
regions of interest in data characterized by isotropic noise. We show an
application of this method for the detection of point sources from a gamma-ray
photon list.
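
As a rough illustration of the idea of using density information stored in an index to find regions of interest against isotropic noise, the sketch below bins a synthetic photon list into a uniform grid and flags cells whose counts lie well above the uniform expectation. This is a deliberately simplified stand-in, not the Support Vector Clustering method of the paper: the function names, the grid index, the 5-sigma threshold and the simulated data are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def build_grid_index(photons, cell_size):
    """Map each grid cell (ix, iy) to the list of photons that fall in it."""
    index = defaultdict(list)
    for x, y in photons:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        index[cell].append((x, y))
    return index


def candidate_cells(index, n_cells, n_sigma=5.0):
    """Return cells whose photon count exceeds the isotropic expectation.

    The background is assumed uniform, so the expected count per cell is
    total / n_cells; the threshold uses a Gaussian approximation to the
    Poisson fluctuation (hypothetical choice for this sketch).
    """
    total = sum(len(pts) for pts in index.values())
    mean = total / n_cells
    threshold = mean + n_sigma * math.sqrt(mean)
    return sorted(cell for cell, pts in index.items() if len(pts) > threshold)


# Synthetic photon list: isotropic background plus one compact source
# injected near (3.5, 7.5). All values are invented for illustration.
random.seed(0)
background = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(2000)]
source = [(random.gauss(3.5, 0.1), random.gauss(7.5, 0.1)) for _ in range(150)]

index = build_grid_index(background + source, cell_size=1.0)
hits = candidate_cells(index, n_cells=100)
print(hits)
```

With these settings the cell containing the injected source should stand out clearly from the background. A real analysis would need a finer index, an instrument response model and a proper significance estimate.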
[2]
oai:arXiv.org:cs/0402016 [pdf] - 110461
Perspects in astrophysical databases
Submitted: 2004-02-09
Astrophysics has become a domain extremely rich in scientific data. Data
mining tools are needed for information extraction from such large datasets.
This calls for an approach to data management emphasizing the efficiency and
simplicity of data access; efficiency is obtained using multidimensional access
methods and simplicity is achieved by properly handling metadata. Moreover,
clustering and classification techniques on large datasets pose additional
requirements in terms of computational and memory scalability and
interpretability of results. In this study we review some possible solutions.
[3]
oai:arXiv.org:cs/0307032 [pdf] - 110456
Data Management and Mining in Astrophysical Databases
Submitted: 2003-07-12, last modified: 2003-07-16
We analyse the issues involved in the management and mining of astrophysical
data. The traditional approach to data management in the astrophysical field is
not able to keep up with the increasing size of the data gathered by modern
detectors. An essential role in astrophysical research will be played by
automatic tools for information extraction from large datasets, i.e. data
mining techniques, such as clustering and classification algorithms. This calls
for an approach to data management based on data warehousing, emphasizing the
efficiency and simplicity of data access; efficiency is obtained using
multidimensional access methods and simplicity is achieved by properly handling
metadata. Clustering and classification techniques on large datasets pose
additional requirements: computational and memory scalability with respect to
the data size, and interpretability and objectivity of clustering or classification
results. In this study we address some possible solutions.
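
To make the notion of a multidimensional access method concrete, here is a minimal sketch of one common example, a k-d tree with an axis-aligned range query. The abstract does not say which access method the authors adopt, so the choice of a k-d tree, the class and function names, and the (ra, dec, energy) event tuples are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


class KDNode:
    """One node of a k-d tree: a point, its split axis and two subtrees."""

    def __init__(self, point: Point, axis: int,
                 left: "Optional[KDNode]", right: "Optional[KDNode]"):
        self.point = point
        self.axis = axis
        self.left = left
        self.right = right


def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[KDNode]:
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KDNode(ordered[mid], axis,
                  build_kdtree(ordered[:mid], depth + 1),
                  build_kdtree(ordered[mid + 1:], depth + 1))


def range_query(node: Optional[KDNode], lo: Point, hi: Point,
                out: Optional[List[Point]] = None) -> List[Point]:
    """Collect all points p with lo[i] <= p[i] <= hi[i] on every axis."""
    if out is None:
        out = []
    if node is None:
        return out
    if all(l <= c <= h for l, c, h in zip(lo, node.point, hi)):
        out.append(node.point)
    a = node.axis
    # Descend only into subtrees that can overlap the query box.
    if lo[a] <= node.point[a]:
        range_query(node.left, lo, hi, out)
    if node.point[a] <= hi[a]:
        range_query(node.right, lo, hi, out)
    return out


# Hypothetical events as (ra, dec, energy) tuples.
events = [(10.0, -5.0, 1.2), (10.5, -4.8, 0.3), (200.0, 45.0, 2.0),
          (10.2, -5.1, 5.0), (11.0, -5.5, 0.9)]
tree = build_kdtree(events)
selected = range_query(tree, lo=(9.5, -5.5, 0.5), hi=(10.5, -4.5, 2.0))
print(selected)
```

The query prunes whole subtrees that cannot intersect the requested box, which is what makes such structures attractive compared with a full scan when the dataset is large.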