Brescia, M.

Normalized to: Brescia, M.

95 article(s) in total. 704 co-authors, from 1 to 74 common article(s). Median position in authors list is 3.0.

[1]  oai:arXiv.org:2007.02631  [pdf] - 2128527
Euclid preparation: VIII. The Complete Calibration of the Colour-Redshift Relation survey: VLT/KMOS observations and data release
Euclid Collaboration; Guglielmo, V.; Saglia, R.; Castander, F. J.; Galametz, A.; Paltani, S.; Bender, R.; Bolzonella, M.; Capak, P.; Ilbert, O.; Masters, D. C.; Stern, D.; Andreon, S.; Auricchio, N.; Balaguera-Antolínez, A.; Baldi, M.; Bardelli, S.; Biviano, A.; Bodendorf, C.; Bonino, D.; Bozzo, E.; Branchini, E.; Brau-Nogue, S.; Brescia, M.; Burigana, C.; Cabanac, R. A.; Camera, S.; Capobianco, V.; Cappi, A.; Carbone, C.; Carretero, J.; Carvalho, C. S.; Casas, R.; Casas, S.; Castellano, M.; Castignani, G.; Cavuoti, S.; Cimatti, A.; Cledassou, R.; Colodro-Conde, C.; Congedo, G.; Conselice, C. J.; Conversi, L.; Copin, Y.; Corcione, L.; Costille, A.; Coupon, J.; Courtois, H. M.; Cropper, M.; Da Silva, A.; de la Torre, S.; Di Ferdinando, D.; Dubath, F.; Duncan, C. A. J.; Dupac, X.; Dusini, S.; Fabricius, M.; Farrens, S.; Ferreira, P. G.; Fotopoulou, S.; Frailis, M.; Franceschi, E.; Fumana, M.; Galeotta, S.; Garilli, B.; Gillis, B.; Giocoli, C.; Gozaliasl, G.; Graciá-Carpio, J.; Grupp, F.; Guzzo, L.; Hildebrandt, H.; Hoekstra, H.; Hormuth, F.; Israel, H.; Jahnke, K.; Keihanen, E.; Kermiche, S.; Kilbinger, M.; Kirkpatrick, C. C.; Kitching, T.; Kubik, B.; Kunz, M.; Kurki-Suonio, H.; Laureijs, R.; Ligori, S.; Lilje, P. B.; Lloro, I.; Maino, D.; Maiorano, E.; Maraston, C.; Marggraf, O.; Martinet, N.; Marulli, F.; Massey, R.; Maurogordato, S.; Medinaceli, E.; Mei, S.; Meneghetti, M.; Metcalf, R. Benton; Meylan, G.; Moresco, M.; Moscardini, L.; Munari, E.; Nakajima, R.; Neissner, C.; Niemi, S.; Nucita, A. A.; Padilla, C.; Pasian, F.; Patrizii, L.; Pocino, A.; Poncet, M.; Pozzetti, L.; Raison, F.; Renzi, A.; Rhodes, J.; Riccio, G.; Romelli, E.; Roncarelli, M.; Rossetti, E.; Sanchez, A. G.; Sapone, D.; Schneider, P.; Scottez, V.; Secroun, A.; Serrano, S.; Sirignano, C.; Sirri, G.; Sureau, F.; Tallada-Crespi, P.; Tavagnacco, D.; Taylor, A. N.; Tenti, M.; Tereno, I.; Toledo-Moreo, R.; Torradeflot, F.; Tramacere, A.; Valenziano, L.; Vassallo, T.; Wang, Y.; Welikala, N.; Wetzstein, M.; Whittaker, L.; Zacchei, A.; Zamorani, G.; Zoubian, J.; Zucca, E.
Comments: 21 pages, 12 figures
Submitted: 2020-07-06
The Complete Calibration of the Colour-Redshift Relation survey (C3R2) is a spectroscopic effort involving ESO and Keck facilities designed to empirically calibrate the galaxy colour-redshift relation, P(z|C), to the Euclid depth (i_AB=24.5), and is intimately linked to upcoming Stage IV dark energy missions based on weak lensing cosmology. The aim is to build a spectroscopic calibration sample that is as representative as possible of the galaxies of the Euclid weak lensing sample. In order to minimise the number of spectroscopic observations needed to fill the gaps in current knowledge of the P(z|C), self-organising map (SOM) representations of the galaxy colour space have been constructed. Here we present the first results of an ESO@VLT Large Programme approved in the context of C3R2, which makes use of the two VLT optical and near-infrared multi-object spectrographs, FORS2 and KMOS. This paper focuses on high-quality spectroscopic redshifts of high-z galaxies observed with the KMOS spectrograph in the H- and K-bands. A total of 424 highly-reliable z are measured in the 1.3<=z<=2.5 range, with total success rates of 60.7% in the H-band and 32.8% in the K-band. The newly determined z fill 55% of high and 35% of lower priority empty SOM grid cells. We measured Halpha fluxes in a 1."2 radius aperture from the spectra of the spectroscopically confirmed galaxies and converted them into star formation rates. In addition, we performed an SED fitting analysis on the same sample in order to derive stellar masses, E(B-V), total magnitudes, and SFRs. We combine the results obtained from the spectra with those derived via SED fitting, and we show that the spectroscopic failures come from either weakly star-forming galaxies (at z<1.7, i.e. in the H-band) or low S/N spectra (in the K-band) of z>2 galaxies.
[2]  oai:arXiv.org:2007.01840  [pdf] - 2127599
Rejection criteria based on outliers in the KiDS photometric redshifts and PDF distributions derived by machine learning
Comments: Preprint version of the manuscript to appear in the Volume "Intelligent Astrophysics" of the series "Emergence, Complexity and Computation", Book eds. I. Zelinka, D. Baron, M. Brescia, Springer Nature Switzerland, ISSN: 2194-7287
Submitted: 2020-07-03
The Probability Density Function (PDF) provides an estimate of the photometric redshift (zphot) prediction error. It is crucial for current and future sky surveys, characterized by strict requirements on the zphot precision, reliability and completeness. The present work rests on the assumption that properly defined rejection criteria, capable of identifying and rejecting potential outliers, can increase the precision of zphot estimates and of their cumulative PDF, without sacrificing much in terms of completeness of the sample. We provide a way to assess rejection through proper cuts on the shape descriptors of a PDF, such as the width and the height of the maximum PDF's peak. In this work we applied these rejection criteria to galaxies with photometry extracted from the Kilo Degree Survey (KiDS) ESO Data Release 4, showing that such an approach could lead to significant improvements in the zphot quality: e.g., for the clipped sample showing the best trade-off between precision and completeness, we achieve a reduction in outlier fraction of $\simeq 75\%$ and an improvement of $\simeq 6\%$ in NMAD with respect to the original data set, while preserving $\simeq 93\%$ of its content.
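The rejection idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 0.15 outlier threshold, the 1.4826 NMAD scaling and the toy numbers are all assumptions chosen for illustration, following conventions commonly used in the photo-z literature.

```python
from statistics import median

def normalized_residuals(z_phot, z_spec):
    """Delta z = (z_phot - z_spec) / (1 + z_spec), a common photo-z convention."""
    return [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]

def nmad(dz):
    """Normalized median absolute deviation of the residuals (assumed definition)."""
    m = median(dz)
    return 1.4826 * median(abs(d - m) for d in dz)

def outlier_fraction(dz, threshold=0.15):
    """Fraction of objects with |Delta z| above the (assumed) threshold."""
    return sum(abs(d) > threshold for d in dz) / len(dz)

def clip_by_pdf_width(z_phot, z_spec, pdf_width, max_width):
    """Keep only objects whose PDF width is at most max_width (hypothetical cut)."""
    kept = [(zp, zs) for zp, zs, w in zip(z_phot, z_spec, pdf_width) if w <= max_width]
    return [k[0] for k in kept], [k[1] for k in kept]

# Toy data (invented): the last object has a broad PDF and a wrong redshift.
z_spec = [0.5, 0.5, 0.5, 0.5]
z_phot = [0.5, 0.53, 0.47, 1.1]
widths = [0.05, 0.06, 0.05, 0.4]

before = normalized_residuals(z_phot, z_spec)
zp_kept, zs_kept = clip_by_pdf_width(z_phot, z_spec, widths, max_width=0.1)
after = normalized_residuals(zp_kept, zs_kept)
completeness = len(zp_kept) / len(z_phot)

print(outlier_fraction(before), outlier_fraction(after), nmad(after), completeness)
```

The trade-off the abstract describes shows up directly: tightening `max_width` lowers the outlier fraction but also lowers the completeness.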
[3]  oai:arXiv.org:2007.01240  [pdf] - 2126917
Statistical characterization and classification of astronomical transients with Machine Learning in the era of the Vera Rubin Survey Telescope
Comments: Preprint version of the manuscript to appear in the Volume "Intelligent Astrophysics" of the series "Emergence, Complexity and Computation", Book eds. I. Zelinka, D. Baron, M. Brescia, Springer Nature Switzerland, ISSN: 2194-7287
Submitted: 2020-07-02
Astronomy has entered the multi-messenger data era and Machine Learning has found widespread use in a large variety of applications. The exploitation of synoptic (multi-band and multi-epoch) surveys, like LSST (Large Synoptic Survey Telescope), requires an extensive use of automatic methods for data processing and interpretation. With data volumes in the petabyte domain, the discrimination of time-critical information has already exceeded the capabilities of human operators, and crowds of scientists have extreme difficulty managing such amounts of data in multi-dimensional domains. This work focuses on an analysis of critical aspects of the Machine Learning approach to the classification of variable sky sources, with special attention to the various types of Supernovae, one of the most important subjects of Time Domain Astronomy due to their crucial role in Cosmology. The work is based on a test campaign performed on simulated data. The classification was carried out by comparing the performance of several Machine Learning algorithms on statistical parameters extracted from the light curves. The results highlight some critical aspects related to the data quality and their parameter space characterization, which are preparatory to building the processing machinery for the exploitation of real data in the coming decade.
[4]  oai:arXiv.org:2006.13905  [pdf] - 2121597
Periodic Astrometric Signal Recovery through Convolutional Autoencoders
Comments: Preprint version of the manuscript to appear in the Volume "Intelligent Astrophysics" of the series "Emergence, Complexity and Computation", Book eds. I. Zelinka, D. Baron, M. Brescia, Springer Nature Switzerland, ISSN: 2194-7287
Submitted: 2020-06-24
Astrometric detection involves a precise measurement of stellar positions, and is widely regarded as the leading concept presently ready to find earth-mass planets in temperate orbits around nearby sun-like stars. The TOLIMAN space telescope is a low-cost, agile mission concept dedicated to narrow-angle astrometric monitoring of bright binary stars. In particular the mission will be optimised to search for habitable-zone planets around Alpha Centauri AB. If the separation between these two stars can be monitored with sufficient precision, tiny perturbations due to the gravitational tug from an unseen planet can be witnessed and, given the configuration of the optical system, the scale of the shifts in the image plane is about one millionth of a pixel. Image registration at this level of precision has never been demonstrated (to our knowledge) in any setting within science. In this paper we demonstrate that a Deep Convolutional Auto-Encoder is able to retrieve such a signal from simplified simulations of the TOLIMAN data and we present the full experimental pipeline to recreate our experiments from the simulations to the signal analysis. In future work, all the more realistic sources of noise and systematic effects present in the real-world system will be injected into the simulations.
[5]  oai:arXiv.org:2006.13235  [pdf] - 2121532
Nature versus nurture: relic nature and environment of the most massive passive galaxies at $z < 0.5$
Comments: Accepted for publication in Astronomy & Astrophysics Letters, 6 pages, 1 figure
Submitted: 2020-06-23
Relic galaxies are thought to be the progenitors of high-redshift red nuggets that for some reason missed the channels of size growth and evolved passively and undisturbed since the first star formation burst (at $z>2$). These local ultracompact old galaxies are unique laboratories for studying the star formation processes at high redshift and thus the early stage of galaxy formation scenarios. Counterintuitively, theoretical and observational studies indicate that relics are more common in denser environments, where merging events predominate. To verify this scenario, we compared the number counts of a sample of ultracompact massive galaxies (UCMGs) selected within the third data release of the Kilo Degree Survey, that is, systems with sizes $R_{\rm e} < 1.5 \, \rm kpc$ and stellar masses $M_{\rm \star} > 8 \times 10^{10}\, \rm M_{\odot}$, with the number counts of galaxies with the same masses but normal sizes in field and cluster environments. Based on their optical and near-infrared colors, these UCMGs are likely to be mainly old, and hence representative of the relic population. We find that both UCMGs and normal-size galaxies are more abundant in clusters and their relative fraction depends only mildly on the global environment, with denser environments penalizing the survival of relics. Hence, UCMGs (and likely relics overall) are not special because of the environment effect on their nurture, but rather they are just a product of the stochasticity of the merging processes regardless of the global environment in which they live.
[6]  oai:arXiv.org:2006.08235  [pdf] - 2114364
Anomaly detection in Astrophysics: a comparison between unsupervised Deep and Machine Learning on KiDS data
Comments: Preprint version of the manuscript to appear in the Volume "Intelligent Astrophysics" of the series "Emergence, Complexity and Computation", Book eds. I. Zelinka, D. Baron, M. Brescia, Springer Nature Switzerland, ISSN: 2194-7287
Submitted: 2020-06-15
Every field of Science is undergoing unprecedented changes in the discovery process, and Astronomy has been a main player in this transition since the beginning. The ongoing and future large and complex multi-messenger sky surveys demand extensive use of robust and efficient automated methods to classify the observed structures and to detect and characterize peculiar and unexpected sources. We performed a preliminary experiment on KiDS DR4 data, applying two different unsupervised machine learning algorithms to the problem of anomaly detection: a Disentangled Convolutional Autoencoder and an Unsupervised Random Forest, both considered potentially promising methods to detect peculiar sources. The former method, working directly on images, is considered potentially able to identify peculiar objects like interacting galaxies and gravitational lenses. The latter instead, working on catalogue data, could identify objects with unusual values of magnitudes and colours, which in turn could indicate the presence of singularities.
[7]  oai:arXiv.org:2006.08238  [pdf] - 2114365
Comparison of outlier detection methods on astronomical image data
Comments: Preprint version of the accepted manuscript to appear in the Volume "Intelligent Astrophysics" of the series "Emergence, Complexity and Computation", Book eds. I. Zelinka, D. Baron, M. Brescia, Springer Nature Switzerland, ISSN: 2194-7287
Submitted: 2020-06-15
Among the many challenges posed by the huge data volumes produced by the new generation of astronomical instruments there is also the search for rare and peculiar objects. Unsupervised outlier detection algorithms may provide a viable solution. In this work we compare the performances of six methods: the Local Outlier Factor, Isolation Forest, k-means clustering, a measure of novelty, and both a normal and a convolutional autoencoder. These methods were applied to data extracted from SDSS stripe 82. After discussing the sensitivity of each method to its own set of hyperparameters, we combine the results from each method to rank the objects and produce a final list of outliers.
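The abstract above does not say how the per-method results are combined, so the following is only one plausible illustration: a mean-rank aggregation over several anomaly scores. The method names, the scores and the "higher score means more anomalous" convention are all hypothetical.

```python
def ranks(scores):
    """Rank objects by score; rank 0 is the most anomalous (highest score)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, idx in enumerate(order):
        result[idx] = position
    return result

def combine_by_mean_rank(score_table):
    """Return object indices ordered from most to least anomalous,
    using the mean of the per-method ranks (an assumed combination rule)."""
    per_method = [ranks(s) for s in score_table.values()]
    n = len(per_method[0])
    mean_rank = [sum(r[i] for r in per_method) / len(per_method) for i in range(n)]
    return sorted(range(n), key=lambda i: mean_rank[i])

# Invented scores for three objects from two hypothetical detectors.
scores = {
    "lof": [0.1, 0.9, 0.5],
    "isolation_forest": [0.2, 0.8, 0.3],
}
final_list = combine_by_mean_rank(scores)
print(final_list)
```

Working on ranks rather than raw scores avoids having to put the different methods' outputs on a common scale, which is the main reason for this design choice in the sketch.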
[8]  oai:arXiv.org:2005.00055  [pdf] - 2087525
Euclid: The importance of galaxy clustering and weak lensing cross-correlations within the photometric Euclid survey
Comments: 15 pages, 8 figures
Submitted: 2020-04-30
The data from the Euclid mission will enable the measurement of the photometric redshifts, angular positions, and weak lensing shapes for over a billion galaxies. This large dataset will allow for cosmological analyses using the angular clustering of galaxies and cosmic shear. The cross-correlation (XC) between these probes can tighten constraints and it is therefore important to quantify their impact for Euclid. In this study we carefully quantify the impact of XC not only on the final parameter constraints for different cosmological models, but also on the nuisance parameters. In particular, we aim at understanding the amount of additional information that XC can provide for parameters encoding systematic effects, such as galaxy bias or intrinsic alignments (IA). We follow the formalism presented in Euclid Collaboration: Blanchard et al. (2019) and make use of the codes validated therein. We show that XC improves the dark energy Figure of Merit (FoM) by a factor $\sim 5$, whilst it also reduces the uncertainties on galaxy bias by $\sim 17\%$ and the uncertainties on IA by a factor $\sim 4$. We observe that the role of XC on the final parameter constraints is qualitatively the same irrespective of the galaxy bias model used. We also show that XC can help in distinguishing between different IA models, and that if IA terms are neglected then this can lead to significant biases on the cosmological parameters. We find that the XC terms are necessary to extract the full information content from the data in future analyses. They help in better constraining the cosmological model, and lead to a better understanding of the systematic effects that contaminate these probes. Furthermore, we find that XC helps in constraining the mean of the photometric-redshift distributions, but it requires a more precise knowledge of this mean in order not to degrade the final FoM. [Abridged]
[9]  oai:arXiv.org:1912.07326  [pdf] - 2085090
Euclid: The reduced shear approximation and magnification bias for Stage IV cosmic shear experiments
Comments: 16 pages, 6 figures, submitted to Astronomy & Astrophysics on 16/12/2019, accepted on 04/03/2020. SSC Fisher procedure corrected
Submitted: 2019-12-16, last modified: 2020-04-01
Stage IV weak lensing experiments will offer more than an order of magnitude leap in precision. We must therefore ensure that our analyses remain accurate in this new era. Accordingly, previously ignored systematic effects must be addressed. In this work, we evaluate the impact of the reduced shear approximation and magnification bias, on the information obtained from the angular power spectrum. To first-order, the statistics of reduced shear, a combination of shear and convergence, are taken to be equal to those of shear. However, this approximation can induce a bias in the cosmological parameters that can no longer be neglected. A separate bias arises from the statistics of shear being altered by the preferential selection of galaxies and the dilution of their surface densities, in high-magnification regions. The corrections for these systematic effects take similar forms, allowing them to be treated together. We calculated the impact of neglecting these effects on the cosmological parameters that would be determined from Euclid, using cosmic shear tomography. To do so, we employed the Fisher matrix formalism, and included the impact of the super-sample covariance. We also demonstrate how the reduced shear correction can be calculated using a lognormal field forward modelling approach. These effects cause significant biases in Omega_m, sigma_8, n_s, Omega_DE, w_0, and w_a of -0.53 sigma, 0.43 sigma, -0.34 sigma, 1.36 sigma, -0.68 sigma, and 1.21 sigma, respectively. We then show that these lensing biases interact with another systematic: the intrinsic alignment of galaxies. Accordingly, we develop the formalism for an intrinsic alignment-enhanced lensing bias correction. Applying this to Euclid, we find that the additional terms introduced by this correction are sub-dominant.
[10]  oai:arXiv.org:2002.12922  [pdf] - 2076992
Building the largest spectroscopic sample of ultra-compact massive galaxies with the Kilo Degree Survey
Submitted: 2020-02-28
Ultra-compact massive galaxies (UCMGs), i.e. galaxies with stellar masses $M_{*} > 8 \times 10^{10} M_{\odot}$ and effective radii $R_{e} < 1.5$ kpc, are very rare systems, in particular at low and intermediate redshifts. Their origin as well as their number density across cosmic time are still under scrutiny, especially because of the paucity of spectroscopically confirmed samples. We have started a systematic census of UCMG candidates within the ESO Kilo Degree Survey, together with a large spectroscopic follow-up campaign to build the largest possible sample of confirmed UCMGs. This is the third paper of the series and the second based on the spectroscopic follow-up program. Here, we present photometric and structural parameters of 33 new candidates at redshifts $0.15 \lesssim z \lesssim 0.5$ and confirm 19 of them as UCMGs, based on their nominal spectroscopically inferred $M_{*}$ and $R_{e}$. This corresponds to a success rate of $\sim 58\%$, nicely consistent with our previous findings. The addition of these 19 newly confirmed objects allows us to fully assess the systematics on the system selection, and finally reduce the number density uncertainties. Moreover, putting together the results from our current and past observational campaigns and some literature data, we build the largest sample of UCMGs ever collected, comprising 92 spectroscopically confirmed objects at $0.1 \lesssim z \lesssim 0.5$. This number rises to 116 when allowing for a $3\sigma$ tolerance on the $M_{*}$ and $R_{e}$ thresholds for the UCMG definition. For all these galaxies we have estimated the velocity dispersion values at the effective radii, which have been used to derive a preliminary mass-velocity dispersion correlation.
[11]  oai:arXiv.org:2001.03621  [pdf] - 2029805
Evaluation of probabilistic photometric redshift estimation approaches for LSST
Comments: submitted to MNRAS
Submitted: 2020-01-10
Many scientific investigations of photometric galaxy surveys require redshift estimates, whose uncertainty properties are best encapsulated by photometric redshift (photo-z) posterior probability density functions (PDFs). A plethora of photo-z PDF estimation methodologies abound, producing discrepant results with no consensus on a preferred approach. We present the results of a comprehensive experiment comparing twelve photo-z algorithms applied to mock data produced for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC). By supplying perfect prior information, in the form of the complete template library and a representative training set as inputs to each code, we demonstrate the impact of the assumptions underlying each technique on the output photo-z PDFs. In the absence of a notion of true, unbiased photo-z PDFs, we evaluate and interpret multiple metrics of the ensemble properties of the derived photo-z PDFs as well as traditional reductions to photo-z point estimates. We report systematic biases and overall over/under-breadth of the photo-z PDFs of many popular codes, which may indicate avenues for improvement in the algorithms or implementations. Furthermore, we raise attention to the limitations of established metrics for assessing photo-z PDF accuracy; though we identify the conditional density estimate (CDE) loss as a promising metric of photo-z PDF performance in the case where true redshifts are available but true photo-z PDFs are not, we emphasize the need for science-specific performance metrics.
[12]  oai:arXiv.org:1912.04020  [pdf] - 2050288
The Hi-GAL catalogue of dusty filamentary structures in the Galactic Plane
Comments: 38 pages, 29 figures, 3 appendices
Submitted: 2019-12-09
The recent data collected by Herschel have confirmed that interstellar structures with filamentary shape are ubiquitously present in the Milky Way. Filaments are thought to be formed by several physical mechanisms acting from the large Galactic scales down to the sub-pc fractions of molecular clouds, and they might represent a possible link between star formation and the large-scale structure of the Galaxy. In order to study this potential link, a statistically significant sample of filaments spread throughout the Galaxy is required. In this work we present the first catalogue of $32,059$ candidate filaments automatically identified in the Hi-GAL survey of the entire Galactic Plane. For these objects we determined morphological (length, $l^{a}$, and geometrical shape) and physical (average column density, $N_{\rm H_{2}}$, and average temperature, $T$) properties. We identified filaments with a wide range of properties: $2' \leq l^{a} \leq 100'$, $10^{20} \leq N_{\rm H_{2}} \leq 10^{23}$ cm$^{-2}$ and $10 \leq T \leq 35$ K. We discuss their association with the Hi-GAL compact sources, finding that the most tenuous (and stable) structures do not host any major condensation, and we also assign a distance to $\sim 18,400$ filaments for which we determine mass, physical size, stability conditions and Galactic distribution. When compared to the spiral arms structure, we find no significant difference between the physical properties of on-arm and inter-arm filaments. We compared our sample with previous studies, finding that our Hi-GAL filament catalogue represents a significant extension in terms of Galactic coverage and sensitivity. This catalogue represents a unique and important tool for future studies devoted to understanding the filament life-cycle.
[13]  oai:arXiv.org:1908.04310  [pdf] - 1994125
Euclid preparation: V. Predicted yield of redshift 7<z<9 quasars from the wide survey
Euclid Collaboration; Barnett, R.; Warren, S. J.; Mortlock, D. J.; Cuby, J. -G.; Conselice, C.; Hewett, P. C.; Willott, C. J.; Auricchio, N.; Balaguera-Antolínez, A.; Baldi, M.; Bardelli, S.; Bellagamba, F.; Bender, R.; Biviano, A.; Bonino, D.; Bozzo, E.; Branchini, E.; Brescia, M.; Brinchmann, J.; Burigana, C.; Camera, S.; Capobianco, V.; Carbone, C.; Carretero, J.; Carvalho, C. S.; Castander, F. J.; Castellano, M.; Cavuoti, S.; Cimatti, A.; Clédassou, R.; Congedo, G.; Conversi, L.; Copin, Y.; Corcione, L.; Coupon, J.; Courtois, H. M.; Cropper, M.; Da Silva, A.; Duncan, C. A. J.; Dusini, S.; Ealet, A.; Farrens, S.; Fosalba, P.; Fotopoulou, S.; Fourmanoit, N.; Frailis, M.; Fumana, M.; Galeotta, S.; Garilli, B.; Gillard, W.; Gillis, B. R.; Graciá-Carpio, J.; Grupp, F.; Hoekstra, H.; Hormuth, F.; Israel, H.; Jahnke, K.; Kermiche, S.; Kilbinger, M.; Kirkpatrick, C. C.; Kitching, T.; Kohley, R.; Kubik, B.; Kunz, M.; Kurki-Suonio, H.; Laureijs, R.; Ligori, S.; Lilje, P. B.; Lloro, I.; Maiorano, E.; Mansutti, O.; Marggraf, O.; Martinet, N.; Marulli, F.; Massey, R.; Mauri, N.; Medinaceli, E.; Mei, S.; Mellier, Y.; Metcalf, R. B.; Metge, J. J.; Meylan, G.; Moresco, M.; Moscardini, L.; Munari, E.; Neissner, C.; Niemi, S. M.; Nutma, T.; Padilla, C.; Paltani, S.; Pasian, F.; Paykari, P.; Percival, W. J.; Pettorino, V.; Polenta, G.; Poncet, M.; Pozzetti, L.; Raison, F.; Renzi, A.; Rhodes, J.; Rix, H. -W.; Romelli, E.; Roncarelli, M.; Rossetti, E.; Saglia, R.; Sapone, D.; Scaramella, R.; Schneider, P.; Scottez, V.; Secroun, A.; Serrano, S.; Sirri, G.; Stanco, L.; Sureau, F.; Tallada-Crespí, P.; Tavagnacco, D.; Taylor, A. N.; Tenti, M.; Tereno, I.; Toledo-Moreo, R.; Torradeflot, F.; Valenziano, L.; Vassallo, T.; Wang, Y.; Zacchei, A.; Zamorani, G.; Zoubian, J.; Zucca, E.
Comments: Published in A&A. Updated to match accepted version
Submitted: 2019-08-12, last modified: 2019-11-05
We provide predictions of the yield of $7<z<9$ quasars from the Euclid wide survey, updating the calculation presented in the Euclid Red Book in several ways. We account for revisions to the Euclid near-infrared filter wavelengths; we adopt steeper rates of decline of the quasar luminosity function (QLF; $\Phi$) with redshift, $\Phi\propto10^{k(z-6)}$, $k=-0.72$, and a further steeper rate of decline, $k=-0.92$; we use better models of the contaminating populations (MLT dwarfs and compact early-type galaxies); and we use an improved Bayesian selection method, compared to the colour cuts used for the Red Book calculation, allowing the identification of fainter quasars, down to $J_{AB}\sim23$. Quasars at $z>8$ may be selected from Euclid $OYJH$ photometry alone, but selection over the redshift interval $7<z<8$ is greatly improved by the addition of $z$-band data from, e.g., Pan-STARRS and LSST. We calculate predicted quasar yields for the assumed values of the rate of decline of the QLF beyond $z=6$. For the case that the decline of the QLF accelerates beyond $z=6$, with $k=-0.92$, Euclid should nevertheless find over 100 quasars with $7.0<z<7.5$, and $\sim25$ quasars beyond the current record of $z=7.5$, including $\sim8$ beyond $z=8.0$. The first Euclid quasars at $z>7.5$ should be found in the DR1 data release, expected in 2024. It will be possible to determine the bright-end slope of the QLF, $7<z<8$, $M_{1450}<-25$, using 8m class telescopes to confirm candidates, but follow-up with JWST or E-ELT will be required to measure the faint-end slope. Contamination of the candidate lists is predicted to be modest even at $J_{AB}\sim23$. The precision with which $k$ can be determined over $7<z<8$ depends on the value of $k$, but assuming $k=-0.72$ it can be measured to a 1 sigma uncertainty of 0.07.
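The redshift scaling quoted in the abstract above, $\Phi\propto10^{k(z-6)}$, can be evaluated directly. The sketch below only computes the relative decline of the quasar space density with respect to $z=6$ for the two values of $k$ given in the abstract; the function name and the choice of redshifts are assumptions for illustration, and it does not reproduce the paper's yield calculation.

```python
def qlf_decline_factor(z, k, z_ref=6.0):
    """Relative QLF normalisation Phi(z) / Phi(z_ref) for Phi proportional to 10**(k * (z - z_ref))."""
    return 10.0 ** (k * (z - z_ref))

# The two decline rates quoted in the abstract.
for k in (-0.72, -0.92):
    for z in (7.0, 7.5, 8.0, 9.0):
        print(f"k={k:+.2f}  z={z:.1f}  Phi(z)/Phi(6) = {qlf_decline_factor(z, k):.4f}")
```

With $k=-0.72$ the density drops by a factor of $10^{0.72}$, roughly five, per unit redshift; the steeper $k=-0.92$ case drops faster, which is why the predicted yields are quoted separately for each assumption.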
[14]  oai:arXiv.org:1910.10521  [pdf] - 2068949
Euclid preparation: VI. Verifying the Performance of Cosmic Shear Experiments
Euclid Collaboration; Paykari, P.; Kitching, T. D.; Hoekstra, H.; Azzollini, R.; Cardone, V. F.; Cropper, M.; Duncan, C. A. J.; Kannawadi, A.; Miller, L.; Aussel, H.; Conti, I. F.; Auricchio, N.; Baldi, M.; Bardelli, S.; Biviano, A.; Bonino, D.; Borsato, E.; Bozzo, E.; Branchini, E.; Brau-Nogue, S.; Brescia, M.; Brinchmann, J.; Burigana, C.; Camera, S.; Capobianco, V.; Carbone, C.; Carretero, J.; Castander, F. J.; Castellano, M.; Cavuoti, S.; Charles, Y.; Cledassou, R.; Colodro-Conde, C.; Congedo, G.; Conselice, C.; Conversi, L.; Copin, Y.; Coupon, J.; Courtois, H. M.; Da Silva, A.; Dupac, X.; Fabbian, G.; Farrens, S.; Ferreira, P. G.; Fosalba, P.; Fourmanoit, N.; Frailis, M.; Fumana, M.; Galeotta, S.; Garilli, B.; Gillard, W.; Gillis, B. R.; Giocoli, C.; Gracia-Carpio, J.; Grupp, F.; Hormuth, F.; Ilic, S.; Israel, H.; Jahnke, K.; Keihanen, E.; Kermiche, S.; Kilbinger, M.; Kirkpatrick, C. C.; Kubik, B.; Kunz, M.; Kurki-Suonio, H.; Lacasa, F.; Laureijs, R.; Mignant, D. Le; Ligori, S.; Lilje, P. B.; Lloro, I.; Maciaszek, T.; Maiorano, E.; Marggraf, O.; Martinelli, M.; Martinet, N.; Marulli, F.; Massey, R.; Mauri, N.; Medinaceli, E.; Mei, S.; Mellier, Y.; Meneghetti, M.; Metcalf, R. B.; Moresco, M.; Moscardini, L.; Munari, E.; Neissner, C.; Nichol, R. C.; Niemi, S.; Nutma, T.; Padilla, C.; Paltani, S.; Pasian, F.; Pettorino, V.; Pires, S.; Polenta, G.; Pourtsidou, A.; Raison, F.; Renzi, A.; Rhodes, J.; Romelli, E.; Roncarelli, M.; Rossetti, E.; Saglia, R.; Sánchez, A. G.; Sapone, D.; Scaramella, R.; Schneider, P.; Schrabback, T.; Scottez, V.; Secroun, A.; Serrano, S.; Sirignano, C.; Sirri, G.; Stanco, L.; Starck, J. -L.; Sureau, F.; Tallada-Crespí, P.; Taylor, A.; Tenti, M.; Tereno, I.; Toledo-Moreo, R.; Torradeflot, F.; Tutusaus, I.; Valenziano, L.; Vannier, M.; Vassallo, T.; Zoubian, J.; Zucca, E.
Comments: 18 pages. Submitted to A&A. Comments Welcome
Submitted: 2019-10-23
Our aim is to quantify the impact of systematic effects on the inference of cosmological parameters from cosmic shear. We present an end-to-end approach that introduces sources of bias in a modelled weak lensing survey on a galaxy-by-galaxy level. Residual biases are propagated through a pipeline from galaxy properties (one end) through to cosmic shear power spectra and cosmological parameter estimates (the other end), to quantify how imperfect knowledge of the pipeline changes the maximum likelihood values of dark energy parameters. We quantify the impact of an imperfect correction for charge transfer inefficiency (CTI) and modelling uncertainties of the point spread function (PSF) for Euclid, and find that the biases introduced can be corrected to acceptable levels.
[15]  oai:arXiv.org:1910.09273  [pdf] - 1983218
Euclid preparation: VII. Forecast validation for Euclid cosmological probes
Euclid Collaboration; Blanchard, A.; Camera, S.; Carbone, C.; Cardone, V. F.; Casas, S.; Ilić, S.; Kilbinger, M.; Kitching, T.; Kunz, M.; Lacasa, F.; Linder, E.; Majerotto, E.; Markovič, K.; Martinelli, M.; Pettorino, V.; Pourtsidou, A.; Sakr, Z.; Sánchez, A. G.; Sapone, D.; Tutusaus, I.; Yahia-Cherif, S.; Yankelevich, V.; Andreon, S.; Aussel, H.; Balaguera-Antolínez, A.; Baldi, M.; Bardelli, S.; Bender, R.; Biviano, A.; Bonino, D.; Boucaud, A.; Bozzo, E.; Branchini, E.; Brau-Nogue, S.; Brescia, M.; Brinchmann, J.; Burigana, C.; Cabanac, R.; Capobianco, V.; Cappi, A.; Carretero, J.; Carvalho, C. S.; Casas, R.; Castander, F. J.; Castellano, M.; Cavuoti, S.; Cimatti, A.; Cledassou, R.; Colodro-Conde, C.; Congedo, G.; Conselice, C. J.; Conversi, L.; Copin, Y.; Corcione, L.; Coupon, J.; Courtois, H. M.; Cropper, M.; Da Silva, A.; de la Torre, S.; Di Ferdinando, D.; Dubath, F.; Ducret, F.; Duncan, C. A. J.; Dupac, X.; Dusini, S.; Fabbian, G.; Fabricius, M.; Farrens, S.; Fosalba, P.; Fotopoulou, S.; Fourmanoit, N.; Frailis, M.; Franceschi, E.; Franzetti, P.; Fumana, M.; Galeotta, S.; Gillard, W.; Gillis, B.; Giocoli, C.; Gómez-Alvarez, P.; Graciá-Carpio, J.; Grupp, F.; Guzzo, L.; Hoekstra, H.; Hormuth, F.; Israel, H.; Jahnke, K.; Keihanen, E.; Kermiche, S.; Kirkpatrick, C. C.; Kohley, R.; Kubik, B.; Kurki-Suonio, H.; Ligori, S.; Lilje, P. B.; Lloro, I.; Maino, D.; Maiorano, E.; Marggraf, O.; Martinet, N.; Marulli, F.; Massey, R.; Medinaceli, E.; Mei, S.; Mellier, Y.; Metcalf, B.; Metge, J. J.; Meylan, G.; Moresco, M.; Moscardini, L.; Munari, E.; Nichol, R. C.; Niemi, S.; Nucita, A. A.; Padilla, C.; Paltani, S.; Pasian, F.; Percival, W. J.; Pires, S.; Polenta, G.; Poncet, M.; Pozzetti, L.; Racca, G. D.; Raison, F.; Renzi, A.; Rhodes, J.; Romelli, E.; Roncarelli, M.; Rossetti, E.; Saglia, R.; Schneider, P.; Scottez, V.; Secroun, A.; Sirri, G.; Stanco, L.; Starck, J. -L.; Sureau, F.; Tallada-Crespí, P.; Tavagnacco, D.; Taylor, A. N.; Tenti, M.; Tereno, I.; Toledo-Moreo, R.; Torradeflot, F.; Valenziano, L.; Vassallo, T.; Kleijn, G. A. Verdoes; Viel, M.; Wang, Y.; Zacchei, A.; Zoubian, J.; Zucca, E.
Comments: 75 pages, 13 figures, 18 tables. Acknowledgements include Authors' contributions. Abstract abridged
Submitted: 2019-10-21
The Euclid space telescope will measure the shapes and redshifts of galaxies to reconstruct the expansion history of the Universe and the growth of cosmic structures. Estimation of the expected performance of the experiment, in terms of predicted constraints on cosmological parameters, has so far relied on different methodologies and numerical implementations, developed for different observational probes and for their combination. In this paper we present validated forecasts that combine both theoretical and observational expertise for different cosmological probes. This is presented to provide the community with reliable numerical codes and methods for Euclid cosmological forecasts. We describe in detail the methodology adopted for Fisher matrix forecasts, applied to galaxy clustering, weak lensing and their combination. We estimate the required accuracy for Euclid forecasts and outline a methodology for their development. We then compare and improve different numerical implementations, reaching uncertainties on the errors of cosmological parameters that are less than the required precision in all cases. Furthermore, we provide details on the validated implementations that can be used by the reader to validate their own codes if required. We present new cosmological forecasts for Euclid. We find that results depend on the specific cosmological model and remaining freedom in each setup, i.e. flat or non-flat spatial cosmologies, or different cuts at nonlinear scales. The validated numerical implementations can now be reliably used for any setup. We present results for an optimistic and a pessimistic choice of such settings. We demonstrate that the impact of cross-correlations is particularly relevant for models beyond a cosmological constant and may allow us to increase the dark energy Figure of Merit by at least a factor of three.
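As a rough illustration of the Fisher matrix machinery mentioned in the abstract above, the sketch below computes marginalised 1-sigma errors and a dark energy Figure of Merit from toy Fisher matrices. Everything in it is an assumption made for illustration: the matrices are invented (they are not Euclid values), the parameter ordering (Omega_m, w0, wa) is arbitrary, the Figure of Merit is taken as the inverse square root of the determinant of the marginalised (w0, wa) covariance block, and the two probes are combined by simple addition, which treats them as independent and so ignores the cross-correlations the abstract highlights.

```python
import numpy as np

def marginalised_errors(fisher):
    """1-sigma marginalised errors: sqrt of the diagonal of the inverse Fisher matrix."""
    cov = np.linalg.inv(fisher)
    return np.sqrt(np.diag(cov))

def figure_of_merit(fisher, i, j):
    """FoM for parameters (i, j): 1 / sqrt(det) of their marginalised 2x2 covariance block."""
    cov = np.linalg.inv(fisher)
    block = cov[np.ix_([i, j], [i, j])]
    return 1.0 / np.sqrt(np.linalg.det(block))

# Toy, invented Fisher matrices for (Omega_m, w0, wa); NOT taken from the paper.
F_gc = np.array([[400.0, 60.0, 10.0],
                 [60.0, 50.0, 12.0],
                 [10.0, 12.0, 5.0]])    # hypothetical "galaxy clustering" probe
F_wl = np.array([[300.0, -40.0, -8.0],
                 [-40.0, 30.0, 6.0],
                 [-8.0, 6.0, 3.0]])     # hypothetical "weak lensing" probe

# Assumption: independent probes, so their Fisher matrices simply add.
F_comb = F_gc + F_wl

for name, F in [("GC", F_gc), ("WL", F_wl), ("GC+WL", F_comb)]:
    print(name, marginalised_errors(F), figure_of_merit(F, 1, 2))
```

Because adding information can only shrink the marginalised covariance, the combined errors come out no larger than those of either probe, and the combined Figure of Merit is at least as large.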
[16]  oai:arXiv.org:1910.01884  [pdf] - 1978833
Astroinformatics based search for globular clusters in the Fornax Deep Survey
Comments: 29 pages, 14 figures
Submitted: 2019-10-04
In recent years, Astroinformatics has become a well-defined paradigm for many fields of Astronomy. In this work we demonstrate the potential of a multidisciplinary approach to identify globular clusters (GCs) in the Fornax cluster of galaxies, taking advantage of multi-band photometry produced by the VLT Survey Telescope and using automatic self-adaptive methodologies. The data analyzed in this work consist of deep, multi-band, partially overlapping images centered on the core of the Fornax cluster. We use a Neural-Gas model, a pure clustering machine learning methodology, to approach the GC detection, while a novel feature selection method ($\Phi$LAB) is exploited to perform the parameter space analysis and optimization. We demonstrate that an Astroinformatics-based methodology is able to provide GC samples that are comparable, in terms of purity and completeness, with those obtained using single band HST data (Brescia et al. 2012) and with two approaches based respectively on a morpho-photometric analysis (Cantiello et al. 2018b) and a PCA analysis (D'Abrusco et al. 2015) using the same data discussed in this work.
[17]  oai:arXiv.org:1909.00606  [pdf] - 1953830
Photometric redshifts for X-ray-selected active galactic nuclei in the eROSITA era
Comments:
Submitted: 2019-09-02
With the successful launch of eROSITA (extended Roentgen Survey with an Imaging Telescope Array) on 2019 July 13, we are facing the challenge of computing reliable photometric redshifts for 3 million active galactic nuclei (AGNs) over the entire sky, with only patchy and inhomogeneous ancillary data available. While we have a good understanding of the photo-z quality obtainable for AGN using the spectral energy distribution (SED)-fitting technique, we tested the capability of machine learning (ML), usually reliable in computing photo-z for QSO in wide and shallow areas with rich spectroscopic samples. Using MLPQNA as an example of ML, we computed photo-z for the X-ray-selected sources in Stripe 82X, using the publicly available photometric and spectroscopic catalogues. Stripe 82X is at least as deep as eROSITA will be and wide enough to also include rare and bright AGNs. In addition, the availability of ancillary data mimics what can be available in the whole sky. We found that when optical, and near- and mid-infrared data are available, ML and SED fitting perform comparably well in terms of overall accuracy, realistic redshift probability density functions, and fraction of outliers, although they are not the same for the two methods. The results could further improve if the available photometry is accurate and includes morphological information. Assuming that we can gather sufficient spectroscopy to build a representative training sample, with the current photometry coverage we can obtain reliable photo-z for a large fraction of sources in the Southern hemisphere well before the spectroscopic follow-up, thus timely enabling the eROSITA science return. The photo-z catalogue is released here.
[18]  oai:arXiv.org:1902.02522  [pdf] - 1895994
Star Formation Rates for photometric samples of galaxies using machine learning methods
Comments:
Submitted: 2019-02-07, last modified: 2019-06-06
Star Formation Rates or SFRs are crucial to constrain theories of galaxy formation and evolution. SFRs are usually estimated via spectroscopic observations requiring large amounts of telescope time. We explore an alternative approach based on the photometric estimation of global SFRs for large samples of galaxies, by using methods such as automatic parameter space optimisation and supervised Machine Learning models. We demonstrate that, with such an approach, accurate multi-band photometry allows reliable SFRs to be estimated. We also investigate how the use of photometric rather than spectroscopic redshifts affects the accuracy of derived global SFRs. Finally, we provide a publicly available catalogue of SFRs for more than 27 million galaxies extracted from the Sloan Digital Sky Survey Data Release 7. The catalogue is available through the VizieR facility at the following link ftp://cdsarc.u-strasbg.fr/pub/cats/J/MNRAS/486/1377.
[19]  oai:arXiv.org:1812.03084  [pdf] - 1863913
Catalog of quasars from the Kilo-Degree Survey Data Release 3
Comments: Data available from the KiDS website at http://kids.strw.leidenuniv.nl/DR3/quasarcatalog.php and the source code from https://github.com/snakoneczny/kids-quasars
Submitted: 2018-12-07, last modified: 2019-04-09
We present a catalog of quasars selected from broad-band photometric ugri data of the Kilo-Degree Survey Data Release 3 (KiDS DR3). The QSOs are identified by the random forest (RF) supervised machine learning model, trained on SDSS DR14 spectroscopic data. We first cleaned the input KiDS data from entries with excessively noisy, missing or otherwise problematic measurements. Applying a feature importance analysis, we then tune the algorithm and identify in the KiDS multiband catalog the 17 most useful features for the classification, namely magnitudes, colors, magnitude ratios, and the stellarity index. We used the t-SNE algorithm to map the multi-dimensional photometric data onto 2D planes and compare the coverage of the training and inference sets. We limited the inference set to r<22 to avoid extrapolation beyond the feature space covered by training, as the SDSS spectroscopic sample is considerably shallower than KiDS. This gives 3.4 million objects in the final inference sample, from which the random forest identified 190,000 quasar candidates. Accuracy of 97%, purity of 91%, and completeness of 87%, as derived from a test set extracted from SDSS and not used in the training, are confirmed by comparison with external spectroscopic and photometric QSO catalogs overlapping with the KiDS footprint. The robustness of our results is strengthened by number counts of the quasar candidates in the r band, as well as by their mid-infrared colors available from WISE. An analysis of parallaxes and proper motions of our QSO candidates found also in Gaia DR2 suggests that a probability cut of p(QSO)>0.8 is optimal for purity, whereas p(QSO)>0.7 is preferable for better completeness. Our study presents the first comprehensive quasar selection from deep high-quality KiDS data and will serve as the basis for versatile studies of the QSO population detected by this survey.
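The trade-off between the p(QSO)>0.8 and p(QSO)>0.7 cuts quoted in the abstract above can be illustrated with a small purity and completeness calculation. This is a minimal sketch on invented data: the probabilities and truth labels below are made up, and the function name `purity_completeness` is hypothetical rather than part of the released kids-quasars code.

```python
def purity_completeness(p_qso, is_qso, threshold):
    """Purity and completeness of the sample selected with p_qso > threshold.

    purity       = true quasars among selected / all selected
    completeness = true quasars among selected / all true quasars
    """
    selected = [truth for p, truth in zip(p_qso, is_qso) if p > threshold]
    true_pos = sum(selected)
    n_true = sum(is_qso)
    purity = true_pos / len(selected) if selected else float("nan")
    completeness = true_pos / n_true if n_true else float("nan")
    return purity, completeness

# Invented classifier outputs and truth labels, for illustration only.
p_qso = [0.95, 0.85, 0.75, 0.72, 0.60, 0.40, 0.90, 0.78]
is_qso = [True, True, True, False, True, False, True, False]

strict = purity_completeness(p_qso, is_qso, 0.8)
loose = purity_completeness(p_qso, is_qso, 0.7)
print("p > 0.8:", strict)
print("p > 0.7:", loose)
```

On this toy sample the stricter cut gives the purer selection and the looser cut the more complete one, which is the qualitative behaviour the abstract describes.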
[20]  oai:arXiv.org:1902.05188  [pdf] - 1958084
A Comparison of Photometric Redshift Techniques for Large Radio Surveys
Comments: Submitted to PASP
Submitted: 2019-02-13
Future radio surveys will generate catalogues of tens of millions of radio sources, for which redshift estimates will be essential to achieve many of the science goals. However, spectroscopic data will be available for only a small fraction of these sources, and in most cases even the optical and infrared photometry will be of limited quality. Furthermore, radio sources tend to be at higher redshift than most optical sources, and so a significant fraction of radio source hosts differ from those for which most photometric redshift templates are designed. We therefore need to develop new techniques for estimating the redshifts of radio sources. As a starting point in this process, we evaluate a number of machine-learning techniques for estimating redshift, together with a conventional template-fitting technique. We pay special attention to how the performance is affected by the incompleteness of the training sample and by sparseness of the parameter space or by limited availability of ancillary multi-wavelength data. As expected, we find that the quality of the photometric redshifts degrades as the quality of the photometry decreases, but that even with the limited quality of photometry available for all-sky surveys, useful redshift information is available for the majority of sources, particularly at low redshift. We find that a template-fitting technique performs best with high-quality and almost complete multi-band photometry, especially if radio sources that are also X-ray emitting are treated separately. When we reduced the quality of photometry to match that available for the EMU all-sky radio survey, the quality of the template-fitting degraded and became comparable to some of the machine learning methods. Machine learning techniques currently perform better at low redshift than at high redshift, because of incompleteness of the currently available training data at high redshifts.
[21]  oai:arXiv.org:1805.06338  [pdf] - 1820122
Stellar formation rates in galaxies using Machine Learning models
Comments: ESANN 2018 - Proceedings, ISBN-13 9782875870483
Submitted: 2018-05-16, last modified: 2019-01-23
Global Stellar Formation Rates or SFRs are crucial to constrain theories of galaxy formation and evolution. SFRs are usually estimated via spectroscopic observations which require too much precious telescope time and therefore cannot match the needs of modern precision cosmology. We therefore propose a novel method to estimate SFRs for large samples of galaxies using a variety of supervised ML models.
[22]  oai:arXiv.org:1810.09777  [pdf] - 1774830
Statistical analysis of probability density functions for photometric redshifts through the KiDS-ESO-DR3 galaxies
Comments: Accepted for publication by MNRAS, 20 pages, 14 figures
Submitted: 2018-10-23
Despite the high accuracy of photometric redshifts (zphot) derived using Machine Learning (ML) methods, the quantification of errors through reliable and accurate Probability Density Functions (PDFs) is still an open problem. First, because it is difficult to accurately assess the contribution from different sources of errors, namely internal to the method itself and from the photometric features defining the available parameter space. Second, because the problem of defining a robust statistical method, always able to quantify and qualify the PDF estimation validity, is still an open issue. We present a comparison among PDFs obtained using three different methods on the same data set: two ML techniques, METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts) and ANNz2, plus the spectral energy distribution template fitting method, BPZ. The photometric data were extracted from the KiDS (Kilo Degree Survey) ESO Data Release 3, while the spectroscopy was obtained from the GAMA (Galaxy and Mass Assembly) Data Release 2. The statistical evaluation of both individual and stacked PDFs was done through quantitative and qualitative estimators, including a dummy PDF, useful to verify whether different statistical estimators can correctly assess PDF quality. We conclude that, in order to quantify the reliability and accuracy of any zphot PDF method, a combined set of statistical estimators is required.
[23]  oai:arXiv.org:1806.01307  [pdf] - 1755932
The first sample of spectroscopically confirmed ultra-compact massive galaxies in the Kilo Degree Survey
Comments: Accepted for publication in MNRAS, 27 pages, 13 figures, 7 tables. This revised and improved version presents different updates. In particular, systematics and uncertainties in the measurement of the effective radii are now better discussed, and new plots are added
Submitted: 2018-06-04, last modified: 2018-09-18
We present results from an ongoing investigation using the Kilo Degree Survey (KiDS) on the VLT Survey Telescope (VST) to provide a census of ultra-compact massive galaxies (UCMGs), defined as galaxies with stellar masses $M_{\rm \star} > 8 \times 10^{10} \rm M_{\odot}$ and effective radii $R_{\rm e} < 1.5\,\rm kpc$. UCMGs, which are expected to have undergone very few merger events, provide a unique view on the accretion history of the most massive galaxies in the Universe. Over an effective sky area of nearly 330 square degrees, we select UCMG candidates from KiDS multi-colour images, which provide high quality structural parameters, photometric redshifts and stellar masses. Our sample of $\sim 1000$ photometrically selected UCMGs at $z < 0.5$ represents the largest sample of UCMG candidates assembled to date over the largest sky area. In this paper we present the first effort to obtain their redshifts using different facilities, starting with first results for 28 candidates with redshifts $z < 0.5$, obtained at NTT and TNG telescopes. We confirmed, as bona fide UCMGs, 19 out of the 28 candidates with new redshifts. A further 46 UCMG candidates are confirmed with literature spectroscopic redshifts (35 at $z < 0.5$), bringing the final cumulative sample of spectroscopically-confirmed lower-z UCMGs to 54 galaxies, which is the largest sample at redshifts below $0.5$. We use these spectroscopic redshifts to quantify systematic errors in our photometric selection, and use these to correct our UCMG number counts. We finally compare the results to independent datasets and simulations.
[24]  oai:arXiv.org:1807.07723  [pdf] - 1725091
Vialactea Visual Analytics tool for Star Formation studies of the Galactic Plane
Comments:
Submitted: 2018-07-20
We present a visual analytics tool, based on the VisIVO suite, to exploit a combination of all new-generation surveys of the Galactic Plane to study the star formation process of the Milky Way. The tool has been developed within the VIALACTEA project, funded by the 7th Framework Programme of the European Union, which creates a common forum for the major new-generation surveys of the Milky Way Galactic Plane from the near infrared to the radio, both in thermal continuum and molecular lines. Massive volumes of data are produced by space missions and ground-based facilities, and the ability to collect and store them is increasing at a higher pace than the ability to analyze them. This gap leads to new challenges in the analysis pipeline to discover information contained in the data. Visual analytics focuses on handling these massive, heterogeneous, and dynamic volumes of information, accessing the data previously processed by data mining algorithms and advanced analysis techniques with highly interactive visual interfaces that offer scientists the opportunity for in-depth understanding of massive, noisy, and high-dimensional data.
[25]  oai:arXiv.org:1802.07683  [pdf] - 1715967
Data Deluge in Astrophysics: Photometric Redshifts as a Template Use Case
Comments: 13 pages, 3 figures, Springer's Communications in Computer and Information Science (CCIS), Vol. 822
Submitted: 2018-02-21, last modified: 2018-07-16
Astronomy has entered the big data era and Machine Learning based methods have found widespread use in a large variety of astronomical applications. This is demonstrated by the recent huge increase in the number of publications making use of this new approach. The usage of machine learning methods, however, is still far from trivial and many problems still need to be solved. Using the evaluation of photometric redshifts as a case study, we outline the main problems and some ongoing efforts to solve them.
[26]  oai:arXiv.org:1807.06085  [pdf] - 1719591
Evolution of galaxy size--stellar mass relation from the Kilo Degree Survey
Comments: accepted by MNRAS
Submitted: 2018-07-16
We have obtained structural parameters of about 340,000 galaxies from the Kilo Degree Survey (KiDS) in 153 square degrees of data releases 1, 2 and 3. We have performed a seeing-convolved 2D single Sérsic fit to the galaxy images in the 4 photometric bands (u, g, r, i) observed by KiDS, by selecting high signal-to-noise ratio (S/N > 50) systems in every band. We have classified galaxies as spheroids and disc-dominated by combining their spectral energy distribution properties and their Sérsic index. Using photometric redshifts derived from a machine learning technique, we have determined the evolution of the effective radius, $R_{\rm e}$, and stellar mass, $M_*$, versus redshift, for both mass complete samples of spheroids and disc-dominated galaxies up to z ~ 0.6. Our results show a significant evolution of the structural quantities at intermediate redshift for the massive spheroids ($\mbox{Log}\ M_*/M_\odot>11$, Chabrier IMF), while almost no evolution has been found for less massive ones ($\mbox{Log}\ M_*/M_\odot < 11$). On the other hand, disc-dominated systems show a milder evolution in the less massive systems ($\mbox{Log}\ M_*/M_\odot < 11$) and possibly no evolution of the more massive systems. These trends are generally consistent with predictions from hydrodynamical simulations and independent datasets out to redshift z ~ 0.6, although in some cases the scatter of the data is too large to draw final conclusions. These results, based on 1/10 of the expected KiDS area, reinforce previous findings based on smaller statistical samples and show the route toward more accurate results, expected with the next survey releases.
[27]  oai:arXiv.org:1802.10282  [pdf] - 1699787
Weak Lensing Study in VOICE Survey I: Shear Measurement
Comments: 15 pages, 16 figures, 4 tables. MNRAS Accepted
Submitted: 2018-02-28, last modified: 2018-06-13
The VST Optical Imaging of the CDFS and ES1 Fields (VOICE) Survey is a Guaranteed Time program carried out with the ESO/VST telescope to provide deep optical imaging over two 4 deg$^2$ patches of the sky centred on the CDFS and ES1 pointings. We present the cosmic shear measurement over the 4 deg$^2$ covering the CDFS region in the $r$-band using LensFit. Each of the four tiles of 1 deg$^2$ has more than one hundred exposures, of which more than 50 exposures passed a series of image quality selection criteria for weak lensing study. The $5\sigma$ limiting magnitude in $r$-band is 26.1 for point sources, which is $\sim$1 mag deeper than other weak lensing surveys in the literature (e.g. the Kilo Degree Survey, KiDS, at VST). The photometric redshifts are estimated using the VOICE $u,g,r,i$ together with near-infrared VIDEO data $Y,J,H,K_s$. The mean redshift of the shear catalogue is 0.87, considering the shear weight. The effective galaxy number density is 16.35 gal/arcmin$^2$, which is nearly twice that of KiDS. The performance of LensFit on such a deep dataset was calibrated using VOICE-like mock image simulations. Furthermore, we have analyzed the reliability of the shear catalogue by calculating the star-galaxy cross-correlations, the tomographic shear correlations of two redshift bins and the contaminations of the blended galaxies. As a further sanity check, we have constrained cosmological parameters by exploring the parameter space with Population Monte Carlo sampling. For a flat $\Lambda$CDM model we have obtained $\Sigma_8$ = $\sigma_8(\Omega_m/0.3)^{0.5}$ = $0.68^{+0.11}_{-0.15}$.
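The combination $\sigma_8(\Omega_m/0.3)^{0.5}$ quoted at the end of the abstract above can be made concrete with a few lines of code. The sketch below only evaluates that formula and its inverse. The function names are hypothetical, the input values other than the central 0.68 are arbitrary illustrations, and nothing here reproduces the paper's Population Monte Carlo analysis.

```python
def sigma8_combination(sigma_8, omega_m, pivot=0.3, alpha=0.5):
    """Return sigma_8 * (omega_m / pivot) ** alpha, the combination quoted in the abstract."""
    return sigma_8 * (omega_m / pivot) ** alpha

def sigma8_on_contour(combination, omega_m, pivot=0.3, alpha=0.5):
    """Invert the relation: the sigma_8 that yields the given combination at this omega_m."""
    return combination / (omega_m / pivot) ** alpha

# Arbitrary omega_m values, chosen only to show the degeneracy direction:
# many (omega_m, sigma_8) pairs share the same value of the combination.
for omega_m in (0.2, 0.3, 0.4):
    s8 = sigma8_on_contour(0.68, omega_m)
    print(f"omega_m={omega_m:.1f}  sigma_8={s8:.3f}  "
          f"combination={sigma8_combination(s8, omega_m):.3f}")
```

A lower matter density requires a higher $\sigma_8$ to give the same combination, which is why cosmic shear analyses commonly quote the combination rather than either parameter alone.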
[28]  oai:arXiv.org:1709.04205  [pdf] - 1736187
Photometric redshifts for the Kilo-Degree Survey. Machine-learning analysis with artificial neural networks
Comments: A&A, in press. Data available from the KiDS website http://kids.strw.leidenuniv.nl/DR3/ml-photoz.php#annz2
Submitted: 2017-09-13, last modified: 2018-05-11
We present a machine-learning photometric redshift analysis of the Kilo-Degree Survey Data Release 3, using two neural-network based techniques: ANNz2 and MLPQNA. Despite limited coverage of spectroscopic training sets, these ML codes provide photo-zs of quality comparable to, if not better than, those from the BPZ code, at least up to zphot<0.9 and r<23.5. At the bright end of r<20, where very complete spectroscopic data overlapping with KiDS are available, the performance of the ML photo-zs clearly surpasses that of BPZ, currently the primary photo-z method for KiDS. Using the Galaxy And Mass Assembly (GAMA) spectroscopic survey as calibration, we furthermore study how photo-zs improve for bright sources when photometric parameters additional to magnitudes are included in the photo-z derivation, as well as when VIKING and WISE infrared bands are added. While the fiducial four-band ugri setup gives a photo-z bias $\delta z=-2e-4$ and scatter $\sigma_z<0.022$ at mean z = 0.23, combining magnitudes, colours, and galaxy sizes reduces the scatter by ~7% and the bias by an order of magnitude. Once the ugri and IR magnitudes are joined into 12-band photometry spanning up to 12 $\mu$m, the scatter decreases by more than 10% over the fiducial case. Finally, using the 12 bands together with optical colours and linear sizes gives $\delta z<4e-5$ and $\sigma_z<0.019$. This paper also serves as a reference for two public photo-z catalogues accompanying KiDS DR3, both obtained using the ANNz2 code. The first one, of general purpose, includes all the 39 million KiDS sources with four-band ugri measurements in DR3. The second dataset, optimized for low-redshift studies such as galaxy-galaxy lensing, is limited to r<20, and provides photo-zs of much better quality than in the full-depth case thanks to incorporating optical magnitudes, colours, and sizes in the GAMA-calibrated photo-z derivation.
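Bias and scatter figures such as those quoted in the abstract above are summary statistics of the photo-z residuals against spectroscopic redshifts. The sketch below shows one common way to compute them. The conventions used are assumptions, not necessarily the paper's definitions: residuals are normalised by (1 + z_spec), the bias is their mean, the scatter is the scaled median absolute deviation, and outliers are residuals beyond 0.15. The input arrays are invented.

```python
import numpy as np

def photoz_stats(z_phot, z_spec, outlier_cut=0.15):
    """Summary statistics of photo-z residuals (assumed conventions, see lead-in)."""
    z_phot = np.asarray(z_phot, dtype=float)
    z_spec = np.asarray(z_spec, dtype=float)
    dz = (z_phot - z_spec) / (1.0 + z_spec)           # normalised residuals
    bias = float(dz.mean())                           # mean residual
    # Scaled median absolute deviation: robust estimate of the residual scatter
    scatter = float(1.4826 * np.median(np.abs(dz - np.median(dz))))
    outlier_fraction = float(np.mean(np.abs(dz) > outlier_cut))
    return {"bias": bias, "scatter": scatter, "outlier_fraction": outlier_fraction}

# Invented redshifts, for illustration only.
z_spec = [0.10, 0.20, 0.30, 0.40]
perfect = photoz_stats(z_spec, z_spec)                    # photo-z equal to spec-z
one_bad = photoz_stats([0.10, 0.20, 0.30, 1.00], z_spec)  # one catastrophic failure
print(perfect)
print(one_bad)
```

The second call shows why a robust scatter estimate is often preferred: a single catastrophic failure shifts the mean bias and the outlier fraction but leaves the scaled median absolute deviation unchanged.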
[29]  oai:arXiv.org:1802.08086  [pdf] - 1714779
Neural Gas based classification of Globular Clusters
Comments: 15 pages, 3 figures, to appear in the Volume of Springer Communications in Computer and Information Science (CCIS). arXiv admin note: substantial text overlap with arXiv:1710.03900
Submitted: 2018-02-21
Within scientific and real life problems, classification is a typical case of extremely complex tasks in data-driven scenarios, especially if approached with traditional techniques. Machine Learning supervised and unsupervised paradigms, providing self-adaptive and semi-automatic methods, are able to navigate large volumes of data characterized by a multi-dimensional parameter space, thus representing an ideal method to disentangle classes of objects in a reliable and efficient way. In Astrophysics, the identification of candidate Globular Clusters through deep, wide-field, single band images is one such case where self-adaptive methods have demonstrated high performance and reliability. Here we experimented with some variants of the known Neural Gas model, exploring both supervised and unsupervised paradigms of Machine Learning for the classification of Globular Clusters. The main scope of this work was to verify the possibility of improving the computational efficiency of the methods used to solve complex data-driven problems, by exploiting parallel programming with the GPU framework. By using the astrophysical playground, the goal was to scientifically validate this kind of model for further applications extended to other contexts.
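For readers unfamiliar with the Neural Gas model referred to in the abstract above, the following is a minimal sketch of its basic unsupervised form: for each sample, all prototypes are ranked by distance and each is pulled toward the sample with a strength that decays exponentially with its rank. It is a plain single-threaded NumPy illustration, not the GPU-parallel variants studied in the paper. The function names, the annealing schedule, the hyperparameter values and the two synthetic blobs are all assumptions chosen for the example.

```python
import numpy as np

def neural_gas(data, n_units=2, n_epochs=30, eps=(0.5, 0.01), lam=(1.0, 0.01), seed=0):
    """Fit Neural Gas prototypes with a rank-based soft competitive update."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Initialise prototypes on randomly chosen data points
    w = data[rng.choice(len(data), size=n_units, replace=False)].copy()
    t_max = n_epochs * len(data)
    t = 0
    for _ in range(n_epochs):
        for x in data[rng.permutation(len(data))]:
            frac = t / t_max
            eps_t = eps[0] * (eps[1] / eps[0]) ** frac   # annealed learning rate
            lam_t = lam[0] * (lam[1] / lam[0]) ** frac   # annealed neighbourhood range
            dist = np.linalg.norm(w - x, axis=1)
            ranks = np.argsort(np.argsort(dist))         # rank 0 = closest prototype
            w += eps_t * np.exp(-ranks / lam_t)[:, None] * (x - w)
            t += 1
    return w

def assign(data, prototypes):
    """Label each sample with the index of its nearest prototype."""
    d = np.linalg.norm(np.asarray(data, dtype=float)[:, None, :] - prototypes[None, :, :], axis=2)
    return d.argmin(axis=1)

# Two synthetic, well-separated blobs standing in for two classes of objects.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[10.0, 10.0], scale=0.5, size=(50, 2))
data = np.vstack([blob_a, blob_b])

prototypes = neural_gas(data, n_units=2)
labels = assign(data, prototypes)
print(prototypes)
```

Because every prototype is updated according to its rank rather than only the winner, the result is less sensitive to initialisation than a plain winner-takes-all scheme, which is one reason the model is used as a clustering engine.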
[30]  oai:arXiv.org:1710.09585  [pdf] - 1622307
Euclid: Superluminous supernovae in the Deep Survey
Comments: Paper accepted by A&A, abstract abridged. This paper is published on behalf of the Euclid Consortium
Submitted: 2017-10-26
In the last decade, astronomers have found a new type of supernova called 'superluminous supernovae' (SLSNe) due to their high peak luminosity and long light-curves. These hydrogen-free explosions (SLSNe-I) can be seen to z~4 and therefore offer the possibility of probing the distant Universe. We aim to investigate the possibility of detecting SLSNe-I using ESA's Euclid satellite, scheduled for launch in 2020. In particular, we study the Euclid Deep Survey (EDS) which will provide a unique combination of area, depth and cadence over the mission. We estimated the redshift distribution of Euclid SLSNe-I using the latest information on their rates and spectral energy distribution, as well as known Euclid instrument and survey parameters, including the cadence and depth of the EDS. We also applied a standardization method to the peak magnitudes to create a simulated Hubble diagram to explore possible cosmological constraints. We show that Euclid should detect approximately 140 high-quality SLSNe-I to z ~ 3.5 over the first five years of the mission (with an additional 70 if we lower our photometric classification criteria). This sample could revolutionize the study of SLSNe-I at z>1 and open up their use as probes of star-formation rates, galaxy populations, the interstellar and intergalactic medium. In addition, a sample of such SLSNe-I could improve constraints on a time-dependent dark energy equation-of-state, namely w(a), when combined with local SLSNe-I and the expected SN Ia sample from the Dark Energy Survey. We show that Euclid will observe hundreds of SLSNe-I for free. These luminous transients will be in the Euclid data-stream and we should prepare now to identify them as they offer a new probe of the high-redshift Universe for both astrophysics and cosmology.
[31]  oai:arXiv.org:1710.03900  [pdf] - 1589564
Astrophysical Data Analytics based on Neural Gas Models, using the Classification of Globular Clusters as Playground
Comments: Proceedings of the XIX International Conference "Data Analytics and Management in Data Intensive Domains" (DAMDID/RCDL 2017), Moscow, Russia, October 10-13, 2017, 8 pages, 4 figures
Submitted: 2017-10-11
In Astrophysics, the identification of candidate Globular Clusters through deep, wide-field, single band HST images is a typical data analytics problem, where methods based on Machine Learning have shown high efficiency and reliability, demonstrating the capability to improve on traditional approaches. Here we experimented with some variants of the known Neural Gas model, exploring both supervised and unsupervised paradigms of Machine Learning, on the classification of Globular Clusters extracted from the NGC1399 HST data. The main focus of this work was to use a well-tested playground to scientifically validate this kind of model for further extended experiments in astrophysics, using other standard Machine Learning methods (for instance Random Forest and Multi Layer Perceptron neural network) for a comparison of performance in terms of purity and completeness.
[32]  oai:arXiv.org:1706.03501  [pdf] - 1584534
Probability density estimation of photometric redshifts based on machine learning
Comments: 2016 IEEE Symposium Series on Computational Intelligence, SSCI 2016 7849953
Submitted: 2017-06-12
Photometric redshifts (photo-z's) provide an alternative way to estimate the distances of large samples of galaxies and are therefore crucial to a large variety of cosmological problems. Among the various methods proposed over the years, supervised machine learning (ML) methods capable of interpolating the knowledge gained by means of spectroscopic data have proven to be very effective. METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts) is a novel method designed to provide a reliable PDF (Probability Density Function) of the error distribution of photometric redshifts predicted by ML methods. The method is implemented as a modular workflow, whose internal engine for photo-z estimation makes use of the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility to easily replace the specific machine learning model chosen to predict photo-z's. After a short description of the software, we present a summary of results on public galaxy data (Sloan Digital Sky Survey - Data Release 9) and a comparison with a completely different method based on Spectral Energy Distribution (SED) template fitting.
[33]  oai:arXiv.org:1706.01046  [pdf] - 1584216
The Hi-GAL compact source catalogue. I. The physical properties of the clumps in the inner Galaxy ($-71.0^{\circ}< \ell < 67.0^{\circ}$)
Comments: Accepted by MNRAS
Submitted: 2017-06-04
Hi-GAL is a large-scale survey of the Galactic plane, performed with Herschel in five infrared continuum bands between 70 and 500 $\mu$m. We present a band-merged catalogue of spatially matched sources and their properties derived from fits to the spectral energy distributions (SEDs) and heliocentric distances, based on the photometric catalogs presented in Molinari et al. (2016a), covering the portion of Galactic plane $-71.0^{\circ}< \ell < 67.0^{\circ}$. The band-merged catalogue contains 100922 sources with a regular SED, 24584 of which show a 70 $\mu$m counterpart and are thus considered proto-stellar, while the remainder are considered starless. Thanks to this huge number of sources, we are able to carry out a preliminary analysis of early stages of star formation, identifying the conditions that characterise different evolutionary phases on a statistically significant basis. We calculate surface densities to investigate the gravitational stability of clumps and their potential to form massive stars. We also explore evolutionary status metrics such as the dust temperature, luminosity and bolometric temperature, finding that these are higher in proto-stellar sources compared to pre-stellar ones. The surface density of sources follows an increasing trend as they evolve from pre-stellar to proto-stellar, but then it is found to decrease again in the majority of the most evolved clumps. Finally, we study the physical parameters of sources with respect to Galactic longitude and the association with spiral arms, finding only minor or no differences between the average evolutionary status of sources in the fourth and first Galactic quadrants, or between "on-arm" and "inter-arm" positions.
[34]  oai:arXiv.org:1703.02991  [pdf] - 1581832
The third data release of the Kilo-Degree Survey and associated data products
Comments: small modifications; 27 pages, 12 figures, accepted for publication in Astronomy & Astrophysics
Submitted: 2017-03-08, last modified: 2017-05-21
The Kilo-Degree Survey (KiDS) is an ongoing optical wide-field imaging survey with the OmegaCAM camera at the VLT Survey Telescope. It aims to image 1500 square degrees in four filters (ugri). The core science driver is mapping the large-scale matter distribution in the Universe, using weak lensing shear and photometric redshift measurements. Further science cases include galaxy evolution, Milky Way structure, detection of high-redshift clusters, and finding rare sources such as strong lenses and quasars. Here we present the third public data release (DR3) and several associated data products, adding further area, homogenized photometric calibration, photometric redshifts and weak lensing shear measurements to the first two releases. A dedicated pipeline embedded in the Astro-WISE information system is used for the production of the main release. Modifications with respect to earlier releases are described in detail. Photometric redshifts have been derived using both Bayesian template fitting, and machine-learning techniques. For the weak lensing measurements, optimized procedures based on the THELI data reduction and lensfit shear measurement packages are used. In DR3 stacked ugri images, weight maps, masks, and source lists for 292 new survey tiles (~300 sq.deg) are made available. The multi-band catalogue, including homogenized photometry and photometric redshifts, covers the combined DR1, DR2 and DR3 footprint of 440 survey tiles (447 sq.deg). Limiting magnitudes are typically 24.3, 25.1, 24.9, 23.8 (5 sigma in a 2 arcsec aperture) in ugri, respectively, and the typical r-band PSF size is less than 0.7 arcsec. The photometric homogenization scheme ensures accurate colors and an absolute calibration stable to ~2% for gri and ~3% in u. Separately released are a weak lensing shear catalogue and photometric redshifts based on two different machine-learning techniques.
[35]  oai:arXiv.org:1704.01495  [pdf] - 1558960
The VOICE Survey : VST Optical Imaging of the CDFS and ES1 Fields
Comments: Proceedings of the 4th Annual Conference on High Energy Astrophysics in Southern Africa, 25-27 August, 2016, Cape Town, South Africa
Submitted: 2017-04-05
We present the VST Optical Imaging of the CDFS and ES1 Fields (VOICE) Survey, a VST INAF Guaranteed Time program designed to provide optical coverage of two 4 deg$^2$ cosmic windows in the Southern hemisphere. VOICE provides the first, multi-band deep optical imaging of these sky regions, thus complementing and enhancing the rich legacy of longer-wavelength surveys with VISTA, Spitzer, Herschel and ATCA available in these areas and paving the way for upcoming observations with facilities such as the LSST, MeerKAT and the SKA. VOICE exploits VST's OmegaCAM optical imaging capabilities and completes the reduction of WFI data available within the ES1 fields as part of the ESO-Spitzer Imaging Extragalactic Survey (ESIS) program providing $ugri$ and $uBVR$ coverage of 4 and 4 deg$^2$ areas within the CDFS and ES1 field respectively. We present the survey's science rationale and observing strategy, the data reduction and multi-wavelength data fusion pipeline. Survey data products and their future updates will be released at http://www.mattiavaccari.net/voice/ and on CDS/VizieR.
[36]  oai:arXiv.org:1703.02300  [pdf] - 1581789
$C^{3}$ : A Command-line Catalogue Cross-matching tool for modern astrophysical survey data
Comments: 6 pages, 4 figures, proceedings of the IAU-325 symposium on Astroinformatics, Cambridge University press
Submitted: 2017-03-07
In the current era of data-driven science, data analysis techniques must evolve quickly to cope with data volumes that have grown to the Petabyte scale. In particular, since modern astrophysics is based on multi-wavelength data organized into large catalogues, it is crucial that astronomical catalogue cross-matching methods, whose performance depends strongly on catalogue size, ensure efficiency, reliability and scalability. Furthermore, multi-band data are archived and reduced in different ways, so the resulting catalogues may differ from each other in format, resolution, data structure, etc., thus requiring the highest generality of cross-matching features. We present $C^{3}$ (Command-line Catalogue Cross-match), a multi-platform application designed to efficiently cross-match massive catalogues from modern surveys. Conceived as a stand-alone command-line process or a module within a generic data reduction/analysis pipeline, it provides maximum flexibility in terms of portability, configuration, coordinates and cross-matching types, ensuring high performance by using a multi-core parallel processing paradigm and a sky partitioning algorithm.
[37]  oai:arXiv.org:1703.02292  [pdf] - 1581787
METAPHOR: Probability density estimation for machine learning based photometric redshifts
Comments: proceedings of the International Astronomical Union, IAU-325 symposium, Cambridge University press
Submitted: 2017-03-07
We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method able to provide a reliable PDF for photometric galaxy redshifts estimated through empirical techniques. METAPHOR is a modular workflow, mainly based on the MLPQNA neural network as the internal engine to derive photometric galaxy redshifts, but allowing MLPQNA to be easily replaced with any other method to predict photo-z's and their PDF. We present here the results of a validation test of the workflow on galaxies from SDSS-DR9, also showing the universality of the method by replacing MLPQNA with KNN and Random Forest models. The validation test also includes a comparison with the PDFs derived from a traditional SED template fitting method (Le Phare).
[38]  oai:arXiv.org:1701.08120  [pdf] - 1581300
Cooperative photometric redshift estimation
Comments: 6 pages, 1 figure, proceedings of the International Astronomical Union, IAU-325 symposium, Cambridge University press
Submitted: 2017-01-27
In modern galaxy surveys, photometric redshifts play a central role in a broad range of studies, from gravitational lensing and dark matter distribution to galaxy evolution. Using a dataset of about 25,000 galaxies from the second data release of the Kilo Degree Survey (KiDS) we obtain photometric redshifts with five different methods: (i) Random forest, (ii) Multi Layer Perceptron with Quasi Newton Algorithm, (iii) Multi Layer Perceptron with an optimization network based on the Levenberg-Marquardt learning rule, (iv) the Bayesian Photometric Redshift model (or BPZ) and (v) a classical SED template fitting procedure (Le Phare). We show how SED fitting techniques can provide useful information on the galaxy spectral type, which can be used to improve the capability of machine learning methods, constraining systematic errors and reducing the occurrence of catastrophic outliers. We use this classification to train specialized regression estimators, demonstrating that such a hybrid approach, involving SED fitting and machine learning in a single collaborative framework, is capable of improving the overall prediction accuracy of photometric redshifts.
[39]  oai:arXiv.org:1701.08158  [pdf] - 1581303
The Euclid Data Processing Challenges
Comments: 10 pages, 4 figures, IAU Symposium 325 on Astroinformatics
Submitted: 2017-01-27
Euclid is a Europe-led cosmology space mission dedicated to a visible and near infrared survey of the entire extra-galactic sky. Its purpose is to deepen our knowledge of the dark content of our Universe. After an overview of the Euclid mission and science, this contribution describes how the community is getting organized to face the data analysis challenges, both in software development and in operational data processing matters. It ends with a more specific account of some of the main contributions of the Swiss Science Data Center (SDC-CH).
[40]  oai:arXiv.org:1612.02173  [pdf] - 1533068
A cooperative approach among methods for photometric redshifts estimation: an application to KiDS data
Comments: Accepted by MNRAS, 17 pages, 11 figures
Submitted: 2016-12-07
Photometric redshifts (photo-z's) are fundamental in galaxy surveys to address different topics, from gravitational lensing and dark matter distribution to galaxy evolution. The Kilo Degree Survey (KiDS), i.e. the ESO public survey on the VLT Survey Telescope (VST), provides the unprecedented opportunity to exploit a large galaxy dataset with an exceptional image quality and depth in the optical wavebands. Using a KiDS subset of about 25,000 galaxies with measured spectroscopic redshifts, we have derived photo-z's using i) three different empirical methods based on supervised machine learning, ii) the Bayesian Photometric Redshift model (or BPZ), and iii) a classical SED template fitting procedure (Le Phare). We confirm that, in the regions of the photometric parameter space properly sampled by the spectroscopic templates, machine learning methods provide better redshift estimates, with a lower scatter and a smaller fraction of outliers. SED fitting techniques, however, provide useful information on the galaxy spectral type which can be effectively used to constrain systematic errors and to better characterize potential catastrophic outliers. Such classification is then used to specialize the training of regression machine learning models, by demonstrating that a hybrid approach, involving SED fitting and machine learning in a single collaborative framework, can be effectively used to improve the accuracy of photo-z estimates.
[41]  oai:arXiv.org:1611.04431  [pdf] - 1542796
C3, A Command-line Catalogue Cross-match tool for large astrophysical catalogues
Comments: 18 pages, 9 figures, Accepted for publication in PASP
Submitted: 2016-11-14, last modified: 2016-11-30
Modern Astrophysics is based on multi-wavelength data organized into large and heterogeneous catalogues. Hence, the need for efficient, reliable and scalable catalogue cross-matching methods plays a crucial role in the era of the petabyte scale. Furthermore, multi-band data often have very different angular resolution, requiring the highest generality of cross-matching features, mainly in terms of region shape and resolution. In this work we present $C^{3}$ (Command-line Catalogue Cross-match), a multi-platform application designed to efficiently cross-match massive catalogues. It is based on a multi-core parallel processing paradigm and conceived to be executed as a stand-alone command-line process or integrated within any generic data reduction/analysis pipeline, providing the maximum flexibility to the end-user, in terms of portability, parameter configuration, catalogue formats, angular resolution, region shapes, coordinate units and cross-matching types. Using real data extracted from public surveys, we discuss the cross-matching capabilities and computing time efficiency, also through a direct comparison with some publicly available tools, chosen from among the most widely used within the community and representative of different interface paradigms. We verified that the $C^{3}$ tool has excellent capabilities to perform an efficient and reliable cross-matching between large datasets. Although the elliptical cross-match and the parametric handling of angular orientation and offset are known concepts in the astrophysical context, their availability in the presented command-line tool makes $C^{3}$ competitive in the context of public astronomical tools.
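As an illustration of what a positional cross-match involves, the following is a minimal brute-force sketch in Python. It is not the $C^{3}$ implementation: the function names and the circular matching region are assumptions made for this example, and the multi-core processing, sky partitioning and elliptical regions mentioned in the abstracts are deliberately left out.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cross_match(cat_a, cat_b, radius_arcsec):
    """For each (ra, dec) in cat_a, return the index of the nearest cat_b
    source within radius_arcsec, or None if there is no such source.

    Hypothetical helper: O(len(cat_a) * len(cat_b)), fine only for small inputs.
    """
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for ra_a, dec_a in cat_a:
        best_idx, best_sep = None, radius_deg
        for j, (ra_b, dec_b) in enumerate(cat_b):
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_idx, best_sep = j, sep
        matches.append(best_idx)
    return matches
```

For catalogues of realistic size, the quadratic loop is the part that a sky partitioning scheme would replace, by comparing only sources that fall in the same or neighbouring sky cells.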
[42]  oai:arXiv.org:1611.08494  [pdf] - 1522662
A Command-line Cross-matching tool for modern astrophysical pipelines
Comments: 4 pages, to appear in the Proceedings of ADASS 2016, Astronomical Society of the Pacific (ASP) Conference Series
Submitted: 2016-11-25
The emerging need for efficient, reliable and scalable astronomical catalog cross-matching is becoming more pressing in the current data-driven science era, where the size of data has rapidly increased up to the Petabyte scale. C3 (Command-line Catalogue Cross-matching) is a multi-platform tool designed to efficiently cross-match massive catalogues from modern astronomical surveys, ensuring high-performance capabilities through the use of a multi-core parallel processing paradigm. The tool has been conceived to be executed as a stand-alone command-line process or integrated within any generic data reduction/analysis pipeline, providing the maximum flexibility to the end user, in terms of parameter configuration, coordinates and cross-matching types. In this work we present the architecture and the features of the tool. Moreover, since the modular design of the tool enables an easy customization to specific use cases and requirements, we present also an example of a customized C3 version designed and used in the FP7 project ViaLactea, dedicated to cross-correlate Hi-GAL clumps with multi-band compact sources.
[43]  oai:arXiv.org:1611.08467  [pdf] - 1522660
The design strategy of scientific data quality control software for Euclid mission
Comments: 4 pages, to appear in the Proceedings of ADASS 2016, Astronomical Society of the Pacific (ASP) Conference Series
Submitted: 2016-11-25
The most valuable asset of a space mission like Euclid is its data. Due to their huge volume, automatic quality control becomes a crucial aspect over the entire lifetime of the experiment. Here we focus on the design strategy for the Science Ground Segment (SGS) Data Quality Common Tools (DQCT), whose main role is to provide software solutions to gather, evaluate, and record quality information about the raw and derived data products from a primarily scientific perspective. The SGS DQCT will provide a quantitative basis for evaluating the application of reduction and calibration reference data, as well as diagnostic tools for quality parameters, flags, trend analysis diagrams and any other metadata parameter produced by the pipeline. In a large programme like Euclid, it is prohibitively expensive to process large amounts of data at the pixel level just for the purpose of quality evaluation. Thus, all measures of quality at the pixel level are implemented in the individual pipeline stages, and passed along as metadata in the production. In this sense most of the tasks related to science data quality are delegated to the pipeline stages, even though the responsibility for science data quality is managed at a higher level. The DQCT subsystem of the SGS is currently under development, but its path to full realization will likely be different from that of other subsystems. This is primarily because, owing to a high level of parallelism and to the wide pipeline processing redundancy (for instance, the mechanism of a double Science Data Center for each processing function), the data quality tools must not only be widely spread over all pipeline segments and data levels, but must also minimize the potential diversity of solutions implemented for similar functions, ensuring maximum coherence and standardization for quality evaluation and reporting in the SGS.
[44]  oai:arXiv.org:1611.02162  [pdf] - 1532474
METAPHOR: A machine learning based method for the probability density estimation of photometric redshifts
Comments: Accepted by MNRAS, 17 pages, 16 figures
Submitted: 2016-11-07
A variety of fundamental astrophysical science topics require the determination of very accurate photometric redshifts (photo-z's). A wide variety of methods has been developed, based either on template model fitting or on empirical explorations of the photometric parameter space. Machine learning based techniques are not explicitly dependent on the physical priors and are able to produce accurate photo-z estimations within the photometric ranges derived from the spectroscopic training set. These estimates, however, are not easy to characterize in terms of a photo-z Probability Density Function (PDF), because the analytical relation mapping the photometric parameters onto the redshift space is virtually unknown. We present METAPHOR (Machine-learning Estimation Tool for Accurate PHOtometric Redshifts), a method designed to provide a reliable PDF of the error distribution for empirical techniques. The method is implemented as a modular workflow, whose internal engine for photo-z estimation makes use of the MLPQNA neural network (Multi Layer Perceptron with Quasi Newton learning rule), with the possibility to easily replace the specific machine learning model chosen to predict photo-z's. We present a summary of results on SDSS-DR9 galaxy data, used also to perform a direct comparison with PDFs obtained by the Le Phare SED template fitting. We show that METAPHOR is capable of estimating the precision and reliability of photometric redshifts obtained with three different self-adaptive techniques, i.e. MLPQNA, Random Forest and the standard K-Nearest Neighbors models.
[45]  oai:arXiv.org:1505.06621  [pdf] - 1579647
Machine learning based data mining for Milky Way filamentary structures reconstruction
Comments: Proceeding of WIRN 2015 Conference, May 20-22, Vietri sul Mare, Salerno, Italy. Published in Smart Innovation, Systems and Technology, Springer, ISSN 2190-3018, 9 pages, 4 figures
Submitted: 2015-05-25, last modified: 2016-10-11
We present an innovative method called FilExSeC (Filaments Extraction, Selection and Classification), a data mining tool developed to investigate the possibility to refine and optimize the shape reconstruction of filamentary structures detected with a consolidated method based on the flux derivative analysis, through the column-density maps computed from Herschel infrared Galactic Plane Survey (Hi-GAL) observations of the Galactic plane. The present methodology is based on a feature extraction module followed by a machine learning model (Random Forest) dedicated to select features and to classify the pixels of the input images. From tests on both simulations and real observations the method appears reliable and robust with respect to the variability of shape and distribution of filaments. In the cases of highly defined filament structures, the presented method is able to bridge the gaps among the detected fragments, thus improving their shape reconstruction. From a preliminary "a posteriori" analysis of derived filament physical parameters, the method appears potentially able to add a sufficient contribution to complete and refine the filament reconstruction.
[46]  oai:arXiv.org:1608.04526  [pdf] - 1457620
VIALACTEA knowledge base homogenizing access to Milky Way data
Comments: 11 pages, 1 figure, SPIE Astronomical Telescopes + Instrumentation 2016, Software and Cyberinfrastructure for Astronomy IV, Conference Proceedings
Submitted: 2016-08-16
The VIALACTEA project has a work package dedicated to Tools and Infrastructure and, inside it, a task for the Database and Virtual Observatory Infrastructure. This task aims at providing an infrastructure to store all the resources needed by the more specifically scientific work packages of the project itself. This infrastructure includes a combination of storage facilities, relational databases and web services on top of them, and has taken, as a whole, the name of VIALACTEA Knowledge Base (VLKB). This contribution illustrates the current status of the VLKB. It details the set of data resources put together; describes the database that allows data discovery through VO-inspired metadata maintenance; and illustrates the discovery, cutout and access services built on top of the former two for the users to exploit the data content.
[47]  oai:arXiv.org:1603.00720  [pdf] - 1375586
DAMEWARE - Data Mining & Exploration Web Application Resource
Comments: User Manual of the DAMEWARE Web Application, 51 pages
Submitted: 2016-03-02, last modified: 2016-03-16
Astronomy is undergoing a methodological revolution triggered by an unprecedented wealth of complex and accurate data. DAMEWARE (DAta Mining & Exploration Web Application and REsource) is a general purpose, Web-based, Virtual Observatory compliant, distributed data mining framework specialized in massive data sets exploration with machine learning methods. We present DAMEWARE, which allows the scientific community to perform data mining and exploratory experiments on massive data sets by using a simple web browser. DAMEWARE offers several tools which can be seen as working environments in which to choose data analysis functionalities such as clustering, classification, regression, feature extraction etc., together with models and algorithms.
[48]  oai:arXiv.org:1602.05408  [pdf] - 1360446
PhotoRaptor - Photometric Research Application To Redshifts
Comments: User Manual of the PhotoRaptor tool, 54 pages. arXiv admin note: substantial text overlap with arXiv:1501.06506
Submitted: 2016-02-17
Due to the necessity to evaluate photo-z for a variety of huge sky survey data sets, it seemed important to provide the astronomical community with an instrument able to fill this gap. Besides the problem of moving massive data sets over the network, another critical point is that a great part of astronomical data is stored in private archives that are not fully accessible online. So, in order to evaluate photo-z, a desktop application is needed that can be downloaded and used by everyone locally, i.e. on one's own personal computer or, more generally, within the local intranet hosted by a data center. The name chosen for the application is PhotoRApToR, i.e. Photometric Research Application To Redshift (Cavuoti et al. 2015, 2014; Brescia 2014b). It embeds a machine learning algorithm and special tools dedicated to pre- and post-processing data. The ML model is the MLPQNA (Multi Layer Perceptron trained by the Quasi Newton Algorithm), which has proved particularly powerful for the photo-z calculation on the basis of a spectroscopic sample (Cavuoti et al. 2012; Brescia et al. 2013, 2014a; Biviano et al. 2013). The PhotoRApToR program package is available, for different platforms, at the official website (http://dame.dsf.unina.it/dame_photoz.html#photoraptor).
[49]  oai:arXiv.org:1507.00731  [pdf] - 1362561
Towards a census of super-compact massive galaxies in the Kilo Degree Survey
Comments: 11 pages, 6 figures, MNRAS in press, revised and improved version, figures and text have been updated
Submitted: 2015-07-02, last modified: 2016-02-03
The abundance of compact, massive, early-type galaxies (ETGs) provides important constraints to galaxy formation scenarios. Thanks to the area covered, depth, excellent spatial resolution and seeing, the ESO Public optical Kilo Degree Survey (KiDS), carried out with the VLT Survey Telescope (VST), offers a unique opportunity to conduct a complete census of the most compact galaxies in the Universe. This paper presents a first census of such systems from the first 156 square degrees of KiDS. Our analysis relies on g-, r-, and i-band effective radii ($R_{\rm e}$), derived by fitting galaxy images with PSF-convolved S\'ersic models, high-quality photometric redshifts, $z_{\rm phot}$, estimated from machine learning techniques, and stellar masses, $M_{\rm \star}$, calculated from KiDS aperture photometry. After massiveness ($M_{\rm \star} > 8 \times 10^{10}\, \rm M_{\odot}$) and compactness ($R_{\rm e} < 1.5 \, \rm kpc$ in g-, r- and i-bands) criteria are applied, a visual inspection of the candidates plus near-infrared photometry from VIKING-DR1 are used to refine our sample. The final catalog, to be spectroscopically confirmed, consists of 92 systems in the redshift range $z \sim 0.2-0.7$. This sample, which we expect to increase by a factor of ten over the total survey area, represents the first attempt to select massive super-compact ETGs (MSCGs) in KiDS. We investigate the impact of redshift systematics in the selection, finding that this seems to be a major source of contamination in our sample. A preliminary analysis shows that MSCGs exhibit negative internal colour gradients, consistent with a passive evolution of these systems. We find that the number density of MSCGs is only mildly consistent with predictions from simulations at $z>0.2$, while no such system is found at $z < 0.2$.
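The two quantitative cuts quoted in the abstract above (massiveness and compactness in all three bands) can be sketched as a simple filter. This is only an illustration: the dictionary keys and function names are hypothetical, and the subsequent visual inspection and near-infrared refinement described in the abstract are not reproduced.

```python
MASS_MIN_MSUN = 8e10   # massiveness threshold quoted in the abstract
RE_MAX_KPC = 1.5       # compactness threshold, required in g, r and i


def is_mscg_candidate(galaxy):
    """Return True if a galaxy passes both cuts.

    `galaxy` is a hypothetical record with a stellar mass in solar masses
    and per-band effective radii in kpc.
    """
    massive = galaxy["mstar_msun"] > MASS_MIN_MSUN
    compact = all(galaxy["re_kpc"][band] < RE_MAX_KPC for band in ("g", "r", "i"))
    return massive and compact


def select_candidates(galaxies):
    """Keep only the records that satisfy both criteria, preserving order."""
    return [g for g in galaxies if is_mscg_candidate(g)]
```

Requiring the size cut in every band, rather than in any one band, is what makes the selection conservative against a single poorly fitted image.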
[50]  oai:arXiv.org:1601.03931  [pdf] - 1364937
An analysis of feature relevance in the classification of astronomical transients with machine learning methods
Comments: Accepted by MNRAS, 11 figures, 18 pages
Submitted: 2016-01-15
The exploitation of present and future synoptic (multi-band and multi-epoch) surveys requires an extensive use of automatic methods for data processing and data interpretation. In this work, using data extracted from the Catalina Real Time Transient Survey (CRTS), we investigate the classification performance of some well tested methods: Random Forest, MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) and K-Nearest Neighbors, paying special attention to the feature selection phase. In order to do so, several classification experiments were performed. Namely: identification of cataclysmic variables, separation between galactic and extra-galactic objects and identification of supernovae.
[51]  oai:arXiv.org:1511.08619  [pdf] - 1319775
Advanced Environment for Knowledge Discovery in the VIALACTEA Project
Comments: Astronomical Data Analysis Software and Systems XXV. Proceedings of a Conference held from October 25th to 30th, 2015 at Rydges World Square in Sydney, Australia
Submitted: 2015-11-27, last modified: 2015-12-01
The VIALACTEA project aims at building a predictive model of star formation in our galaxy. We present the innovative integrated framework and the main technologies and methodologies to reach this ambitious goal.
[52]  oai:arXiv.org:1510.08097  [pdf] - 1429396
Luminosity functions in the CLASH-VLT cluster MACS J1206.2-0847: the importance of tidal interactions
Comments: 5 pages, 2 figures, Proceeding of the talk at the conference "The Universe of Digital Sky Surveys" held in Naples (Italy) on 25-28 November 2014
Submitted: 2015-10-27
We present the optical luminosity functions (LFs) of galaxies for the CLASH-VLT cluster MACS J1206.2-0847 at z=0.439, based on HST and SUBARU data, including ~600 spectroscopically confirmed member galaxies. The LFs on the wide SUBARU FoV are well described by a single Schechter function down to M~M*+3, whereas this fit is poor for HST data, due to a faint-end upturn visible down to M~M*+7, suggesting a bimodal behaviour. We also investigate the effect of local environment by deriving the LFs in four different regions, according to the distance from the centre, finding an increase in the faint-end slope going from the core to the outer rings. Our results confirm and extend our previous findings on the analysis of mass functions, which showed that the galaxies with stellar mass below 10^10.5 M_sun have been significantly affected by tidal interaction effects, thus contributing to the intra cluster light.
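For reference, the single Schechter function mentioned above is commonly written in absolute magnitudes as follows. The symbols are the standard ones and are not defined in the abstract itself: $\Phi^{*}$ is the normalization, $M^{*}$ the characteristic magnitude and $\alpha$ the faint-end slope.

```latex
\Phi(M)\,\mathrm{d}M = 0.4\,\ln(10)\,\Phi^{*}\,
  \left[10^{0.4\,(M^{*}-M)}\right]^{\alpha+1}
  \exp\!\left[-10^{0.4\,(M^{*}-M)}\right]\mathrm{d}M
```

In this form, a more negative $\alpha$ corresponds to a steeper rise in counts at magnitudes fainter than $M^{*}$.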
[53]  oai:arXiv.org:1510.05659  [pdf] - 1342878
CLASH-VLT: Environment-driven evolution of galaxies in the z=0.209 cluster Abell 209
Comments: 17 pages, 20 figures, A&A in press
Submitted: 2015-10-19
The analysis of galaxy properties and the relations among them and the environment can be used to investigate the physical processes driving galaxy evolution. We study the cluster A209 by using the CLASH-VLT spectroscopic data combined with Subaru photometry, yielding 1916 cluster members down to a stellar mass of 10^{8.6} Msun. We determine: i) the stellar mass function of star-forming and passive galaxies; ii) the intra-cluster light and its properties; iii) the orbits of low- and high-mass passive galaxies; and iv) the mass-size relation of ETGs. The stellar mass function of the star-forming galaxies does not depend on the environment, while the slope found for passive galaxies becomes flatter in the densest region. The color distribution of the intra-cluster light is consistent with the color of passive members. The analysis of the dynamical orbits shows that low-mass passive galaxies have tangential orbits, avoiding small pericenters around the BCG. The mass-size relation of low-mass passive ETGs is flatter than that of high mass galaxies, and its slope is consistent with that of field star-forming galaxies. Low-mass galaxies are also more compact within the scale radius of 0.65 Mpc. The ratio between stellar and number density profiles shows a mass segregation in the center. The comparative analysis of the stellar and total density profiles indicates that this effect is due to dynamical friction. Our results are consistent with a scenario in which the "environmental quenching" of low-mass galaxies is due to mechanisms such as harassment out to R200, starvation and ram-pressure stripping at smaller radii, as supported by the analysis of the mass function, of the dynamical orbits and of the mass-size relation of passive early-types in different regions. Our analyses support the idea that the intra-cluster light is formed through the tidal disruption of subgiant galaxies.
[54]  oai:arXiv.org:1509.03318  [pdf] - 1304105
Mapping the Galaxy Color-Redshift Relation: Optimal Photometric Redshift Calibration Strategies for Cosmology Surveys
Comments: ApJ accepted, 17 pages, 10 figures
Submitted: 2015-09-10
Calibrating the photometric redshifts of >10^9 galaxies for upcoming weak lensing cosmology experiments is a major challenge for the astrophysics community. The path to obtaining the required spectroscopic redshifts for training and calibration is daunting, given the anticipated depths of the surveys and the difficulty in obtaining secure redshifts for some faint galaxy populations. Here we present an analysis of the problem based on the self-organizing map, a method of mapping the distribution of data in a high-dimensional space and projecting it onto a lower-dimensional representation. We apply this method to existing photometric data from the COSMOS survey selected to approximate the anticipated Euclid weak lensing sample, enabling us to robustly map the empirical distribution of galaxies in the multidimensional color space defined by the expected Euclid filters. Mapping this multicolor distribution lets us determine where - in galaxy color space - redshifts from current spectroscopic surveys exist and where they are systematically missing. Crucially, the method lets us determine whether a spectroscopic training sample is representative of the full photometric space occupied by the galaxies in a survey. We explore optimal sampling techniques and estimate the additional spectroscopy needed to map out the color-redshift relation, finding that sampling the galaxy distribution in color space in a systematic way can efficiently meet the calibration requirements. While the analysis presented here focuses on the Euclid survey, similar analysis can be applied to other surveys facing the same calibration challenge, such as DES, LSST, and WFIRST.
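The idea of using a self-organizing map to find regions of colour space that lack spectroscopic coverage can be sketched as follows. This is a toy pure-Python SOM written for illustration only, not the procedure used in the paper: the grid size, learning-rate and neighbourhood schedules, and all function names are assumptions, and input features are assumed to be scaled to roughly the [0, 1] range.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid cell whose weight vector is closest (squared Euclidean) to x."""
    return min(weights, key=lambda cell: sum((w - xi) ** 2
                                             for w, xi in zip(weights[cell], x)))


def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a rows x cols map on `data` (list of equal-length feature lists)."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        sigma = max(0.5, sigma0 * (1.0 - frac))    # shrinking neighbourhood
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for cell, w in weights.items():
            d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])
    return weights


def occupied_cells(weights, sample):
    """Set of grid cells that are the best match for at least one object."""
    return {best_matching_unit(weights, x) for x in sample}


def uncalibrated_cells(weights, photometric, spectroscopic):
    """Cells populated by the photometric sample but by no spectroscopic object."""
    return occupied_cells(weights, photometric) - occupied_cells(weights, spectroscopic)
```

In this picture, the cells returned by `uncalibrated_cells` mark the parts of colour space where additional spectroscopy would be needed before the colour-redshift relation can be considered mapped.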
[55]  oai:arXiv.org:1508.07327  [pdf] - 1281029
Shapley Supercluster Survey: Construction of the Photometric Catalogues and i-band Data Release
Comments: 14 pages, 12 figures, 7 tables. MNRAS in press
Submitted: 2015-08-28
The Shapley Supercluster Survey is a multi-wavelength survey covering an area of ~23 deg^2 (~260 Mpc^2 at z=0.048) around the supercluster core, including nine Abell and two poor clusters, having redshifts in the range 0.045-0.050. The survey aims to investigate the role of the cluster-scale mass assembly on the evolution of galaxies, mapping the effects of the environment from the cores of the clusters to their outskirts and along the filaments. The optical (ugri) imaging acquired with OmegaCAM on the VLT Survey Telescope is essential to achieve the project goals providing accurate multi-band photometry for the galaxy population down to m*+6. We describe the methodology adopted to construct the optical catalogues and to separate extended and point-like sources. The catalogues reach average 5 sigma limiting magnitudes within a 3 arcsec diameter aperture of ugri=[24.4,24.6,24.1,23.3] and are 93% complete down to ugri=[23.8,23.8,23.5,22.0] mag, corresponding to ~m*_r+8.5. The data are highly uniform in terms of observing conditions and all acquired with seeing less than 1.1 arcsec full width at half-maximum. The median seeing in r-band is 0.6 arcsec, corresponding to 0.56 kpc h^{-1}_{70} at z=0.048. While the observations in the u, g and r bands are still ongoing, the i-band observations have been completed, and we present the i-band catalogue over the whole survey area. The latter is released and it will be regularly updated, through the use of the Virtual Observatory tools. This includes 734,319 sources down to i=22.0 mag and it is the first optical homogeneous catalogue at such a depth, covering the central region of the Shapley supercluster.
[56]  oai:arXiv.org:1507.00742  [pdf] - 1292226
The first and second data releases of the Kilo-Degree Survey
Comments: 26 pages, 26 figures, 2 appendices; two new figures, several textual clarifications, updated references; accepted for publication in A&A
Submitted: 2015-07-02, last modified: 2015-08-19
The Kilo-Degree Survey (KiDS) is an optical wide-field imaging survey carried out with the VLT Survey Telescope and the OmegaCAM camera. KiDS will image 1500 square degrees in four filters (ugri), and together with its near-infrared counterpart VIKING will produce deep photometry in nine bands. Designed for weak lensing shape and photometric redshift measurements, the core science driver of the survey is mapping the large-scale matter distribution in the Universe back to a redshift of ~0.5. Secondary science cases are manifold, covering topics such as galaxy evolution, Milky Way structure, and the detection of high-redshift clusters and quasars. KiDS is an ESO Public Survey and dedicated to serving the astronomical community with high-quality data products derived from the survey data, as well as with calibration data. Public data releases will be made on a yearly basis, the first two of which are presented here. For a total of 148 survey tiles (~160 sq.deg.) astrometrically and photometrically calibrated, coadded ugri images have been released, accompanied by weight maps, masks, source lists, and a multi-band source catalog. A dedicated pipeline and data management system based on the Astro-WISE software system, combined with newly developed masking and source classification software, is used for the data production of the data products described here. The achieved data quality and early science projects based on the data products in the first two data releases are reviewed in order to validate the survey data. Early scientific results include the detection of nine high-z QSOs, fifteen candidate strong gravitational lenses, high-quality photometric redshifts and galaxy structural parameters for hundreds of thousands of galaxies. (Abridged)
[57]  oai:arXiv.org:1507.00754  [pdf] - 1253479
Machine Learning based photometric redshifts for the KiDS ESO DR2 galaxies
Comments: MNRAS, 6 pages, 4 figures
Submitted: 2015-07-02, last modified: 2015-07-30
We estimated photometric redshifts (zphot) for more than 1.1 million galaxies of the ESO Public Kilo-Degree Survey (KiDS) Data Release 2. KiDS is an optical wide-field imaging survey carried out with the VLT Survey Telescope (VST) and the OmegaCAM camera, which aims at tackling open questions in cosmology and galaxy evolution, such as the origin of dark energy and the channel of galaxy mass growth. We present a catalogue of photometric redshifts obtained using the Multi Layer Perceptron with Quasi Newton Algorithm (MLPQNA) model, provided within the framework of the DAta Mining and Exploration Web Application REsource (DAMEWARE). These photometric redshifts are based on a spectroscopic knowledge base which was obtained by merging spectroscopic datasets from GAMA (Galaxy And Mass Assembly) data release 2 and SDSS-III data release 9. The overall 1 sigma uncertainty on Delta z = (zspec - zphot) / (1+ zspec) is ~ 0.03, with a very small average bias of ~ 0.001, a NMAD of ~ 0.02 and a fraction of catastrophic outliers (| Delta z | > 0.15) of ~0.4%.
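The statistics quoted in this abstract can be computed from any list of matched spectroscopic and photometric redshifts. The following is a minimal Python sketch, not the authors' pipeline. Only the definition of Delta z and the 0.15 outlier threshold come from the abstract; the function name `photoz_stats`, the NMAD convention (1.4826 times the median absolute deviation about the median) and the toy input values are assumptions made for illustration.

```python
import statistics

def photoz_stats(zspec, zphot, outlier_cut=0.15):
    """Summarize the normalized residual dz = (zspec - zphot) / (1 + zspec)."""
    dz = [(zs - zp) / (1.0 + zs) for zs, zp in zip(zspec, zphot)]
    med = statistics.median(dz)
    return {
        # Mean residual (the "bias" in the abstract's terminology).
        "bias": statistics.fmean(dz),
        # Population standard deviation of the residuals.
        "sigma": statistics.pstdev(dz),
        # Assumed convention: 1.4826 * median(|dz - median(dz)|).
        "nmad": 1.4826 * statistics.median(abs(d - med) for d in dz),
        # Fraction of objects with |dz| above the catastrophic-outlier cut.
        "outlier_fraction": sum(abs(d) > outlier_cut for d in dz) / len(dz),
    }

# Toy values (hypothetical, not KiDS data): the last object is a catastrophic outlier.
zspec = [0.10, 0.20, 0.30, 0.40, 0.50]
zphot = [0.11, 0.19, 0.30, 0.42, 0.10]
stats = photoz_stats(zspec, zphot)
```

Other NMAD conventions exist (for example, using the median of |dz| without subtracting the median), so the convention should be checked against the paper before comparing numbers.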
[58]  oai:arXiv.org:1507.00736  [pdf] - 1429365
Galaxy evolution within the Kilo-Degree Survey
Comments: 4 pages, 2 figures, to appear in the refereed Proceedings of the "The Universe of Digital Sky Surveys" conference held at the INAF-OAC, Naples, on 25th-28th November 2014, to be published in Astrophysics and Space Science Proceedings, edited by Longo, Napolitano, Marconi, Paolillo, Iodice
Submitted: 2015-07-02
The ESO Public Kilo-Degree Survey (KiDS) is an optical wide-field imaging survey carried out with the VLT Survey Telescope and the OmegaCAM camera. KiDS will scan 1500 square degrees in four optical filters (u, g, r, i). Designed to be a weak lensing survey, it is ideal for galaxy evolution studies, thanks to the high spatial resolution of VST, the good seeing and the photometric depth. The surface photometry has provided structural parameters (e.g. size and Sérsic index), while aperture and total magnitudes have been used to derive photometric redshifts from machine learning methods and stellar masses/luminosities from stellar population synthesis. Our project aimed at investigating the evolution of the colour and structural properties of galaxies with mass and environment up to redshift z ~ 0.5 and more, to put constraints on galaxy evolution processes, such as galaxy mergers.
[59]  oai:arXiv.org:1504.03857  [pdf] - 1182953
Automated physical classification in the SDSS DR10. A catalogue of candidate Quasars
Comments: Accepted for publication by MNRAS, 13 pages, 6 figures
Submitted: 2015-04-15
We discuss whether modern machine learning methods can be used to characterize the physical nature of the large number of objects sampled by the modern multi-band digital surveys. In particular, we applied the MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) method to the optical data of the Sloan Digital Sky Survey - Data Release 10, investigating whether photometric data alone suffice to disentangle different classes of objects as they are defined in the SDSS spectroscopic classification. We discuss three groups of classification problems: (i) the simultaneous classification of galaxies, quasars and stars; (ii) the separation of stars from quasars; (iii) the separation of galaxies with normal spectral energy distribution from those with peculiar spectra, such as starburst or starforming galaxies and AGN. While confirming the difficulty of disentangling AGN from normal galaxies on a photometric basis only, MLPQNA proved to be quite effective in the three-class separation. In disentangling quasars from stars and galaxies, our method achieved an overall efficiency of 91.31% and a QSO class purity of ~95%. The resulting catalogue of candidate quasars/AGNs consists of ~3.6 million objects, of which about half a million are also flagged as robust candidates, and will be made available on CDS VizieR facility.
[60]  oai:arXiv.org:1503.05607  [pdf] - 974530
CLASH-VLT: Substructure in the galaxy cluster MACS J1206.2-0847 from kinematics of galaxy populations
Comments: A&A accepted, 19 pages, 30 figures, minor language changes
Submitted: 2015-03-18, last modified: 2015-04-02
In the effort to understand the link between the structure of galaxy clusters and their galaxy populations, we focus on MACSJ1206.2-0847 at z~0.44 and probe its substructure in the projected phase space through the spectrophotometric properties of a large number of galaxies from the CLASH-VLT survey. Our analysis is mainly based on an extensive spectroscopic dataset of 445 member galaxies, mostly acquired with VIMOS@VLT as part of our ESO Large Programme, sampling the cluster out to a radius ~2R200 (4 Mpc). We classify 412 galaxies as passive, with strong Hdelta absorption (red and blue galaxies), and with emission lines from weak to very strong. A number of tests for substructure detection are applied to analyze the galaxy distribution in the velocity space, in 2D space, and in 3D projected phase-space. Studied in its entirety, the cluster appears as a large-scale relaxed system with a few secondary, minor overdensities in 2D distribution. We detect no velocity gradients or evidence of deviations in local mean velocities. The main feature is the WNW-ESE elongation. The analysis of galaxy populations per spectral class highlights a more complex scenario. The passive galaxies and red strong Hdelta galaxies trace the cluster center and the WNW-ESE elongated structure. The red strong Hdelta galaxies also mark a secondary, dense peak ~2 Mpc at ESE. The emission line galaxies cluster in several loose structures, mostly outside R200. The observational scenario agrees with MACS J1206.2-0847 having WNW-ESE as the direction of the main cluster accretion, traced by passive galaxies and red strong Hdelta galaxies. The red strong Hdelta galaxies, interpreted as poststarburst galaxies, date a likely important event 1-2 Gyr before the epoch of observation. The emission line galaxies trace a secondary, ongoing infall where groups are accreted along several directions.
[61]  oai:arXiv.org:1501.06506  [pdf] - 1504412
Photometric redshift estimation based on data mining with PhotoRApToR
Comments: To appear in Experimental Astronomy, Springer, 20 pages, 15 figures
Submitted: 2015-01-26
Photometric redshifts (photo-z) are crucial to the scientific exploitation of modern panchromatic digital surveys. In this paper we present PhotoRApToR (Photometric Research Application To Redshift): a Java/C++ based desktop application capable of solving non-linear regression and multi-variate classification problems, and specialized in particular for photo-z estimation. It embeds a machine learning algorithm, namely a multilayer neural network trained by the Quasi Newton learning rule, and special tools dedicated to pre- and postprocessing data. PhotoRApToR has been successfully tested on several scientific cases. The application is available for free download from the DAME Program web site.
[62]  oai:arXiv.org:1409.8562  [pdf] - 903808
Extending the supernova Hubble diagram to z~1.5 with the Euclid space mission
Comments: 21 pages. Accepted for publication in A&A
Submitted: 2014-09-30, last modified: 2014-11-04
We forecast dark energy constraints that could be obtained from a new large sample of Type Ia supernovae where those at high redshift are acquired with the Euclid space mission. We simulate a three-prong SN survey: a z<0.35 nearby sample (8000 SNe), a 0.2<z<0.95 intermediate sample (8800 SNe), and a 0.75<z<1.55 high-z sample (1700 SNe). The nearby and intermediate surveys are assumed to be conducted from the ground, while the high-z is a joint ground- and space-based survey. This latter survey, the "Dark Energy Supernova Infra-Red Experiment" (DESIRE), is designed to fit within 6 months of Euclid observing time, with a dedicated observing program. We simulate the SN events as they would be observed in rolling-search mode by the various instruments, and derive the quality of expected cosmological constraints. We account for known systematic uncertainties, in particular calibration uncertainties including their contribution through the training of the supernova model used to fit the supernovae light curves. Using conservative assumptions and a 1-D geometric Planck prior, we find that the ensemble of surveys would yield competitive constraints: a constant equation of state parameter can be constrained to sigma(w)=0.022, and a Dark Energy Task Force figure of merit of 203 is found for a two-parameter equation of state. Our simulations thus indicate that Euclid can bring a significant contribution to a purely geometrical cosmology constraint by extending a high-quality SN Hubble diagram to z~1.5. We also present other science topics enabled by the DESIRE Euclid observations.
[63]  oai:arXiv.org:1410.5631  [pdf] - 891462
Data Driven Discovery in Astrophysics
Comments: Keynote talk in the proceedings of ESA-ESRIN Conference: Big Data from Space 2014, Frascati, Italy, November 12-14, 2014, 8 pages, 2 figures
Submitted: 2014-10-21, last modified: 2014-11-01
We review some aspects of the current state of data-intensive astronomy, its methods, and some outstanding data analysis challenges. Astronomy is at the forefront of "big data" science, with exponentially growing data volumes and data rates, and an ever-increasing complexity, now entering the Petascale regime. Telescopes and observatories from both ground and space, covering a full range of wavelengths, feed the data via processing pipelines into dedicated archives, where they can be accessed for scientific analysis. Most of the large archives are connected through the Virtual Observatory framework, that provides interoperability standards and services, and effectively constitutes a global data grid of astronomy. Making discoveries in this overabundance of data requires applications of novel, machine learning tools. We describe some of the recent examples of such applications.
[64]  oai:arXiv.org:1408.6356  [pdf] - 903659
CLASH-VLT: The stellar mass function and stellar mass density profile of the z=0.44 cluster of galaxies MACS J1206.2-0847
Comments: A&A accepted, 15 pages, 13 figures
Submitted: 2014-08-27, last modified: 2014-09-02
Context. The study of the galaxy stellar mass function (SMF) in relation to the galaxy environment and the stellar mass density profile, rho(r), is a powerful tool to constrain models of galaxy evolution. Aims. We determine the SMF of the z=0.44 cluster of galaxies MACS J1206.2-0847 separately for passive and star-forming (SF) galaxies, in different regions of the cluster, from the center out to approximately 2 virial radii. We also determine rho(r) to compare it to the number density and total mass density profiles. Methods. We use the dataset from the CLASH-VLT survey. Stellar masses are obtained by SED fitting on 5-band photometric data obtained at the Subaru telescope. We identify 1363 cluster members down to a stellar mass of 10^9.5 Msolar. Results. The whole cluster SMF is well fitted by a double Schechter function. The SMFs of cluster SF and passive galaxies are statistically different. The SMF of the SF cluster galaxies does not depend on the environment. The SMF of the passive population has a significantly smaller slope (in absolute value) in the innermost (<0.50 Mpc), highest density cluster region, than in more external, lower density regions. The number ratio of giant/subgiant galaxies is maximum in this innermost region and minimum in the adjacent region, but then gently increases again toward the cluster outskirts. This is also reflected in a decreasing radial trend of the average stellar mass per cluster galaxy. On the other hand, the stellar mass fraction, i.e., the ratio of stellar to total cluster mass, does not show any significant radial trend. Conclusions. Our results appear consistent with a scenario in which SF galaxies evolve into passive galaxies due to density-dependent environmental processes, and eventually get destroyed very near the cluster center to become part of a diffuse intracluster medium.
[65]  oai:arXiv.org:1407.2527  [pdf] - 863085
A catalogue of photometric redshifts for the SDSS-DR9 galaxies
Comments: 10 pages, To appear in section 14 (Catalogs and data) of Astronomy and Astrophysics
Submitted: 2014-07-09
Accurate photometric redshifts for large samples of galaxies are among the main products of modern multiband digital surveys. Over the last decade, the Sloan Digital Sky Survey (SDSS) has become a sort of benchmark against which to test the various methods. We present an application of a new method to the estimation of photometric redshifts for the galaxies in the SDSS Data Release 9 (SDSS-DR9). Photometric redshifts for more than 143 million galaxies were produced and made available at the URL: http://dame.dsf.unina.it/catalog/DR9PHOTOZ/. The MLPQNA (Multi Layer Perceptron with Quasi Newton Algorithm) model provided within the framework of the DAMEWARE (DAta Mining and Exploration Web Application REsource) is an interpolative method derived from machine learning models. The obtained redshifts have an overall uncertainty of sigma=0.023 with a very small average bias of about 3x10^-5, and a fraction of catastrophic outliers of about 5%. This result is slightly better than what was already available in the literature, particularly in terms of the smaller fraction of catastrophic outliers.
[66]  oai:arXiv.org:1406.3538  [pdf] - 1214952
DAMEWARE: A web cyberinfrastructure for astrophysical data mining
Comments: To appear in PASP (accepted for publication)
Submitted: 2014-06-13
Astronomy is undergoing a methodological revolution triggered by an unprecedented wealth of complex and accurate data. The new panchromatic, synoptic sky surveys require advanced tools for discovering patterns and trends hidden behind data which are both complex and of high dimensionality. We present DAMEWARE (DAta Mining & Exploration Web Application REsource): a general purpose, web-based, distributed data mining environment developed for the exploration of large datasets, and finely tuned for astronomical applications. By means of graphical user interfaces, it allows the user to perform classification, regression or clustering tasks with machine learning methods. Salient features of DAMEWARE include its capability to work on large datasets with minimal human intervention, and to deal with a wide variety of real problems such as the classification of globular clusters in the galaxy NGC1399, the evaluation of photometric redshifts and, finally, the identification of candidate Active Galactic Nuclei in multiband photometric surveys. In all these applications, DAMEWARE made it possible to achieve better results than those attained with more traditional methods. With the aim of providing potential users with all needed information, in this paper we briefly describe the technological background of DAMEWARE, give a short introduction to some relevant aspects of data mining, followed by a summary of some science cases and, finally, we provide a detailed description of a template use case.
[67]  oai:arXiv.org:1406.3192  [pdf] - 835458
Data-Rich Astronomy: Mining Sky Surveys with PhotoRApToR
Comments: proceedings of the IAU Symposium, Vol. 306, Cambridge University Press
Submitted: 2014-06-12
In the last decade a new generation of telescopes and sensors has allowed the production of a very large amount of data and astronomy has become a data-rich science. New automatic methods largely based on machine learning are needed to cope with such a data tsunami. We present some results in the fields of photometric redshifts and galaxy classification, obtained using the MLPQNA algorithm available in the DAMEWARE (Data Mining and Web Application Resource) for the SDSS galaxies (DR9 and DR10). We present PhotoRApToR (Photometric Research Application To Redshift): a Java based desktop application capable of solving regression and classification problems and specialized for photo-z estimation.
[68]  oai:arXiv.org:1403.4979  [pdf] - 827300
Intra Cluster Light properties in the CLASH-VLT cluster MACS J1206.2-0847
Comments: 18 pages, 13 figures, accepted for publication in Astronomy and Astrophysics
Submitted: 2014-03-19
We aim at constraining the assembly history of clusters by studying the intra cluster light (ICL) properties, estimating its contribution to the fraction of baryons in stars, f*, and understanding possible systematics/bias using different ICL detection techniques. We developed an automated method, GALtoICL, based on the software GALAPAGOS to obtain a refined version of typical BCG+ICL maps. We applied this method to our test case MACS J1206.2-0847, a massive cluster located at z=0.44, that is part of the CLASH sample. Using deep multi-band SUBARU images, we extracted the surface brightness (SB) profile of the BCG+ICL and we studied the ICL morphology, color, and contribution to f* out to R500. We repeated the same analysis using a different definition of the ICL, the SBlimit method, i.e. a SB cut-off level, to compare the results. The most peculiar feature of the ICL in MACS1206 is its asymmetric radial distribution, with an excess in the SE direction and extending towards the 2nd brightest cluster galaxy which is a Post Starburst galaxy. This suggests an interaction between the BCG and this galaxy that dates back to t <= 1.5 Gyr. The BCG+ICL stellar content is 8% of M_(*,500) and the (de-) projected baryon fraction in stars is f*=0.0177 (0.0116), in excellent agreement with recent results. The SBlimit method provides systematically higher ICL fractions and this effect is larger at lower SB limits. This is due to the light from the outer envelopes of member galaxies that contaminate the ICL. Though more time consuming, the GALtoICL method provides safer ICL detections that are almost free of this contamination. This is one of the few ICL studies at redshift z > 0.3. At completion, the CLASH/VLT program will allow us to extend this analysis to a statistically significant cluster sample spanning a wide redshift range: 0.2<z<0.6.
[69]  oai:arXiv.org:1310.2840  [pdf] - 731626
Photometric classification of emission line galaxies with Machine Learning methods
Comments: 10 pages, 1 figure, accepted by MNRAS on October 10, 2013
Submitted: 2013-10-10
In this paper we discuss an application of machine learning based methods to the identification of candidate AGN from optical survey data and to the automatic classification of AGNs in broad classes. We applied four different machine learning algorithms, namely the Multi Layer Perceptron (MLP), trained respectively with the Conjugate Gradient, Scaled Conjugate Gradient and Quasi Newton learning rules, and the Support Vector Machines (SVM), to tackle the problem of the classification of emission line galaxies in different classes, mainly AGNs vs non-AGNs, obtained using optical photometry in place of the diagnostics based on line intensity ratios which are classically used in the literature. Using the same photometric features we discuss also the behavior of the classifiers on finer AGN classification tasks, namely Seyfert I vs Seyfert II and Seyfert vs LINER. Furthermore we describe the algorithms employed, the samples of spectroscopically classified galaxies used to train the algorithms, the procedure followed to select the photometric parameters and the performances of our methods in terms of multiple statistical indicators. The results of the experiments show that the application of self adaptive data mining algorithms trained on spectroscopic data sets and applied to carefully chosen photometric parameters represents a viable alternative to the classical methods that employ time-consuming spectroscopic observations.
[70]  oai:arXiv.org:1307.5867  [pdf] - 1172897
CLASH-VLT: The mass, velocity-anisotropy, and pseudo-phase-space density profiles of the z=0.44 galaxy cluster MACS 1206.2-0847
Comments: A&A in press; 22 pages, 19 figures
Submitted: 2013-07-22, last modified: 2013-08-13
We use an unprecedented data-set of about 600 redshifts for cluster members, obtained as part of a VLT/VIMOS large programme, to constrain the mass profile of the z=0.44 cluster MACS J1206.2-0847 over the radial range 0-5 Mpc (0-2.5 virial radii) using the MAMPOSSt and Caustic methods. We then add external constraints from our previous gravitational lensing analysis. We invert the Jeans equation to obtain the velocity-anisotropy profiles of cluster members. With the mass-density and velocity-anisotropy profiles we then obtain the first determination of a cluster pseudo-phase-space density profile. The kinematics and lensing determinations of the cluster mass profile are in excellent agreement. This is very well fitted by a NFW model with mass M200=(1.4 +- 0.2) 10^15 Msun and concentration c200=6 +- 1, only slightly higher than theoretical expectations. Other mass profile models also provide acceptable fits to our data, of (slightly) lower (Burkert, Hernquist, and Softened Isothermal Sphere) or comparable (Einasto) quality than NFW. The velocity anisotropy profiles of the passive and star-forming cluster members are similar, close to isotropic near the center and increasingly radial outside. Passive cluster members follow extremely well the theoretical expectations for the pseudo-phase-space density profile and the relation between the slope of the mass-density profile and the velocity anisotropy. Star-forming cluster members show marginal deviations from theoretical expectations. This is the most accurate determination of a cluster mass profile out to a radius of 5 Mpc, and the only determination of the velocity-anisotropy and pseudo-phase-space density profiles of both passive and star-forming galaxies for an individual cluster [abridged]
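The NFW fit quoted in this abstract can be turned into an enclosed-mass estimate at any radius. The sketch below uses the standard NFW result M(<r) proportional to ln(1+x) - x/(1+x), with x = r/r_s and r_s = r200/c200. It is an illustration only, not the MAMPOSSt or Caustic analysis of the paper; the function names are hypothetical, and only the values M200 = 1.4 x 10^15 Msun and c200 = 6 are taken from the abstract.

```python
import math

def nfw_mass_fraction(r_over_r200, c200):
    """Return M(<r) / M200 for an NFW profile with concentration c200 = r200 / r_s."""
    def m(y):
        # Dimensionless NFW enclosed-mass function.
        return math.log(1.0 + y) - y / (1.0 + y)
    return m(c200 * r_over_r200) / m(c200)

def nfw_enclosed_mass(r_over_r200, m200, c200):
    """Return M(<r) in the same units as m200."""
    return m200 * nfw_mass_fraction(r_over_r200, c200)

# Best-fit values quoted in the abstract (central values only, errors ignored).
M200_MSUN = 1.4e15
C200 = 6.0
mass_at_r200 = nfw_enclosed_mass(1.0, M200_MSUN, C200)       # equals M200 by construction
mass_at_half_r200 = nfw_enclosed_mass(0.5, M200_MSUN, C200)  # smaller than M200
```

Working in units of r200 avoids having to assume a cosmology to convert M200 into a physical radius.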
[71]  oai:arXiv.org:1305.5641  [pdf] - 1171566
Photometric redshifts for Quasars in multi band Surveys
Comments: 38 pages, Submitted to ApJ in February 2013; Accepted by ApJ in May 2013
Submitted: 2013-05-24
MLPQNA stands for Multi Layer Perceptron with Quasi Newton Algorithm and it is a machine learning method which can be used to cope with regression and classification problems on complex and massive data sets. In this paper we give the formal description of the method and present the results of its application to the evaluation of photometric redshifts for quasars. The data set used for the experiment was obtained by merging four different surveys (SDSS, GALEX, UKIDSS and WISE), thus covering a wide range of wavelengths from the UV to the mid-infrared. The method is able i) to achieve a very high accuracy; ii) to drastically reduce the number of outliers and catastrophic objects; iii) to discriminate among parameters (or features) on the basis of their significance, so that the number of features used for training and analysis can be optimized in order to reduce both the computational demands and the effects of degeneracy. The best experiment, which makes use of a selected combination of parameters drawn from the four surveys, leads, in terms of DeltaZnorm (i.e. (zspec-zphot)/(1+zspec)), to an average of DeltaZnorm = 0.004, a standard deviation sigma = 0.069 and a Median Absolute Deviation MAD = 0.02 over the whole redshift range (i.e. zspec <= 3.6), defined by the 4-survey cross-matched spectroscopic sample. The fraction of catastrophic outliers, i.e. of objects with photo-z deviating more than 2sigma from the spectroscopic value is < 3%, leading to a sigma = 0.035 after their removal, over the same redshift range. The method is made available to the community through the DAMEWARE web application.
[72]  oai:arXiv.org:1304.0597  [pdf] - 1165674
Astrophysical data mining with GPU. A case study: genetic classification of globular clusters
Comments: submitted to New Astronomy, Accepted; 17 pages, 5 figures
Submitted: 2013-04-02
We present a multi-purpose genetic algorithm, designed and implemented with GPGPU / CUDA parallel computing technology. The model was derived from our CPU serial implementation, named GAME (Genetic Algorithm Model Experiment). It was successfully tested and validated on the detection of candidate Globular Clusters in deep, wide-field, single band HST images. The GPU version of GAME will be made available to the community by integrating it into the web application DAMEWARE (DAta Mining Web Application REsource, http://dame.dsf.unina.it/beta_info.html), a public data mining service specialized in massive astrophysical data. Since genetic algorithms are inherently parallel, the GPGPU computing paradigm leads to a speedup of a factor of 200x in the training phase with respect to the CPU based version.
[73]  oai:arXiv.org:1212.0564  [pdf] - 1158240
Inside catalogs: a comparison of source extraction software
Comments: 20 pages, 10 figures, 6 tables. PASP in press
Submitted: 2012-12-03
The scope of this paper is to compare the catalog extraction performances obtained using the new combination of SExtractor with PSFEx, against the more traditional and widespread application of DAOPHOT with ALLSTAR; therefore, the paper may provide a guide for the selection of the most suitable catalog extraction software. Both software packages were tested on two kinds of simulated images having, respectively, a uniform spatial distribution of sources and an overdensity in the center. In both cases, SExtractor is able to generate a deeper catalog than DAOPHOT. Moreover, the use of neural networks for object classification plus the novel SPREAD_MODEL parameter push down to the limiting magnitude the possibility of star/galaxy separation. DAOPHOT and ALLSTAR provide an optimal solution for point-source photometry in stellar fields and very accurate and reliable PSF photometry, with robust star-galaxy separation. However, they are not useful for galaxy characterization, and do not generate catalogs that are very complete for faint sources. On the other hand, SExtractor, along with the new capability to derive PSF photometry, turns out to be competitive and returns accurate photometry also for galaxies. We conclude that the new version of SExtractor, used in conjunction with PSFEx, represents a very powerful software package for source extraction with performances comparable to those of DAOPHOT. Finally, by comparing the results obtained in the case of a uniform and of an overdense spatial distribution of stars, we notice, for both software packages, a decline for the latter case in the quality of the results produced in terms of magnitudes and centroids.
[74]  oai:arXiv.org:1211.5481  [pdf] - 928091
Genetic Algorithm Modeling with GPU Parallel Computing Technology
Comments: 11 pages, 2 figures, refereed proceedings; Neural Nets and Surroundings, Proceedings of 22nd Italian Workshop on Neural Nets, WIRN 2012; Smart Innovation, Systems and Technologies, Vol. 19, Springer
Submitted: 2012-11-23
We present a multi-purpose genetic algorithm, designed and implemented with GPGPU / CUDA parallel computing technology. The model was derived from a multi-core CPU serial implementation, named GAME, which has already been successfully tested and validated on classification problems involving massive astrophysical data, through a web application resource (DAMEWARE) specialized in data mining based on Machine Learning paradigms. Since genetic algorithms are inherently parallel, the GPGPU computing paradigm makes it possible to exploit the internal training features of the model, permitting a strong optimization in terms of processing performance and scalability.
[75]  oai:arXiv.org:1206.0876  [pdf] - 569159
Photometric redshifts with Quasi Newton Algorithm (MLPQNA). Results in the PHAT1 contest
Comments: Accepted for publication in Astronomy & Astrophysics; 9 pages, 2 figures
Submitted: 2012-06-05, last modified: 2012-08-08
Context. Since the advent of modern multiband digital sky surveys, photometric redshifts (photo-z's) have become relevant if not crucial to many fields of observational cosmology, from the characterization of cosmic structures, to weak and strong lensing. Aims. We describe an application to an astrophysical context, namely the evaluation of photometric redshifts, of MLPQNA, a machine learning method based on Quasi Newton Algorithm. Methods. Theoretical methods for photo-z's evaluation are based on the interpolation of a priori knowledge (spectroscopic redshifts or SED templates) and represent an ideal comparison ground for neural network based methods. The MultiLayer Perceptron with Quasi Newton learning rule (MLPQNA) described here is a computationally efficient implementation of Neural Networks, exploited here for the first time to solve regression problems in the astrophysical context, and is offered to the community through the DAMEWARE (DAta Mining & Exploration Web Application REsource) infrastructure. Results. The PHAT contest (Hildebrandt et al. 2010) provides a standard dataset to test old and new methods for photometric redshift evaluation, together with a set of statistical indicators which allow a straightforward comparison among different methods. The MLPQNA model has been applied on the whole PHAT1 dataset of 1984 objects after an optimization of the model performed by using as training set the 515 available spectroscopic redshifts. When applied to the PHAT1 dataset, MLPQNA obtains the best bias accuracy (0.0006) and very competitive accuracies in terms of scatter (0.056) and outlier percentage (16.3%), scoring as the second most effective empirical method among those which have so far participated in the contest. MLPQNA shows better generalization capabilities than most other empirical methods especially in presence of underpopulated regions of the Knowledge Base.
[76]  oai:arXiv.org:1201.1867  [pdf] - 1092816
Astroinformatics, data mining and the future of astronomical research
Comments: To appear in the Proceedings of the 2-nd International Conference on Frontiers on diagnostic technologies
Submitted: 2012-01-09
Astronomy, like many other scientific disciplines, is facing a true data deluge which is bound to change both the praxis and the methodology of everyday research work. The emerging field of astroinformatics, while on the one hand appearing crucial to face the technological challenges, on the other is opening new exciting perspectives for new astronomical discoveries through the implementation of advanced data mining procedures. The complexity of astronomical data and the variety of scientific problems, however, call for innovative algorithms and methods as well as for an extreme usage of ICT technologies.
[77]  oai:arXiv.org:1110.2144  [pdf] - 1084765
The detection of globular clusters in galaxies as a data mining problem
Comments: Accepted 2011 December 12; Received 2011 November 28; in original form 2011 October 10
Submitted: 2011-10-10, last modified: 2011-12-16
We present an application of self-adaptive supervised learning classifiers derived from the Machine Learning paradigm, to the identification of candidate Globular Clusters in deep, wide-field, single band HST images. Several methods provided by the DAME (Data Mining & Exploration) web application were tested and compared on the NGC1399 HST data described in Paolillo 2011. The best results were obtained using a Multi Layer Perceptron with Quasi Newton learning rule which achieved a classification accuracy of 98.3%, with a completeness of 97.8% and 1.6% of contamination. An extensive set of experiments revealed that the use of accurate structural parameters (effective radius, central surface brightness) does improve the final result, but only by 5%. It is also shown that the method is capable of retrieving extreme sources (for instance, very extended objects) which are missed by more traditional approaches.
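Accuracy, completeness and contamination figures of the kind quoted in this abstract are usually derived from the confusion matrix of a binary classifier. The Python sketch below assumes the common conventions completeness = TP / (TP + FN) and contamination = FP / (TP + FP); the paper may define these quantities differently, and the function name and the counts are made up for illustration.

```python
def classification_scores(tp, fp, tn, fn):
    """Compute summary scores from binary confusion-matrix counts.

    tp/fn: true objects of the target class that were recovered/missed.
    fp/tn: other objects that were wrongly accepted/correctly rejected.
    """
    total = tp + fp + tn + fn
    return {
        # Fraction of all objects classified correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of true target-class objects that were recovered.
        "completeness": tp / (tp + fn),
        # Fraction of the selected sample that does not belong to the class.
        "contamination": fp / (tp + fp),
    }

# Hypothetical counts, not the NGC1399 results.
scores = classification_scores(tp=90, fp=5, tn=95, fn=10)
```

With these conventions, contamination is one minus the purity of the selected sample.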
[78]  oai:arXiv.org:1112.0750  [pdf] - 447227
DAME: A Distributed Data Mining & Exploration Framework within the Virtual Observatory
Comments: 20 pages, INGRID 2010 - 5th International Workshop on Distributed Cooperative Laboratories: "Instrumenting" the Grid, May 12-14, 2010, Poznan, Poland; Volume Remote Instrumentation for eScience and Related Aspects, 2011, F. Davoli et al. (eds.), SPRINGER NY
Submitted: 2011-12-04
Nowadays, many scientific areas share the same broad requirements of being able to deal with massive and distributed datasets while, when possible, being integrated with services and applications. In order to close the growing gap between the incremental generation of data and our understanding of it, one must know how to access, retrieve, analyze, mine and integrate data from disparate sources. One of the fundamental aspects of any new generation of data mining software tool or package which really wants to become a service for the community is the possibility to use it within complex workflows which each user can fine tune in order to match the specific demands of their scientific goal. These workflows often need to access different resources (data, providers, computing facilities and packages) and require strict interoperability on (at least) the client side. The project DAME (DAta Mining & Exploration) arises from these requirements by providing a distributed WEB-based data mining infrastructure specialized in Massive Data Sets exploration with Soft Computing methods. Originally designed to deal with astrophysical use cases, where first scientific application examples have demonstrated its effectiveness, the DAME Suite is a multi-disciplinary, platform-independent tool compliant with modern KDD (Knowledge Discovery in Databases) requirements and Information & Communication Technology trends.
[79]  oai:arXiv.org:1112.0742  [pdf] - 447224
The DAME/VO-Neural Infrastructure: an Integrated Data Mining System Support for the Science Community
Comments: 10 pages, Proceedings of the Final Workshop of the Grid Projects of the Italian National Operational Programme 2000-2006 Call 1575; Edited by Cometa Consortium, 2009, ISBN: 978-88-95892-02-3
Submitted: 2011-12-04
Astronomical data are gathered through a very large number of heterogeneous techniques and stored in very diversified and often incompatible data repositories. Moreover, in the e-science environment, it is necessary to integrate services across distributed, heterogeneous, dynamic "virtual organizations" formed by different resources within a single enterprise and/or external resource sharing and service provider relationships. The DAME/VO-Neural project, run jointly by the University Federico II, INAF (National Institute of Astrophysics) Astronomical Observatories of Napoli and the California Institute of Technology, aims at creating a single, sustainable, distributed e-infrastructure for data mining and exploration in massive data sets, to be offered to the astronomical community, and beyond, as a web application. The framework makes use of distributed computing environments (e.g. S.Co.P.E.) and matches the international IVOA standards and requirements. The integration process is technically challenging due to the need to achieve a specific quality of service when running on top of different native platforms. In these terms, the result of the DAME/VO-Neural project effort will be a service-oriented architecture, obtained by using appropriate standards and incorporating Grid paradigms and RESTful Web services frameworks where needed, whose main target will be the integration of interdisciplinary distributed systems within and across organizational domains.
[80]  oai:arXiv.org:1109.4104  [pdf] - 415247
VOGCLUSTERS: an example of DAME web application
Comments: 4 pages, 1 figure. Proceedings of "Advances in Computational Astrophysics: methods, tools and outcomes" (Cefal\`u, Sicily, June 2011). To be published on ASP Conference Series
Submitted: 2011-09-19, last modified: 2011-09-22
We present the alpha release of the VOGCLUSTERS web application, specialized for data and text mining on globular clusters. It is one of the Web 2.0 technology based services of the Data Mining & Exploration (DAME) Program, devoted to mining and exploring heterogeneous information related to globular cluster data.
[81]  oai:arXiv.org:1109.2840  [pdf] - 1084049
Extracting Knowledge From Massive Astronomical Data Sets
Submitted: 2011-09-13, last modified: 2011-09-21
The exponential growth of astronomical data collected by both ground-based and space-borne instruments has fostered the growth of Astroinformatics: a new discipline lying at the intersection between astronomy, applied computer science, and information and communication technologies (ICT). At the very heart of Astroinformatics is a complex set of methodologies usually called Data Mining (DM) or Knowledge Discovery in Data Bases (KDD). In the astronomical domain, DM/KDD are still at a very early stage of use, even though new methods and tools are being continuously deployed in order to cope with the Massive Data Sets (MDS) that can only grow in the future. In this paper, we briefly outline some general problems encountered when applying DM/KDD methods to astrophysical problems, and describe the DAME (DAta Mining & Exploration) web application. While specifically tailored to work on MDS, DAME can also be effectively applied to smaller data sets. As an illustration, we describe two applications of DAME to different problems: the identification of candidate globular clusters in external galaxies, and the classification of active galactic nuclei (AGN). We believe that tools and services of this nature will become increasingly necessary for data-intensive astronomy (and indeed all sciences) in the 21st century.
[82]  oai:arXiv.org:1010.4843  [pdf] - 275635
DAME: A Web Oriented Infrastructure for Scientific Data Mining & Exploration
Comments: 16 pages, 9 figures, software available at http://voneural.na.infn.it/beta_info.html
Submitted: 2010-10-23, last modified: 2010-12-07
Nowadays, many scientific areas share the same need of being able to deal with massive and distributed datasets and to perform complex knowledge extraction tasks on them. This simple consideration is behind the international efforts to build virtual organizations such as, for instance, the Virtual Observatory (VObs). DAME (DAta Mining & Exploration) is an innovative, general purpose, Web-based, VObs compliant, distributed data mining infrastructure specialized in Massive Data Sets exploration with machine learning methods. Initially fine tuned to deal with astronomical data only, DAME has evolved into a general purpose platform which has also found applications in other domains of human endeavor. We present the products and a short outline of a science case, together with a detailed description of the main features available in the beta release of the web application.
[83]  oai:arXiv.org:1010.3796  [pdf] - 243803
Mining Knowledge in Astrophysical Massive Data Sets
Comments: Pages 845-849, 1st International Conference on Frontiers in Diagnostics Technologies
Submitted: 2010-10-19
Modern scientific data mainly consist of huge datasets gathered by a very large number of techniques and stored in very diversified and often incompatible data repositories. More generally, in the e-science environment, it is considered a critical and urgent requirement to integrate services across distributed, heterogeneous, dynamic "virtual organizations" formed by different resources within a single enterprise. In the last decade, Astronomy has become an immensely data rich field due to the evolution of detectors (plates to digital to mosaics), telescopes and space instruments. The Virtual Observatory approach consists of the federation, under common standards, of all astronomical archives available worldwide, as well as of data analysis, data mining and data exploration applications. The main drive behind this effort is that, once the infrastructure is completed, it will allow a new type of multi-wavelength, multi-epoch science which can only barely be imagined. Data Mining, or Knowledge Discovery in Databases, while being the main methodology to extract the scientific information contained in such MDS (Massive Data Sets), poses crucial problems since it has to orchestrate complex issues such as transparent access to different computing environments, scalability of algorithms, reusability of resources, etc. In the present paper we summarize the present status of the MDS in the Virtual Observatory and what is currently being done and planned to bring advanced Data Mining methodologies to it in the case of the DAME (DAta Mining & Exploration) project.
[84]  oai:arXiv.org:1007.1455  [pdf] - 200714
A decline and fall in the future of Italian Astronomy?
Antonelli, Angelo; Antonuccio-Delogu, Vincenzo; Baruffolo, Andrea; Benetti, Stefano; Bianchi, Simone; Biviano, Andrea; Bonafede, Annalisa; Bondi, Marco; Borgani, Stefano; Bragaglia, Angela; Brescia, Massimo; Brucato, John Robert; Brunetti, Gianfranco; Brunino, Riccardo; Cantiello, Michele; Casasola, Viviana; Cassano, Rossella; Cellino, Alberto; Cescutti, Gabriele; Cimatti, Andrea; Comastri, Andrea; Corbelli, Edvige; Cresci, Giovanni; Criscuoli, Serena; Cristiani, Stefano; Cupani, Guido; De Grandi, Sabrina; D'Elia, Valerio; Del Santo, Melania; De Lucia, Gabriella; Desidera, Silvano; Di Criscienzo, Marcella; D'Odorico, Valentina; Dotto, Elisabetta; Fontanot, Fabio; Gai, Mario; Gallerani, Simona; Gallozzi, Stefano; Garilli, Bianca; Gioia, Isabella; Girardi, Marisa; Gitti, Myriam; Granato, Gianluigi; Gratton, Raffaele; Grazian, Andrea; Gruppioni, Carlotta; Hunt, Leslie; Leto, Giuseppe; Israel, Gianluca; Magliocchetti, Manuela; Magrini, Laura; Mainetti, Gabriele; Mannucci, Filippo; Marconi, Alessandro; Marelli, Martino; Maris, Michele; Matteucci, Francesca; Meneghetti, Massimo; Mennella, Aniello; Mercurio, Amata; Molendi, Silvano; Monaco, Pierluigi; Moretti, Alessia; Murante, Giuseppe; Nicastro, Fabrizio; Orio, Marina; Paizis, Adamantia; Panessa, Francesca; Pasian, Fabio; Pentericci, Laura; Pozzetti, Lucia; Rossetti, Mariachiara; Santos, Joana S.; Saro, Alexandro; Schneider, Raffaella; Silva, Laura; Silvotti, Roberto; Smart, Richard; Tiengo, Andrea; Tornatore, Luca; Tozzi, Paolo; Trussoni, Edoardo; Valentinuzzi, Tiziano; Vanzella, Eros; Vazza, Franco; Vecchiato, Alberto; Venturi, Tiziana; Vianello, Giacomo; Viel, Matteo; Villalobos, Alvaro; Viotto, Valentina; Vulcani, Benedetta
Comments: Also available at http://adoptitaastronom.altervista.org/index.html
Submitted: 2010-07-08
On May 27th 2010, the Italian astronomical community learned with concern that the National Institute for Astrophysics (INAF) was going to be suppressed, and that its employees were going to be transferred to the National Research Council (CNR). It was not clear if this applied to all employees (i.e. also to researchers hired on short-term contracts), and how this was going to happen in practice. In this letter, we give a brief historical overview of INAF and present a short chronicle of the few eventful days that followed. Starting from this example, we then comment on the current situation and prospects of astronomical research in Italy.
[85]  oai:arXiv.org:0807.0967  [pdf] - 14254
Astrophysics in S.Co.P.E
Submitted: 2008-07-07
S.Co.P.E. is one of the four projects funded by the Italian Government in order to provide Southern Italy with a distributed computing infrastructure for fundamental science. Besides being aimed at building the infrastructure, S.Co.P.E. is also actively pursuing research in several areas, among which are astrophysics and observational cosmology. We briefly summarize the most significant results obtained in the first two years of the project and related to the development of middleware and Data Mining tools for the Virtual Observatory.
[86]  oai:arXiv.org:0806.1144  [pdf] - 13320
GRID-Launcher v.1.0
Comments: Contributed, Data Centre Alliance Workshops: GRID and the Virtual Observatory, April 9-11 Munich, to appear in Mem. SAIt
Submitted: 2008-06-06
GRID-launcher-1.0 was built within the VO-Tech framework as a software interface between the UK-ASTROGRID and a generic GRID infrastructure, in order to allow any ASTROGRID user to launch computing-intensive tasks on the GRID from the ASTROGRID Workbench or Desktop. Although of general applicability, so far the Grid-Launcher has been tested on a few selected software packages (VONeural-MLP, VONeural-SVM, Sextractor and SWARP) and on the SCOPE-GRID.
[87]  oai:arXiv.org:0806.1006  [pdf] - 13294
The VO-Neural project: recent developments and some applications
Comments: Contributed, Data Centre Alliance Workshops: GRID and the Virtual Observatory, April 9-11 Munich, to appear in Mem. SAIt
Submitted: 2008-06-05
VO-Neural is the natural evolution of the Astroneural project, which was started in 1994 with the aim of implementing a suite of neural tools for data mining in astronomical massive data sets. Unlike its ancestor, which was implemented under Matlab, VO-Neural is written in C++, is object oriented, and is specifically tailored to work in distributed computing architectures. We discuss the current status of implementation of VO-Neural, present an application to the classification of Active Galactic Nuclei, and outline the ongoing work to improve the functionalities of the package.
[88]  oai:arXiv.org:astro-ph/0701135  [pdf] - 88264
Steps towards a map of the nearby universe
Comments: 3 pages, 1 figure. To appear in Nucl Phys. B, in the proceedings of the NOW-2006 (Neutrino Oscillation Workshop - 2006), R. Fogli et al. eds
Submitted: 2007-01-05, last modified: 2007-01-28
We present a new analysis of the Sloan Digital Sky Survey data aimed at producing a detailed map of the nearby (z < 0.5) universe. Using neural networks trained on the available spectroscopic base of knowledge, we derived distance estimates for about 30 million galaxies distributed over ca. 8,000 sq. deg. We also used unsupervised clustering tools, developed in the framework of the VO-Tech project, to investigate the possibility of understanding the nature of each object present in the field and, in particular, of producing a list of candidate AGNs and QSOs.
[89]  oai:arXiv.org:astro-ph/0701621  [pdf] - 88750
Statistical analysis of the trigger algorithm for the NEMO project
Comments: Published in the Proceedings of the "I Workshop of Astronomy and Astrophysics for Students", Eds. N.R. Napolitano & M. Paolillo, Naples, 19-20 April 2006 (astro-ph/0701577)
Submitted: 2007-01-22
We discuss the performance of a trigger implemented for the planned neutrino telescope NEMO. This trigger seems capable of discriminating between the signal and the strong background introduced by atmospheric muons and by the beta decay of the K-40 nuclei present in the water. The performance of the trigger, as evaluated on simulated data, is analyzed in detail.
[90]  oai:arXiv.org:astro-ph/0701622  [pdf] - 88751
Implementation of the trigger algorithm for the NEMO project
Comments: Published in the Proceedings of the "I Workshop of Astronomy and Astrophysics for Students", Eds. N.R. Napolitano & M. Paolillo, Naples, 19-20 April 2006 (astro-ph/0701577)
Submitted: 2007-01-22
We describe the implementation of a trigger algorithm specifically tailored to the characteristics of the neutrino telescope NEMO. Extensive testing against realistic simulations shows that, by making use of the uncorrelated nature of the noise produced mainly by the beta decay of K-40, this trigger is capable of discriminating among different types of muonic events.
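The abstract above does not spell out the trigger logic. As a generic illustration of how the uncorrelated nature of noise hits can be exploited, the sketch below applies a sliding time-window coincidence test: random hits rarely pile up within a short window, whereas hits from a single physical event tend to. This is an assumed, textbook-style technique and not the NEMO algorithm; the function name, the window length and the hit threshold are all hypothetical.

```python
def coincidence_trigger(hit_times, window, min_hits):
    """Return True if at least min_hits hits fall within any interval of length window.

    hit_times: iterable of hit timestamps (any consistent time unit).
    window, min_hits: hypothetical tuning parameters, not values from the paper.
    """
    times = sorted(hit_times)
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        # Number of hits currently inside the window.
        if end - start + 1 >= min_hits:
            return True
    return False


# Hypothetical hit times: three hits cluster near t = 205-209, the rest are isolated.
print(coincidence_trigger([0, 100, 205, 207, 209, 500], window=10, min_hits=3))
```

Raising min_hits or shortening window lowers the rate of accidental coincidences from uncorrelated hits, at the cost of rejecting more faint genuine events.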
[91]  oai:arXiv.org:astro-ph/0701137  [pdf] - 88266
The use of neural networks to probe the structure of the nearby universe
Comments: 7 pages, 5 figures. To appear in the proceedings of the Astronomical Data Analysis -IV workshop held in Marseille in 2006. J.L. Starck et al. eds
Submitted: 2007-01-05
In the framework of the European VO-Tech project, we are implementing new machine learning methods specifically tailored to match the needs of astronomical data mining. In this paper, we briefly present the methods and discuss an application to the Sloan Digital Sky Survey public data set. In particular, we discuss some preliminary results on the 3-D taxonomy of the nearby (z < 0.5) universe. Using neural networks trained on the available spectroscopic base of knowledge, we derived distance estimates for ca. 30 million galaxies distributed over 8,000 sq. deg. We also use unsupervised clustering tools to investigate whether it is possible to characterize the nature of each object in broad morphological bins and to produce a reliable list of candidate AGNs and QSOs.
[92]  oai:arXiv.org:astro-ph/0501598  [pdf] - 70710
VST - VLT Survey Telescope Integration Status
Comments: 2 pages, 2 figures, conference
Submitted: 2005-01-27
The VLT Survey Telescope (VST) is a 2.6 m aperture, wide-field, UV to I facility, to be installed at the European Southern Observatory (ESO) on Cerro Paranal, Chile. VST was primarily intended to complement the observing capabilities of the VLT with wide-angle imaging for detecting and pre-characterising sources for further observations with the VLT.
[93]  oai:arXiv.org:astro-ph/0111139  [pdf] - 45905
The VST telescope control software in the ESO VLT environment
Comments: 3 pages, 2 figures, ICALEPCS 2001 Conference, PSN#THAP051
Submitted: 2001-11-07, last modified: 2001-12-06
The VST (VLT Survey Telescope) is a 2.6 m Alt-Az telescope to be installed at Mount Paranal in Chile, at the European Southern Observatory (ESO) site. The VST is a wide-field imaging facility planned to supply databases for the ESO Very Large Telescope (VLT) science and carry out stand-alone observations in the UV to I spectral range. This paper will focus mainly on control software aspects, describing the VST software architecture in the context of the whole ESO VLT control concept. The general architecture and the main components of the control software will be described.
[94]  oai:arXiv.org:astro-ph/0111142  [pdf] - 45908
Active optics control of the VST telescope with the CAN field-bus
Comments: 3 pages, 3 figures, ICALEPCS 2001 Conference, PSN#TUAP057
Submitted: 2001-11-07, last modified: 2001-12-06
The VST (VLT Survey Telescope) is a 2.6 m class Alt-Az telescope to be installed at Mount Paranal in the Atacama desert, Chile, at the European Southern Observatory (ESO) site. The VST is a wide-field imaging facility planned to supply databases for the ESO Very Large Telescope (VLT) science and carry out stand-alone observations in the UV to I spectral range. This paper will focus on the distributed control system of the active optics, based on CAN bus and PIC microcontrollers. Both axial and radial pads of the primary mirror will be equipped with astatic lever supports controlled by microcontroller units. The same CAN bus + microcontroller boards approach will be used for the temperature acquisition modules.
[95]  oai:arXiv.org:astro-ph/0111143  [pdf] - 45909
Integration of the VIMOS control system
Comments: 3 pages, 3 figures, ICALEPCS 2001 Conference, PSN#TUBT003
Submitted: 2001-11-07, last modified: 2001-12-06
The VIRMOS consortium of French and Italian Institutes (PI: O. Le Fevre, co-PI: G. Vettolani) is manufacturing two wide-field imaging multi-object spectrographs for the European Southern Observatory Very Large Telescope (VLT), with emphasis on the ability to carry out spectroscopic surveys of large numbers of sources: the VIsible Multi-Object Spectrograph, VIMOS, and the Near InfraRed Multi-Object Spectrograph, NIRMOS. There are 52 motors to be controlled in parallel in the spectrograph, making VIMOS a complex machine to handle. This paper will focus on the description of the control system, designed according to the ESO VLT standard control concepts, and on some integration issues and problem solving strategies.