Normalized to: Llorà, X.
[1]
oai:arXiv.org:astro-ph/0612471 - 316659
Robust Machine Learning Applied to Astronomical Datasets II: Quantifying
Photometric Redshifts for Quasars Using Instance-Based Learning
Submitted: 2006-12-17, last modified: 2007-03-22
We apply instance-based machine learning in the form of a k-nearest neighbor
algorithm to the task of estimating photometric redshifts for 55,746 objects
spectroscopically classified as quasars in the Fifth Data Release of the Sloan
Digital Sky Survey. We compare the results obtained to those from an empirical
color-redshift relation (CZR). In contrast to previously published results
using CZRs, we find that the instance-based photometric redshifts are assigned
with no regions of catastrophic failure. Remaining outliers are simply
scattered about the ideal relation, in a similar manner to the pattern seen in
the optical for normal galaxies at redshifts z < ~1. The instance-based
algorithm is trained on a representative sample of the data and
pseudo-blind-tested on the remaining unseen data. The variance between the
photometric and spectroscopic redshifts is sigma^2 = 0.123 +/- 0.002 (compared
to sigma^2 = 0.265 +/- 0.006 for the CZR), and 54.9 +/- 0.7%, 73.3 +/- 0.6%,
and 80.7 +/- 0.3% of the objects are within delta z < 0.1, 0.2, and 0.3,
respectively. We also match our sample to the Second Data Release of the Galaxy
Evolution Explorer (GALEX) legacy data, and the resulting 7,642 objects show a further
improvement, giving a variance of sigma^2 = 0.054 +/- 0.005, and 70.8 +/- 1.2%,
85.8 +/- 1.0%, and 90.8 +/- 0.7% of objects within delta z < 0.1, 0.2, and 0.3.
Training on the same matched objects using SDSS photometry alone gives a
variance of sigma^2 = 0.090 +/- 0.007, which shows that the improvement is
indeed due to the extra information provided by GALEX. Each set of results represents a
realistic standard for application to further datasets for which the spectra
are representative.
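
The procedure summarized above (train a k-nearest neighbor estimator on objects with known spectroscopic redshifts, test it on held-out objects, then report sigma^2 and the fractions within delta z < 0.1, 0.2, and 0.3) can be illustrated with a minimal sketch. This is not the authors' code. The function names knn_predict and redshift_metrics, the choice k = 5, the Euclidean metric, the unweighted neighbor mean, and the synthetic "colors" are all assumptions made for illustration, and sigma^2 is taken here to be the mean squared difference between photometric and spectroscopic redshifts, which may differ from the paper's exact definition.

```python
import numpy as np


def knn_predict(train_x, train_z, test_x, k=5):
    """Estimate each test redshift as the mean redshift of its k nearest
    training objects in feature (e.g. color) space. Assumes k <= len(train_x)."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    # Indices of the k closest training objects for each test object.
    nearest = np.argpartition(dists, k - 1, axis=1)[:, :k]
    return train_z[nearest].mean(axis=1)


def redshift_metrics(z_phot, z_spec, thresholds=(0.1, 0.2, 0.3)):
    """Return (sigma^2, fractions): the mean squared redshift difference
    (assumed definition) and the fraction of objects with |delta z| below
    each threshold."""
    dz = z_phot - z_spec
    variance = float(np.mean(dz ** 2))
    fractions = {t: float(np.mean(np.abs(dz) < t)) for t in thresholds}
    return variance, fractions


# Synthetic stand-in data (NOT SDSS or GALEX): three hypothetical "colors"
# that depend smoothly on redshift, plus Gaussian noise.
rng = np.random.default_rng(0)
z_spec = rng.uniform(0.0, 3.0, size=500)
colors = np.column_stack([np.sin(z_spec), np.cos(z_spec), 0.5 * z_spec])
colors += rng.normal(scale=0.05, size=colors.shape)

# Train on 80% of the objects and test on the remaining, unseen 20%.
order = rng.permutation(len(z_spec))
train_idx, test_idx = order[:400], order[400:]
z_phot = knn_predict(colors[train_idx], z_spec[train_idx], colors[test_idx], k=5)

sigma2, fractions = redshift_metrics(z_phot, z_spec[test_idx])
print(f"sigma^2 = {sigma2:.4f}")
for threshold, fraction in fractions.items():
    print(f"fraction with |delta z| < {threshold}: {fraction:.3f}")
```

The numbers printed by this sketch describe only the synthetic data and are not comparable to the values quoted in the abstract. The brute-force distance matrix is adequate for a few hundred objects; a sample of tens of thousands of quasars would normally call for a tree-based or otherwise accelerated neighbor search.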