Normalized to: Dan, G.
[1]
oai:arXiv.org:0802.0537 [pdf] - 9762
Support Vector Machines and Kd-tree for Separating Quasars from Large
Survey Databases
Submitted: 2008-02-04
We compare the performance of two automated classification algorithms,
k-dimensional tree (kd-tree) and support vector machines (SVMs), in separating
quasars from stars in the databases of the Sloan Digital Sky Survey (SDSS) and
the Two Micron All Sky Survey (2MASS) catalogs. The two algorithms are trained
on subsets of SDSS and 2MASS objects whose nature is known via spectroscopy. We
train the classifiers on different attribute combinations as input patterns,
using photometric data only, and present the classification results
obtained by the two methods. Performance metrics such as precision and
recall, true positive rate and true negative rate, F-measure, G-mean and
Weighted Accuracy are computed to evaluate the performance of the two
algorithms. The study shows that both kd-tree and SVMs are effective automated
algorithms to classify point sources. SVMs show slightly higher accuracy, but
kd-tree requires less computation time. Given different input patterns based on
various parameters (e.g. magnitudes, color information), we conclude that both
kd-tree and SVMs show better performance with fewer features. Our results also
indicate that the highest accuracy is obtained using the four colors (u-g, g-r,
r-i, i-z) and the r magnitude based on SDSS model magnitudes. The classifiers
trained by kd-tree and SVMs can be used to solve the
automated classification problems faced by the virtual observatory (VO);
moreover, both can be applied to the photometric preselection of quasar
candidates for large survey projects in order to optimize the efficiency of
telescopes.
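
The performance metrics named above can be illustrated with a minimal Python sketch. It assumes a binary labeling (1 = quasar, 0 = star) and the commonly used definitions of each metric; the abstract does not state the exact formulas or weights used, so the function names and the `weight` parameter of the weighted accuracy are hypothetical choices made here for illustration only.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1, weight=0.5):
    """Return the metrics listed in the abstract, using standard definitions.

    `weight` is an assumed parameter: the share given to the true positive
    rate in the weighted accuracy (the paper's value is not in the abstract).
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    tnr = tn / (tn + fp) if (tn + fp) else 0.0     # true negative rate

    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0
    g_mean = math.sqrt(recall * tnr)
    weighted_accuracy = weight * recall + (1 - weight) * tnr

    return {
        "precision": precision,
        "recall": recall,
        "true_positive_rate": recall,
        "true_negative_rate": tnr,
        "f_measure": f_measure,
        "g_mean": g_mean,
        "weighted_accuracy": weighted_accuracy,
    }


# Toy labels, invented for illustration (not data from the paper).
example_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
example_metrics = classification_metrics(example_true, example_pred)
```

G-mean and weighted accuracy combine the true positive and true negative rates, which makes them less sensitive than plain accuracy to an imbalance between the number of quasars and stars in a sample.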