Normalized to: Rohde, D.
[1]
oai:arXiv.org:astro-ph/0605216 [pdf] - 81904
Matching Catalogues by Probabilistic Pattern Classification
Submitted: 2006-05-09
We consider the statistical problem of catalogue matching from a machine
learning perspective with the goal of producing probabilistic outputs, and
using all available information. A framework is provided that unifies two
existing approaches to producing probabilistic outputs in the literature, one
based on combining distribution estimates and the other based on combining
probabilistic classifiers. We apply both of these to the problem of matching
the HIPASS radio catalogue with large positional uncertainties to the much
denser SuperCOSMOS catalogue with much smaller positional uncertainties. We
demonstrate the utility of probabilistic outputs by a controllable completeness
and efficiency trade-off and by identifying objects that have high probability
of being rare. Finally, possible biasing effects in the output of these
classifiers are also highlighted and discussed.
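The completeness and efficiency trade-off mentioned above can be illustrated with a minimal sketch. It assumes that completeness means the fraction of true matches that are accepted and that efficiency means the fraction of accepted matches that are true; the abstract does not define either term. The function name, the probabilities and the labels are hypothetical and are not taken from the paper.

```python
def completeness_efficiency(probs, labels, threshold):
    """Accept every candidate whose match probability is >= threshold.

    probs  : match probabilities produced by some classifier (hypothetical)
    labels : 1 for a true match, 0 for a false one
    Returns (completeness, efficiency) under the definitions assumed above.
    """
    accepted = [label for p, label in zip(probs, labels) if p >= threshold]
    true_total = sum(labels)
    true_accepted = sum(accepted)
    # Completeness: share of all true matches that were accepted.
    completeness = true_accepted / true_total if true_total else 0.0
    # Efficiency: share of accepted candidates that are true matches.
    efficiency = true_accepted / len(accepted) if accepted else 1.0
    return completeness, efficiency


# Illustrative values only.
probs = [0.95, 0.80, 0.60, 0.40, 0.10]
labels = [1, 1, 0, 1, 0]

for threshold in (0.3, 0.5, 0.9):
    c, e = completeness_efficiency(probs, labels, threshold)
    print(f"threshold={threshold:.1f} completeness={c:.2f} efficiency={e:.2f}")
```

Raising the threshold accepts fewer candidates, which in this toy data raises efficiency at the cost of completeness. A hard yes/no classifier offers no such adjustable threshold.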
[2]
oai:arXiv.org:astro-ph/0505591 [pdf] - 73361
The HIPASS Catalogue: III - Optical Counterparts & Isolated Dark
Galaxies
Doyle, Marianne T.;
Drinkwater, M. J.;
Rohde, D. J.;
Pimbblet, K. A.;
Read, M.;
Meyer, M. J.;
Zwaan, M. A.;
Ryan-Weber, E.;
Stevens, J.;
Koribalski, B. S.;
Webster, R. L.;
Staveley-Smith, L.;
Barnes, D. G.;
Howlett, M.;
Kilborn, V. A.;
Waugh, M.;
Pierce, M. J.;
Bhathal, R.;
de Blok, W. J. G.;
Disney, M. J.;
Ekers, R. D.;
Freeman, K. C.;
Garcia, D. A.;
Gibson, B. K.;
Harnett, J.;
Henning, P. A.;
Jerjen, H.;
Kesteven, M. J.;
Knezek, P. M.;
Mader, S.;
Marquarding, M.;
Minchin, R. F.;
O'Brien, J.;
Oosterloo, T.;
Price, R. M.;
Putman, M. E.;
Ryder, S. D.;
Sadler, E. M.;
Stewart, I. M.;
Stootman, F.;
Wright, A. E.
Submitted: 2005-05-30
We present the largest catalogue to date of optical counterparts for HI
radio-selected galaxies, HOPCAT. Of the 4315 HI radio-detected sources from the
HI Parkes All Sky Survey (HIPASS) catalogue, we find optical counterparts for
3618 (84%) galaxies. Of these, 1798 (42%) have confirmed optical velocities and
848 (20%) are single matches without confirmed velocities. Some galaxy matches
are members of galaxy groups. From these multiple galaxy matches, 714 (16%)
have confirmed optical velocities and a further 258 (6%) galaxies are without
confirmed velocities. For 481 (11%), multiple galaxies are present but no
single optical counterpart can be chosen and 216 (5%) have no obvious optical
galaxy present. Most of these 'blank fields' are in crowded fields along the
Galactic plane or have high extinctions.
Isolated 'Dark galaxy' candidates are investigated using an extinction cut of
ABj < 1 mag and the blank fields category. Of the 3692 galaxies with an ABj
extinction < 1 mag, only 13 are also blank fields. Of these, 12 are eliminated,
either by follow-up Parkes observations or because they lie in crowded fields. The
remaining one has a low surface brightness optical counterpart. Hence, no
isolated optically dark galaxies have been found within the limits of the
HIPASS survey.
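The category counts quoted in entry [2] can be checked for consistency with a short sketch. The dictionary keys are shorthand labels introduced here for illustration. Computing the percentages against the 4315 sources is an assumption about how the quoted percentages were derived.

```python
TOTAL_SOURCES = 4315  # HI radio-detected sources quoted in the abstract

# Sources with an optical counterpart (labels are illustrative shorthand).
matched = {
    "confirmed velocity": 1798,
    "single match, no confirmed velocity": 848,
    "group member, confirmed velocity": 714,
    "group member, no confirmed velocity": 258,
}

# Sources without a single chosen optical counterpart.
unmatched = {
    "multiple galaxies, none chosen": 481,
    "blank field": 216,
}

total_matched = sum(matched.values())
total_unmatched = sum(unmatched.values())

print("matched:", total_matched)
print("all categories:", total_matched + total_unmatched)

# Share of each category relative to the full catalogue (assumed denominator).
for name, count in {**matched, **unmatched}.items():
    print(f"{name}: {100.0 * count / TOTAL_SOURCES:.1f}%")
```

The four matched categories add up to the 3618 counterparts, and all six categories together account for the full 4315 sources.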
[3]
oai:arXiv.org:astro-ph/0504013 [pdf] - 72110
Applying Machine Learning to Catalogue Matching in Astrophysics
Submitted: 2005-04-01
We present the results of applying automated machine learning techniques to
the problem of matching different object catalogues in astrophysics. In this
study we take two partially matched catalogues where one of the two catalogues
has a large positional uncertainty. The two catalogues we used here were taken
from the HI Parkes All Sky Survey (HIPASS) and the SuperCOSMOS optical survey.
Previous work had matched 44% (1887 objects) of HIPASS to the SuperCOSMOS
catalogue.
A supervised learning algorithm was then applied to construct a model of the
matched portion of our catalogue. Validation of the model shows that we
achieved a good classification performance (99.12% correct).
Applying this model to the unmatched portion of the catalogue found 1209 new
matches. This increases the catalogue size from 1887 matched objects to 3096.
The combination of these procedures yields a catalogue that is 72% matched.
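The matched fractions quoted in entry [3] can be reproduced with a short calculation. This sketch assumes a catalogue size of 4315 HIPASS sources, a figure taken from entry [2] rather than from this abstract. The variable and function names are illustrative.

```python
HIPASS_SOURCES = 4315      # assumption: catalogue size quoted in entry [2]
previously_matched = 1887  # matches from previous work
new_matches = 1209         # matches found by the supervised model

total_matched = previously_matched + new_matches


def matched_percent(count, total=HIPASS_SOURCES):
    """Return count as a percentage of the assumed catalogue size."""
    return 100.0 * count / total


print(total_matched)
print(f"before: {matched_percent(previously_matched):.1f}%")
print(f"after:  {matched_percent(total_matched):.1f}%")
```

Under that assumption the two percentages round to the 44% and 72% quoted in the abstract.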