Normalized to: Alger, M.
[1]
oai:arXiv.org:1906.02864 [pdf] - 1975293
Radio Galaxy Zoo: Unsupervised Clustering of Convolutionally
Auto-encoded Radio-astronomical Images
Ralph, Nicholas O.;
Norris, Ray P.;
Fang, Gu;
Park, Laurence A. F.;
Galvin, Timothy J.;
Alger, Matthew J.;
Andernach, Heinz;
Lintott, Chris;
Rudnick, Lawrence;
Shabala, Stanislav;
Wong, O. Ivy
Submitted: 2019-06-06
This paper demonstrates a novel and efficient unsupervised clustering method
that combines a Self-Organising Map (SOM) with a convolutional autoencoder.
The rapidly increasing volume of radio-astronomical data has
increased demand for machine learning methods as solutions to classification
and outlier detection. Major astronomical discoveries are unplanned and found
in the unexpected, which makes unsupervised machine learning highly desirable
because it operates without assumptions or labelled training data. Our approach
shows that SOM training time is drastically reduced, and that high-level
features can be clustered, by training on auto-encoded feature vectors instead
of raw images.
Our results demonstrate that this method can accurately separate outliers on a
SOM using neighbourhood similarity and K-means clustering of
radio-astronomical feature complexity. We present this method as a powerful
new approach to data exploration by providing a detailed understanding of the
morphology and relationships of Radio Galaxy Zoo (RGZ) dataset image features
which can be applied to new radio survey data.
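
The idea of training a SOM on compact feature vectors rather than raw pixels can be illustrated with a minimal sketch. It is not the authors' implementation: the autoencoder and the K-means step are omitted, the feature vectors are assumed to be precomputed, and the function names, grid size, learning-rate schedule and neighbourhood schedule are all hypothetical choices made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid index (row, col) of the node whose weights are closest to x."""
    best, best_d = None, float("inf")
    for idx, w in weights.items():
        d = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if d < best_d:
            best, best_d = idx, d
    return best


def train_som(features, rows=3, cols=3, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small SOM on feature vectors (e.g. autoencoder latent codes).

    All hyperparameters here are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(features[0])
    # One weight vector per grid node, initialised uniformly in [0, 1).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.3)    # shrinking neighbourhood radius
        for x in rng.sample(features, len(features)):  # shuffled pass over the data
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights
```

Because each update touches only a short feature vector instead of every pixel of an image, the cost per training step scales with the latent dimension, which is one plausible reading of the reduced training time reported above.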
[2]
oai:arXiv.org:1904.02876 [pdf] - 1862368
Radio Galaxy Zoo: Knowledge Transfer Using Rotationally Invariant
Self-Organising Maps
Galvin, T. J.;
Huynh, M.;
Norris, R. P.;
Wang, X. R.;
Hopkins, E.;
Wong, O. I.;
Shabala, S.;
Rudnick, L.;
Alger, M. J.;
Polsterer, K. L.
Submitted: 2019-04-05
With the advent of large-scale surveys, the manual analysis and classification
of individual radio source morphologies is rendered impossible, as existing
approaches do not scale. The analysis of complex morphological features in the
spatial domain is a particularly important task. Here we discuss the challenges
of transferring crowdsourced labels obtained from the Radio Galaxy Zoo project
and introduce a proper transfer mechanism via quantile random forest
regression. By using parallelized rotation- and flipping-invariant Kohonen
maps, image cubes of Radio Galaxy Zoo selected galaxies formed from the FIRST
radio continuum and WISE infrared all-sky surveys are first projected down to a
two-dimensional embedding in an unsupervised way. This embedding can be seen as
a discretised space of shapes with the coordinates reflecting morphological
features as expressed by the automatically derived prototypes. We find that
these prototypes have reconstructed physically meaningful processes across
two-channel images at radio and infrared wavelengths in an unsupervised manner. In
the second step, images are compared with those prototypes to create a
heat-map, which is the morphological fingerprint of each object and the basis
for transferring the user-generated labels. These heat-maps reduce the
feature space by a factor of 248 and can be used as the basis for
subsequent ML methods. Using an ensemble of decision trees we achieve upwards
of 85.7% and 80.7% accuracy when predicting the number of components and peaks
in an image, respectively, using these heat-maps. We also question the
currently used discrete classification schema and introduce a continuous scale
that better reflects the uncertainty in transition between two classes, caused
by sensitivity and resolution limits.
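
The heat-map step described above, in which an image is compared against every prototype under rotations and flips, might be sketched as follows. This is a simplified illustration under stated assumptions, not the method used in the paper: only quarter-turn rotations are considered, images are single-channel square grids stored as nested lists, Euclidean distance is assumed as the similarity measure, and all function names are hypothetical.

```python
import math


def rot90(img):
    """Rotate a square 2-D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip(img):
    """Mirror a 2-D list left to right."""
    return [row[::-1] for row in img]


def orientations(img):
    """Yield the 8 orientations: 4 quarter-turn rotations, each with and without a flip."""
    current = img
    for _ in range(4):
        yield current
        yield flip(current)
        current = rot90(current)


def distance(a, b):
    """Euclidean distance between two equally sized 2-D lists."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def invariant_distance(img, prototype):
    """Smallest distance to the prototype over all considered orientations."""
    return min(distance(o, prototype) for o in orientations(img))


def heat_map(img, prototypes):
    """Distance of img to each prototype on the map grid (a list of rows of prototypes)."""
    return [[invariant_distance(img, p) for p in row] for row in prototypes]
```

The resulting grid has one value per prototype rather than one per pixel, which is how a fingerprint of this kind can be much smaller than the image cube it summarises.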
[3]
oai:arXiv.org:1805.12008 [pdf] - 1775562
Radio Galaxy Zoo: ClaRAN - A Deep Learning Classifier for Radio
Morphologies
Wu, Chen;
Wong, O. Ivy;
Rudnick, Lawrence;
Shabala, Stanislav S.;
Alger, Matthew J.;
Banfield, Julie K.;
Ong, Cheng Soon;
White, Sarah V.;
Garon, Avery F.;
Norris, Ray P.;
Andernach, Heinz;
Tate, Jean;
Lukic, Vesna;
Tang, Hongming;
Schawinski, Kevin;
Diakogiannis, Foivos I.
Submitted: 2018-05-30, last modified: 2018-10-29
The upcoming next-generation large area radio continuum surveys can expect
tens of millions of radio sources, rendering the traditional method for radio
morphology classification through visual inspection unfeasible. We present
ClaRAN - Classifying Radio sources Automatically with Neural networks - a
proof-of-concept radio source morphology classifier based upon the Faster
Region-based Convolutional Neural Networks (Faster R-CNN) method.
Specifically, we train and test ClaRAN on the FIRST and WISE images from the
Radio Galaxy Zoo Data Release 1 catalogue. ClaRAN provides end users with
automated identification of radio source morphology classifications from a
simple input of a radio image and a counterpart infrared image of the same
region. ClaRAN is the first open-source, end-to-end radio source morphology
classifier that is capable of locating and associating discrete and extended
components of radio sources in a fast (< 200 milliseconds per image) and
accurate (>= 90 %) fashion. Future work will improve ClaRAN's relatively lower
success rates in dealing with multi-source fields and will enable ClaRAN to
identify sources on much larger fields without loss in classification accuracy.
[4]
oai:arXiv.org:1805.05540 [pdf] - 1684633
Radio Galaxy Zoo: Machine learning for radio source host galaxy
cross-identification
Submitted: 2018-05-14
We consider the problem of determining the host galaxies of radio sources by
cross-identification. This has traditionally been done manually, which will be
intractable for wide-area radio surveys like the Evolutionary Map of the
Universe (EMU). Automated cross-identification will be critical for these
future surveys, and machine learning may provide the tools to develop such
methods. We apply a standard approach from computer vision to
cross-identification, introducing one possible way of automating this problem,
and explore the pros and cons of this approach. We apply our method to the 1.4
GHz Australia Telescope Large Area Survey (ATLAS) observations of the Chandra
Deep Field South (CDFS) and the ESO Large Area ISO Survey South 1 (ELAIS-S1)
fields by cross-identifying them with the Spitzer Wide-area Infrared
Extragalactic (SWIRE) survey. We train our method with two sets of data: expert
cross-identifications of CDFS from the initial ATLAS data release and
crowdsourced cross-identifications of CDFS from Radio Galaxy Zoo. We found that
a simple strategy of cross-identifying a radio component with the nearest
galaxy performs comparably to our more complex methods, though our estimated
best-case performance is near 100 per cent. ATLAS contains 87 complex radio
sources that have been cross-identified by experts, so there are not enough
complex examples to learn how to cross-identify them accurately. Much larger
datasets are therefore required for training methods like ours. We also show
that training our method on Radio Galaxy Zoo cross-identifications gives
comparable results to training on expert cross-identifications, demonstrating
the value of crowdsourced training data.
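
The simple baseline mentioned above, cross-identifying a radio component with the nearest galaxy, can be sketched in a few lines. This is an illustrative reading of that strategy rather than the paper's code: positions are assumed to be (RA, Dec) pairs in degrees, the candidate catalogue is a plain dictionary, and the function and galaxy names are hypothetical.

```python
import math


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def nearest_galaxy(component, galaxies):
    """Return the name of the candidate host closest on the sky to a radio component.

    component: (ra, dec) in degrees.
    galaxies:  dict mapping a galaxy name to its (ra, dec) in degrees.
    """
    ra, dec = component
    return min(galaxies,
               key=lambda name: angular_separation(ra, dec, *galaxies[name]))


# Hypothetical catalogue entries, for illustration only.
candidates = {"galaxy_a": (52.000, -28.000), "galaxy_b": (52.010, -28.000)}
host = nearest_galaxy((52.008, -28.000), candidates)
```

A baseline like this needs no training data. It treats each radio component independently, so it has no way to associate several components of one extended source with a single host.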