Normalized to: Thuillard, M.
[1]
oai:arXiv.org:1508.06756 [pdf] - 1267442
Multivariate Approaches to Classification in Extragalactic Astronomy
Submitted: 2015-08-27
Clustering objects into synthetic groups is a natural activity of any
science. Astrophysics is not an exception and is now facing a deluge of data.
For galaxies, the century-old Hubble classification and the Hubble tuning
fork are still widely used, together with numerous mono- or bivariate
classifications, most often made by eye. However, a classification must be
driven by the data, and sophisticated multivariate statistical tools are used
more and more often. In this paper we review these different approaches in
order to situate them in the general context of unsupervised and supervised
learning. We emphasize the astrophysical outcomes of these studies to show that
multivariate analyses provide an obvious path toward a renewal of our
classification of galaxies and are invaluable tools to investigate the physics
and evolution of galaxies.
[2]
oai:arXiv.org:1206.3690 [pdf] - 1124165
A six-parameter space to describe galaxy diversification
Submitted: 2012-06-16, last modified: 2012-07-04
Galaxy diversification proceeds by transforming events like accretion,
interaction or mergers. These explain the formation and evolution of galaxies
that can now be described with many observables. Multivariate analyses are the
obvious tools to tackle the datasets and understand the differences between
different kinds of objects. However, depending on the method used,
redundancies, incompatibilities or subjective choices of the parameters can
negate the usefulness of such analyses. The behaviour of the available parameters
should be analysed before an objective reduction of dimensionality and
subsequent clustering analyses can be undertaken, especially in an evolutionary
context. We study a sample of 424 early-type galaxies described by 25
parameters, ten of which are Lick indices, to identify the most structuring
parameters and determine an evolutionary classification of these objects. Four
independent statistical methods are used to investigate the discriminant
properties of the observables and the partitioning of the 424 galaxies:
Principal Component Analysis, K-means cluster analysis, Minimum Contradiction
Analysis and Cladistics. (abridged)
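
As an illustration of two of the methods named above, the following is a minimal sketch of Principal Component Analysis followed by K-means clustering, written with NumPy only. Everything in it is an assumption made for illustration: the data are synthetic (two artificial groups described by six parameters), the function names `pca` and `kmeans` are hypothetical, the two methods are chained here only for brevity, and the deterministic farthest-point initialisation is a choice of this sketch. None of this is taken from the paper.

```python
import numpy as np

def pca(X, n_components):
    """Project standardized data onto its first principal components."""
    # Standardize each parameter so that units do not dominate the analysis.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T
    explained = S**2 / np.sum(S**2)
    return scores, explained[:n_components]

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point start (assumption)."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Assign each object to its nearest center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center as the mean of its members.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for a galaxy table: 100 objects, 6 parameters (hypothetical).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 6)), rng.normal(10.0, 1.0, (50, 6))])

scores, explained = pca(X, n_components=2)
labels, centers = kmeans(scores, k=2)
```

On real data, the number of retained components and the number of clusters would have to be chosen from the data rather than fixed in advance as they are here.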
[3]
oai:arXiv.org:0905.2481 [pdf] - 24302
Phylogenetic Applications of the Minimum Contradiction Approach on
Continuous Characters
Submitted: 2009-05-15
We describe the conditions under which a set of continuous variables or
characters can be described as an X-tree or a split network. A distance matrix
corresponds exactly to a split network or a valued X-tree if, after ordering of
the taxa, the variable values can be embedded into a function with at most one
local maximum and one local minimum that crosses any horizontal line at most
twice. In real applications, the order of the taxa best satisfying the above
conditions can be obtained using the Minimum Contradiction method. This
approach is applied to two sets of continuous characters. The first set
corresponds to craniofacial landmarks in Hominids. The contradiction matrix is
used to identify possible tree structures and some alternatives when they
exist. We explain how to discover the main structuring characters in a tree.
The second set consists of a sample of 100 galaxies. This second example shows
how to discretize the continuous variables describing physical properties of
the galaxies without disrupting the underlying tree structure.
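
The horizontal-line part of the condition stated in this abstract can be checked numerically for a single character. The sketch below is only an illustration under stated assumptions: it treats the ordered taxa as a linear (not circular) sequence, it tests only the "crosses any horizontal line at most twice" requirement, and the function names `max_line_crossings` and `satisfies_line_condition` are hypothetical. It is not the Minimum Contradiction method itself, which searches for the best order of the taxa.

```python
def max_line_crossings(values):
    """Largest number of times the ordered values cross any horizontal line.

    `values` holds one continuous character, listed in the chosen taxon order.
    Assumption: the order is treated as linear, and consecutive values are
    joined by straight segments.
    """
    # Only lines lying strictly between two distinct observed levels matter;
    # the midpoint between neighbouring levels is one such line.
    levels = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(levels, levels[1:])]

    def crossings(t):
        # A segment crosses the line when its endpoints lie on opposite sides.
        return sum((u - t) * (v - t) < 0 for u, v in zip(values, values[1:]))

    return max((crossings(t) for t in thresholds), default=0)

def satisfies_line_condition(values):
    """True if no horizontal line is crossed more than twice."""
    return max_line_crossings(values) <= 2

rising_then_falling = [1, 3, 5, 4, 2]   # single peak along the order
zigzag = [1, 5, 2, 6, 1]                # two peaks along the order
```

Here `satisfies_line_condition(rising_then_falling)` holds, while `zigzag` fails because a line at height 3.5 is crossed four times; under this reading, a failing character indicates that the chosen order conflicts with a tree-like description for that character.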