Normalized to: Hattab, M.
[1]
oai:arXiv.org:1805.07435 [pdf] - 1813997
A case study of hurdle and generalized additive models in astronomy: the
escape of ionizing radiation
Submitted: 2018-05-18, last modified: 2019-01-13
The dark ages of the Universe end with the formation of the first generation
of stars residing in primeval galaxies. These objects were the first to produce
ultraviolet ionizing photons in a period when the cosmic gas changed from a
neutral state to an ionized one, known as the Epoch of Reionization (EoR). A
pivotal step towards understanding the EoR is to probe the intertwined relationship
between the fraction of ionizing photons able to escape dark haloes, also
known as the escape fraction ($f_{esc}$), and the physical properties of the
galaxy. This work develops a sound statistical model suited to account for
such non-linear relationships and the non-Gaussian nature of $f_{esc}$. The
model simultaneously estimates the probability that a given primordial galaxy
starts producing ionizing photons and the mean level of
$f_{esc}$ once production is triggered. The model was applied to the First Billion
Years simulation suite, from which we show that the baryonic fraction and the
rate of ionizing photons appear to have a larger impact on $f_{esc}$ than
previously thought. A naive univariate analysis of the same problem would
suggest smaller effects for these properties and a much larger impact for the
specific star formation rate, which is lessened after accounting for other
galaxy properties and non-linearities in the statistical model.
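The two-part structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single hypothetical covariate `x`, replaces the smooth (generalized additive) terms with straight lines, and models the non-zero $f_{esc}$ values by least squares on the logit scale. All function names, parameter values, and the synthetic data are invented for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, steps=2000, lr=0.5):
    """Fit P(y=1 | x) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for xi, yi in zip(x, y):
            resid = yi - sigmoid(a + b * xi)
            grad_a += resid
            grad_b += resid * xi
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def fit_line(x, y):
    """Ordinary least squares for y = c + d*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    d = sxy / sxx
    return my - d * mx, d


def fit_hurdle(x, f):
    """Two-part (hurdle) fit: is f > 0 at all, and how large is f when it is?"""
    # Part 1: occurrence model, fitted on every object.
    switched_on = [1 if fi > 0 else 0 for fi in f]
    occurrence = fit_logistic(x, switched_on)
    # Part 2: level model, fitted only on objects with f > 0 (logit scale).
    x_pos = [xi for xi, fi in zip(x, f) if fi > 0]
    z_pos = [math.log(fi / (1.0 - fi)) for fi in f if fi > 0]
    level = fit_line(x_pos, z_pos)
    return {"occurrence": occurrence, "level": level}


def predict(model, x):
    """Return (probability that f > 0, typical f given that f > 0)."""
    a, b = model["occurrence"]
    c, d = model["level"]
    return sigmoid(a + b * x), sigmoid(c + d * x)


def simulate(n=500, seed=1):
    """Synthetic data only: hypothetical covariate x, escape fraction f in [0, 1)."""
    rng = random.Random(seed)
    x, f = [], []
    for _ in range(n):
        xi = rng.uniform(-2.0, 2.0)
        on = rng.random() < sigmoid(1.5 * xi)
        fi = sigmoid(-1.0 + 0.8 * xi + rng.gauss(0.0, 0.3)) if on else 0.0
        x.append(xi)
        f.append(fi)
    return x, f


if __name__ == "__main__":
    x, f = simulate()
    model = fit_hurdle(x, f)
    print(model)
    print(predict(model, 0.5))
```

The point of the split is that the zeros and the positive values are governed by separate parameters, so a covariate can affect whether photons escape at all differently from how it affects the amount that escapes.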
[2]
oai:arXiv.org:1701.08748 [pdf] - 1935344
On the realistic validation of photometric redshifts, or why Teddy will
never be Happy
Submitted: 2017-01-30, last modified: 2017-03-20
Two of the main problems encountered in the development and accurate
validation of photometric redshift (photo-z) techniques are the lack of
spectroscopic coverage in feature space (e.g. colours and magnitudes) and the
mismatch between photometric error distributions associated with the
spectroscopic and photometric samples. Although these issues are well known,
there is currently no standard benchmark allowing a quantitative analysis of
their impact on the final photo-z estimation. In this work, we present two
galaxy catalogues, Teddy and Happy, built to enable a more demanding and
realistic test of photo-z methods. Using photometry from the Sloan Digital Sky
Survey and spectroscopy from a collection of sources, we constructed datasets
which mimic the biases between the underlying probability distributions of the
real spectroscopic and photometric samples. We demonstrate the potential of
these catalogues by submitting them to the scrutiny of different photo-z
methods, including machine learning (ML) and template fitting approaches.
Beyond the expected poor results from most ML algorithms for cases with missing
coverage in feature space, we were able to recognize the superiority of global
models in the same situation and the general failure across all types of
methods when incomplete coverage is combined with the presence of photometric
errors, a data situation which photo-z methods have not so far been trained to
handle and which must be addressed by future large scale surveys. Our
catalogues represent the first controlled environment allowing a
straightforward implementation of such tests. The data are publicly available
within the COINtoolbox (https://github.com/COINtoolbox/photoz_catalogues).
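The coverage problem described in this abstract can be sketched on toy data. The example below does not use the Teddy or Happy catalogues: it assumes a single hypothetical colour feature and an invented linear colour-redshift relation, which favours the global (linear) model by construction. It only illustrates why a local method such as k-nearest neighbours degrades once the test objects fall outside the colour range covered by the training set.

```python
import math
import random


def knn_predict(train_x, train_z, x, k=5):
    """Local estimator: average redshift of the k nearest training objects."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_z[i] for i in nearest) / k


def fit_line(x, z):
    """Global estimator: ordinary least squares for z = a + b*x."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    b = sxz / sxx
    return mz - b * mx, b


def rmse(pred, truth):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


def run_demo(seed=0):
    rng = random.Random(seed)

    def true_z(colour):
        # Hypothetical colour-redshift relation, for illustration only.
        return 0.2 + 0.5 * colour

    # "Spectroscopic" training set covers colours in [0, 1] only.
    train_c = [rng.uniform(0.0, 1.0) for _ in range(200)]
    train_z = [true_z(c) + rng.gauss(0.0, 0.02) for c in train_c]

    # Two "photometric" test sets: inside and outside the training coverage.
    samples = {
        "inside": [rng.uniform(0.0, 1.0) for _ in range(100)],
        "outside": [rng.uniform(1.0, 2.0) for _ in range(100)],
    }

    a, b = fit_line(train_c, train_z)
    results = {}
    for name, colours in samples.items():
        truth = [true_z(c) for c in colours]
        results[name] = {
            "knn": rmse([knn_predict(train_c, train_z, c) for c in colours], truth),
            "linear": rmse([a + b * c for c in colours], truth),
        }
    return results


if __name__ == "__main__":
    for name, scores in run_demo().items():
        print(name, scores)
```

In this toy setting the nearest-neighbour estimate saturates at the redshift of the reddest training objects, whereas the straight line extrapolates. Real colour-redshift relations are not linear, so the size of the gap here should not be read as a result.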
[3]
oai:arXiv.org:1603.06256 [pdf] - 1935289
Is the cluster environment quenching the Seyfert activity in elliptical
and spiral galaxies?
de Souza, R. S.;
Dantas, M. L. L.;
Krone-Martins, A.;
Cameron, E.;
Coelho, P.;
Hattab, M. W.;
de Val-Borro, M.;
Hilbe, J. M.;
Elliott, J.;
Hagen, A.
Submitted: 2016-03-20, last modified: 2016-07-06
We developed a hierarchical Bayesian model (HBM) to investigate how the
presence of Seyfert activity in galaxies relates to their environment, herein represented
by the galaxy cluster mass, $M_{200}$, and the normalized cluster-centric
distance, $r/r_{200}$. We achieved this by constructing an unbiased sample of
galaxies from the Sloan Digital Sky Survey, with morphological classifications
provided by the Galaxy Zoo Project. A propensity score matching approach is
introduced to control for the effects of confounding variables: stellar mass,
galaxy colour, and star formation rate. The connection between Seyfert-activity
and environmental properties in the de-biased sample is modelled within an HBM
framework using the so-called logistic regression technique, suitable for the
analysis of binary data (e.g., whether or not a galaxy hosts an AGN). Unlike
standard ordinary least squares fitting methods, our methodology naturally
allows modelling the probability of Seyfert-AGN activity in galaxies on their
natural scale, i.e. as a binary variable. Furthermore, we demonstrate how an
HBM can incorporate information on each particular galaxy morphological type in
a unified framework. In elliptical galaxies, our analysis indicates a strong
correlation of Seyfert-AGN activity with $r/r_{200}$, and a weaker correlation
with the mass of the host. In spiral galaxies these trends do not appear,
suggesting that the link between Seyfert activity and the properties of spiral
galaxies is independent of the environment.
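The matching step mentioned in this abstract can be sketched as follows. The abstract does not specify the matching algorithm, so the greedy one-to-one nearest-neighbour rule and the `caliper` tolerance below are assumptions. The sketch also assumes that propensity scores have already been estimated (for example by a logistic regression on stellar mass, colour, and star formation rate); the galaxy identifiers and scores are made up.

```python
def match_by_propensity(target, comparison, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    target, comparison: dicts mapping an object id to its propensity score.
    Returns a list of (target_id, comparison_id) pairs whose scores differ
    by at most `caliper`; target objects with no close match are dropped.
    """
    available = dict(comparison)  # copy, so the caller's dict is untouched
    pairs = []
    for tid, t_score in sorted(target.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining comparison object in propensity score.
        cid = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[cid] - t_score) <= caliper:
            pairs.append((tid, cid))
            del available[cid]  # each comparison object is used at most once
    return pairs


if __name__ == "__main__":
    # Hypothetical scores: one group (say, Seyfert hosts) and a comparison group.
    seyfert = {"s1": 0.30, "s2": 0.62, "s3": 0.95}
    others = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}
    print(match_by_propensity(seyfert, others))
```

Matching on a single score means the two groups are balanced in the confounding variables on average, so differences in the outcome are less likely to be driven by them. Objects without a close counterpart (here `s3`) are excluded rather than paired with a poor match.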