Normalized to: Riggs, J.
[1]
oai:arXiv.org:1506.04792 [pdf] - 1935110
The Overlooked Potential of Generalized Linear Models in Astronomy-III:
Bayesian Negative Binomial Regression and Globular Cluster Populations
Submitted: 2015-06-15, last modified: 2015-08-13
In this paper, the third in a series illustrating the power of generalized
linear models (GLMs) for the astronomical community, we elucidate the potential
of the class of GLMs that handles count data. The size of a galaxy's globular
cluster population, $N_{\rm GC}$, is a long-standing puzzle in the astronomical
literature. It falls in the category of count data analysis, yet it is usually
modelled as if it were a continuous response variable. We have developed a
Bayesian negative binomial regression model to study the connection between
$N_{\rm GC}$ and the following galaxy properties: central black hole mass,
dynamical bulge mass, bulge velocity dispersion, and absolute visual magnitude.
The methodology introduced herein naturally accounts for heteroscedasticity,
intrinsic scatter, errors in measurements in both axes (either discrete or
continuous), and allows modelling the population of globular clusters on their
natural scale as a non-negative integer variable. The 99% prediction intervals
around the trend for expected $N_{\rm GC}$ comfortably envelop the data,
notably including the Milky Way, which has hitherto been considered a
problematic outlier. Finally, we demonstrate how random intercept models can
incorporate information about each galaxy's morphological type. Bayesian
variable selection methodology automatically identifies galaxy types with
different levels of GC production, suggesting that on average S0 galaxies
have a GC population 35% smaller than other types of similar brightness.
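As a rough illustration of the count-data model this abstract describes, the sketch below writes out a negative binomial log-likelihood with a log link. It is not the authors' implementation: the function names, the single predictor `x`, the coefficients `beta0` and `beta1`, and the mean/dispersion parameterization (variance $\mu + \mu^2/k$) are all assumptions chosen for the example, and the Bayesian priors and measurement-error treatment mentioned in the abstract are left out.

```python
import math


def nb_logpmf(y, mu, k):
    """Log-probability of a count y under a negative binomial with
    mean mu > 0 and dispersion k > 0 (variance = mu + mu**2 / k)."""
    return (
        math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
        + k * math.log(k / (k + mu))
        + y * math.log(mu / (k + mu))
    )


def expected_count(x, beta0, beta1):
    """Log link: the expected count is exp(beta0 + beta1 * x),
    so it stays positive for any value of the predictor."""
    return math.exp(beta0 + beta1 * x)


def nb_loglik(xs, ys, beta0, beta1, k):
    """Total log-likelihood of observed counts ys given predictors xs
    (hypothetical inputs, e.g. a galaxy property and its cluster count)."""
    return sum(
        nb_logpmf(y, expected_count(x, beta0, beta1), k)
        for x, y in zip(xs, ys)
    )
```

Because the response is modelled as a non-negative integer, a function such as `nb_loglik` could be passed to a sampler or optimizer without first log-transforming the counts; smaller values of `k` correspond to more overdispersion relative to a Poisson model.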
[2]
oai:arXiv.org:1409.7696 [pdf] - 1047947
The Overlooked Potential of Generalized Linear Models in Astronomy - I:
Binomial Regression
Submitted: 2014-09-26, last modified: 2015-04-04
Revealing hidden patterns in astronomical data is often the path to
fundamental scientific breakthroughs; meanwhile the complexity of scientific
inquiry increases as more subtle relationships are sought. Contemporary data
analysis problems often elude the capabilities of classical statistical
techniques, suggesting the use of cutting-edge statistical methods. In this
light, astronomers have overlooked a whole family of statistical techniques for
exploratory data analysis and robust regression, the so-called Generalized
Linear Models (GLMs). In this paper -- the first in a series aimed at
illustrating the power of these methods in astronomical applications -- we
elucidate the potential of a particular class of GLMs for handling
binary/binomial data, the so-called logit and probit regression techniques,
from both a maximum likelihood and a Bayesian perspective. As a case in point,
we present the use of these GLMs to explore the conditions of star formation
activity and metal enrichment in primordial minihaloes from cosmological
hydro-simulations including detailed chemistry, gas physics, and stellar
feedback. We predict that for a dark minihalo with metallicity $\approx 1.3
\times 10^{-4} Z_{\odot}$, an increase of $1.2 \times 10^{-2}$ in the gas
molecular fraction increases the probability of star formation occurrence by
75%. Finally, we highlight the use of receiver operating
characteristic curves as a diagnostic for binary classifiers, and ultimately we
use these to demonstrate the competitive predictive performance of GLMs against
the popular technique of artificial neural networks.
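To make the two ingredients named in this abstract concrete, the sketch below shows a logit link that maps a linear predictor to a probability, and a simple area-under-the-ROC-curve calculation for a binary classifier. It is only an illustration under assumed names: `beta0`, `beta1`, and the single predictor `x` are hypothetical, and the pairwise AUC formula is one standard way to compute the quantity, not necessarily the procedure used in the paper.

```python
import math


def logit_probability(x, beta0, beta1):
    """Logit link: probability of the binary outcome (e.g. star
    formation occurring) given a single predictor x."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example has
    the higher score; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is what makes the ROC curve a convenient common yardstick when comparing a GLM against another classifier.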