Hilbe, Joseph

Normalized to: Hilbe, J.

6 article(s) in total. 21 co-authors, from 1 to 5 common article(s). Median position in authors list is 5.0.

[1]  oai:arXiv.org:1409.7699  [pdf] - 1807260
The Overlooked Potential of Generalized Linear Models in Astronomy-II: Gamma regression and photometric redshifts
Comments: 19 pages, 7 figures, 1 table, accepted for publication in Astronomy and Computing
Submitted: 2014-09-26, last modified: 2018-12-30
Machine learning techniques offer a valuable toolbox for use within astronomy to solve problems involving so-called big data. They provide a means to make accurate predictions about a particular system without prior knowledge of the underlying physical processes of the data. In this article, and the companion papers of this series, we present the set of Generalized Linear Models (GLMs) as a fast alternative method for tackling general astronomical problems, including the ones related to the machine learning paradigm. To demonstrate the applicability of GLMs to inherently positive and continuous physical observables, we explore their use in estimating the photometric redshifts of galaxies from their multi-wavelength photometry. Using the gamma family with a log link function we predict redshifts from the PHoto-z Accuracy Testing simulated catalogue and a subset of the Sloan Digital Sky Survey from Data Release 10. We obtain fits that result in catastrophic outlier rates as low as ~1% for simulated and ~2% for real data. Moreover, we can easily obtain such levels of precision within a matter of seconds on a normal desktop computer and with training sets that contain merely thousands of galaxies. Our software is made publicly available as a user-friendly package developed in Python and R, and via an interactive web application (https://cosmostatisticsinitiative.shinyapps.io/CosmoPhotoz). This software allows users to apply a set of GLMs to their own photometric catalogues and generates publication-quality plots with minimal effort from the user. By facilitating their ease of use to the astronomical community, this paper series aims to make GLMs widely known and to encourage their implementation in future large-scale projects, such as the Large Synoptic Survey Telescope.
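As an illustration of the "gamma family with a log link" mentioned in the abstract above, the following is a minimal sketch of a one-predictor gamma GLM fitted by iteratively reweighted least squares (IRLS). It is not the CosmoPhotoz implementation; the function name `fit_gamma_log_link` and the single predictor `x` are assumptions made for this example. For the gamma family with a log link the IRLS working weights are all equal to one, so each iteration reduces to an ordinary least-squares fit of the working response.

```python
import math


def fit_gamma_log_link(x, y, n_iter=25, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x for positive y (hypothetical helper).

    Gamma variance is V(mu) = mu**2 and, with a log link, d(mu)/d(eta) = mu,
    so the IRLS weights (d(mu)/d(eta))**2 / V(mu) are identically 1.
    """
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    eta = [math.log(yi) for yi in y]  # start from the log of the data
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Working response of the IRLS step.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        zbar = sum(z) / n
        sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
        new_b1 = sxz / sxx
        new_b0 = zbar - new_b1 * xbar
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        eta = [b0 + b1 * xi for xi in x]
        if converged:
            break
    return b0, b1


# Noise-free toy data (made up for illustration, not from the paper).
x_demo = [0.1 * i for i in range(1, 21)]
y_demo = [math.exp(0.5 + 1.2 * xi) for xi in x_demo]
print(fit_gamma_log_link(x_demo, y_demo))
```

In a photometric-redshift setting the response `y` would be the redshift and the predictors would be derived from the photometry; a real analysis would use several predictors and an established GLM library rather than this hand-rolled loop.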
[2]  oai:arXiv.org:1603.06256  [pdf] - 1935289
Is the cluster environment quenching the Seyfert activity in elliptical and spiral galaxies?
Comments: 11 pages, 6 figures, accepted in MNRAS
Submitted: 2016-03-20, last modified: 2016-07-06
We developed a hierarchical Bayesian model (HBM) to investigate how the presence of Seyfert activity relates to their environment, herein represented by the galaxy cluster mass, $M_{200}$, and the normalized cluster-centric distance, $r/r_{200}$. We achieved this by constructing an unbiased sample of galaxies from the Sloan Digital Sky Survey, with morphological classifications provided by the Galaxy Zoo Project. A propensity score matching approach is introduced to control for the effects of confounding variables: stellar mass, galaxy colour, and star formation rate. The connection between Seyfert-activity and environmental properties in the de-biased sample is modelled within an HBM framework using the so-called logistic regression technique, suitable for the analysis of binary data (e.g., whether or not a galaxy hosts an AGN). Unlike standard ordinary least squares fitting methods, our methodology naturally allows modelling the probability of Seyfert-AGN activity in galaxies on their natural scale, i.e. as a binary variable. Furthermore, we demonstrate how an HBM can incorporate information of each particular galaxy morphological type in a unified framework. In elliptical galaxies, our analysis indicates a strong correlation of Seyfert-AGN activity with $r/r_{200}$, and a weaker correlation with the mass of the host. In spiral galaxies these trends do not appear, suggesting that the link between Seyfert activity and the properties of spiral galaxies is independent of the environment.
[3]  oai:arXiv.org:1506.04792  [pdf] - 1935110
The Overlooked Potential of Generalized Linear Models in Astronomy-III: Bayesian Negative Binomial Regression and Globular Cluster Populations
Comments: 14 pages, 12 figures. Accepted for publication in MNRAS
Submitted: 2015-06-15, last modified: 2015-08-13
In this paper, the third in a series illustrating the power of generalized linear models (GLMs) for the astronomical community, we elucidate the potential of the class of GLMs which handles count data. The size of a galaxy's globular cluster population $N_{\rm GC}$ is a long-standing puzzle in the astronomical literature. It falls in the category of count data analysis, yet it is usually modelled as if it were a continuous response variable. We have developed a Bayesian negative binomial regression model to study the connection between $N_{\rm GC}$ and the following galaxy properties: central black hole mass, dynamical bulge mass, bulge velocity dispersion, and absolute visual magnitude. The methodology introduced herein naturally accounts for heteroscedasticity, intrinsic scatter, errors in measurements in both axes (either discrete or continuous), and allows modelling the population of globular clusters on their natural scale as a non-negative integer variable. Prediction intervals of 99% around the trend for expected $N_{\rm GC}$ comfortably envelop the data, notably including the Milky Way, which has hitherto been considered a problematic outlier. Finally, we demonstrate how random intercept models can incorporate information of each particular galaxy morphological type. Bayesian variable selection methodology allows for automatically identifying galaxy types with different productions of GCs, suggesting that on average S0 galaxies have a GC population 35% smaller than other types with similar brightness.
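To make the idea of modelling counts "on their natural scale" concrete, here is a minimal sketch of a negative binomial likelihood with a log link. The mean/dispersion parametrization (variance equal to `mu + mu**2 / k`) and the names `nb_logpmf` and `nb_loglik` are assumptions chosen for this example; the abstract does not state the paper's parametrization, and the Bayesian machinery (priors, measurement errors, random intercepts) is not reproduced here.

```python
import math


def nb_logpmf(n, mu, k):
    """log P(N = n) for a negative binomial with mean mu and dispersion k.

    Assumed parametrization: Var(N) = mu + mu**2 / k, so a small k means
    strong overdispersion relative to a Poisson with the same mean.
    """
    return (
        math.lgamma(n + k) - math.lgamma(k) - math.lgamma(n + 1)
        + k * math.log(k / (k + mu))
        + n * math.log(mu / (k + mu))
    )


def nb_loglik(counts, x, b0, b1, k):
    """Log-likelihood of counts under log(mu_i) = b0 + b1 * x_i."""
    return sum(
        nb_logpmf(n, math.exp(b0 + b1 * xi), k) for n, xi in zip(counts, x)
    )


# Check the moments numerically for illustrative values mu = 10, k = 2.
pmf = [math.exp(nb_logpmf(n, 10.0, 2.0)) for n in range(2000)]
mean = sum(n * p for n, p in enumerate(pmf))
var = sum((n - mean) ** 2 * p for n, p in enumerate(pmf))
print(sum(pmf), mean, var)
```

A log-likelihood of this form could be maximized directly or combined with priors in a sampler; either way the response stays a non-negative integer instead of being treated as continuous.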
[4]  oai:arXiv.org:1507.01293  [pdf] - 1429369
Using gamma regression for photometric redshifts of survey galaxies
Comments: Refereed Proceeding of "The Universe of Digital Sky Surveys" conference held at the INAF - Observatory of Capodimonte, Naples, on 25th-28th November 2014, to be published in the Astrophysics and Space Science Proceedings, edited by Longo, Napolitano, Marconi, Paolillo, Iodice, 6 pages, and 1 figure
Submitted: 2015-07-05
Machine learning techniques offer a plethora of opportunities in tackling big data within the astronomical community. We present the set of Generalized Linear Models as a fast alternative for determining photometric redshifts of galaxies, a set of tools not commonly applied within astronomy, despite being widely used in other professions. With this technique, we achieve catastrophic outlier rates of the order of ~1% in a matter of seconds on large datasets of size ~1,000,000. To make these techniques easily accessible to the astronomical community, we developed a set of libraries and tools that are publicly available.
[5]  oai:arXiv.org:1409.7696  [pdf] - 1047947
The Overlooked Potential of Generalized Linear Models in Astronomy - I: Binomial Regression
Comments: 20 pages, 10 figures, 3 tables, accepted for publication in Astronomy and Computing
Submitted: 2014-09-26, last modified: 2015-04-04
Revealing hidden patterns in astronomical data is often the path to fundamental scientific breakthroughs; meanwhile the complexity of scientific inquiry increases as more subtle relationships are sought. Contemporary data analysis problems often elude the capabilities of classical statistical techniques, suggesting the use of cutting edge statistical methods. In this light, astronomers have overlooked a whole family of statistical techniques for exploratory data analysis and robust regression, the so-called Generalized Linear Models (GLMs). In this paper -- the first in a series aimed at illustrating the power of these methods in astronomical applications -- we elucidate the potential of a particular class of GLMs for handling binary/binomial data, the so-called logit and probit regression techniques, from both a maximum likelihood and a Bayesian perspective. As a case in point, we present the use of these GLMs to explore the conditions of star formation activity and metal enrichment in primordial minihaloes from cosmological hydro-simulations including detailed chemistry, gas physics, and stellar feedback. We predict that for a dark mini-halo with metallicity $\approx 1.3 \times 10^{-4} Z_{\odot}$, an increase of $1.2 \times 10^{-2}$ in the gas molecular fraction increases the probability of star formation occurrence by a factor of 75%. Finally, we highlight the use of receiver operating characteristic curves as a diagnostic for binary classifiers, and ultimately we use these to demonstrate the competitive predictive performance of GLMs against the popular technique of artificial neural networks.
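The maximum likelihood side of the logit regression described above can be sketched in a few lines. This is an illustrative one-predictor Newton-Raphson fit, not the paper's code; the function names `sigmoid` and `fit_logit` and the toy data are assumptions made for this example.

```python
import math


def sigmoid(t):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def fit_logit(x, y, n_iter=50, tol=1e-10):
    """Maximum likelihood fit of P(y = 1) = sigmoid(b0 + b1 * x)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Fisher information matrix (2x2, symmetric).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve H d = g using the explicit 2x2 inverse.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy binary data with overlapping classes (made up for illustration).
x_toy = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y_toy = [0, 0, 1, 0, 1, 0, 1, 1]
b0_hat, b1_hat = fit_logit(x_toy, y_toy)
print(b0_hat, b1_hat, sigmoid(b0_hat + b1_hat * 2.0))
```

The fitted coefficients live on the log-odds scale, so a change in the predictor shifts the log-odds linearly while the change in probability depends on where on the sigmoid curve the object sits. The toy data overlap on purpose: with perfectly separable classes the maximum likelihood estimate does not exist and the Newton steps would diverge.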
[6]  oai:arXiv.org:1301.3069  [pdf] - 614001
New Organizations to Support Astroinformatics and Astrostatistics
Comments: 4 pages, to appear in the proceedings of `Astronomical Data Analysis and Software Systems XXII' (D. N. Friedel & R. L. Plante, eds.) to be published in the Publ. Astro. Society Pacific conference series
Submitted: 2013-01-14
In the past two years, the environment within which astronomers conduct their data analysis and management has rapidly changed. Working Groups associated with international societies and Big Data projects have emerged to support and stimulate the new fields of astroinformatics and astrostatistics. Sponsoring societies include the International Statistical Institute, International Astronomical Union, American Astronomical Society, and Large Synoptic Survey Telescope project. They enthusiastically support cross-disciplinary activities where the advanced capabilities of computer science, statistics and related fields of applied mathematics are applied to advance research on planets, stars, galaxies and the Universe. The ADASS community is encouraged to join these organizations and to explore and engage in their public communication Web site, the Astrostatistics and Astroinformatics Portal (http://asaip.psu.edu).