Full-text search for arXiv

Izbicki, R.

Normalized to: Izbicki, R.

5 article(s) in total. 31 co-authors, from 1 to 5 common article(s). Median position in authors list is 2,0.

[1] oai:arXiv.org:2001.03621 [pdf] - 2029805

Evaluation of probabilistic photometric redshift estimation approaches for LSST

Comments: submitted to MNRAS

Submitted: 2020-01-10

Many scientific investigations of photometric galaxy surveys require redshift estimates, whose uncertainty properties are best encapsulated by photometric redshift (photo-z) posterior probability density functions (PDFs). A plethora of photo-z PDF estimation methodologies abound, producing discrepant results with no consensus on a preferred approach. We present the results of a comprehensive experiment comparing twelve photo-z algorithms applied to mock data produced for the Large Synoptic Survey Telescope (LSST) Dark Energy Science Collaboration (DESC). By supplying perfect prior information, in the form of the complete template library and a representative training set as inputs to each code, we demonstrate the impact of the assumptions underlying each technique on the output photo-z PDFs. In the absence of a notion of true, unbiased photo-z PDFs, we evaluate and interpret multiple metrics of the ensemble properties of the derived photo-z PDFs as well as traditional reductions to photo-z point estimates. We report systematic biases and overall over/under-breadth of the photo-z PDFs of many popular codes, which may indicate avenues for improvement in the algorithms or implementations. Furthermore, we raise attention to the limitations of established metrics for assessing photo-z PDF accuracy; though we identify the conditional density estimate (CDE) loss as a promising metric of photo-z PDF performance in the case where true redshifts are available but true photo-z PDFs are not, we emphasize the need for science-specific performance metrics.

[2] oai:arXiv.org:1908.11523 [pdf] - 2031949

Conditional Density Estimation Tools in Python and R with Applications to Photometric Redshifts and Likelihood-Free Cosmological Inference

Dalmasso, Niccolò; Pospisil, Taylor; Lee, Ann B.; Izbicki, Rafael; Freeman, Peter E.; Malz, Alex I.

Comments: 27 pages, 7 figures, 4 tables

Submitted: 2019-08-29, last modified: 2019-12-20

It is well known in astronomy that propagating non-Gaussian prediction uncertainty in photometric redshift estimates is key to reducing bias in downstream cosmological analyses. Similarly, likelihood-free inference approaches, which are beginning to emerge as a tool for cosmological analysis, require a characterization of the full uncertainty landscape of the parameters of interest given observed data. However, most machine learning (ML) or training-based methods with open-source software target point prediction or classification, and hence fall short in quantifying uncertainty in complex regression and parameter inference settings. As an alternative to methods that focus on predicting the response (or parameters) $\mathbf{y}$ from features $\mathbf{x}$, we provide nonparametric conditional density estimation (CDE) tools for approximating and validating the entire probability density function (PDF) $\mathrm{p}(\mathbf{y}|\mathbf{x})$ of $\mathbf{y}$ given (i.e., conditional on) $\mathbf{x}$. As there is no one-size-fits-all CDE method, the goal of this work is to provide a comprehensive range of statistical tools and open-source software for nonparametric CDE and method assessment which can accommodate different types of settings and be easily fit to the problem at hand. Specifically, we introduce four CDE software packages in $\texttt{Python}$ and $\texttt{R}$ based on ML prediction methods adapted and optimized for CDE: $\texttt{NNKCDE}$, $\texttt{RFCDE}$, $\texttt{FlexCode}$, and $\texttt{DeepCDE}$. Furthermore, we present the $\texttt{cdetools}$ package, which includes functions for computing a CDE loss function for tuning and assessing the quality of individual PDFs, along with diagnostic functions. We provide sample code in $\texttt{Python}$ and $\texttt{R}$ as well as examples of applications to photometric redshift estimation and likelihood-free cosmological inference via CDE.

[3] oai:arXiv.org:1703.09242 [pdf] - 1582160

A Unified Framework for Constructing, Tuning and Assessing Photometric Redshift Density Estimates in a Selection Bias Setting

Freeman, Peter E.; Izbicki, Rafael; Lee, Ann B.

Comments: 11 pages; accepted by MNRAS

Submitted: 2017-03-27

Photometric redshift estimation is an indispensable tool of precision cosmology. One problem that plagues the use of this tool in the era of large-scale sky surveys is that the bright galaxies that are selected for spectroscopic observation do not have properties that match those of (far more numerous) dimmer galaxies; thus, ill-designed empirical methods that produce accurate and precise redshift estimates for the former generally will not produce good estimates for the latter. In this paper, we provide a principled framework for generating conditional density estimates (i.e. photometric redshift PDFs) that takes into account selection bias and the covariate shift that this bias induces. We base our approach on the assumption that the probability that astronomers label a galaxy (i.e. determine its spectroscopic redshift) depends only on its measured (photometric and perhaps other) properties x and not on its true redshift. With this assumption, we can explicitly write down risk functions that allow us to both tune and compare methods for estimating importance weights (i.e. the ratio of densities of unlabeled and labeled galaxies for different values of x) and conditional densities. We also provide a method for combining multiple conditional density estimates for the same galaxy into a single estimate with better properties. We apply our risk functions to an analysis of approximately one million galaxies, mostly observed by SDSS, and demonstrate through multiple diagnostic tests that our method achieves good conditional density estimates for the unlabeled galaxies.

[4] oai:arXiv.org:1604.01339 [pdf] - 1386744

Photo-z Estimation: An Example of Nonparametric Conditional Density Estimation under Selection Bias

Izbicki, Rafael; Lee, Ann B.; Freeman, Peter E.

Comments:

Submitted: 2016-04-05

Redshift is a key quantity for inferring cosmological model parameters. In photometric redshift estimation, cosmologists use the coarse data collected from the vast majority of galaxies to predict the redshift of individual galaxies. To properly quantify the uncertainty in the predictions, however, one needs to go beyond standard regression and instead estimate the full conditional density f(z|x) of a galaxy's redshift z given its photometric covariates x. The problem is further complicated by selection bias: usually only the rarest and brightest galaxies have known redshifts, and these galaxies have characteristics and measured covariates that do not necessarily match those of more numerous and dimmer galaxies of unknown redshift. Unfortunately, there is not much research on how to best estimate complex multivariate densities in such settings. Here we describe a general framework for properly constructing and assessing nonparametric conditional density estimators under selection bias, and for combining two or more estimators for optimal performance. We propose new improved photo-z estimators and illus- trate our methods on data from the Sloan Data Sky Survey and an application to galaxy-galaxy lensing. Although our main application is photo-z estimation, our methods are relevant to any high-dimensional regression setting with complicated asymmetric and multimodal distributions in the response variable.

[5] oai:arXiv.org:1306.1238 [pdf] - 1171844

New Image Statistics for Detecting Disturbed Galaxy Morphologies at High Redshift

Freeman, P. E.; Izbicki, R.; Lee, A. B.; Newman, J. A.; Conselice, C. J.; Koekemoer, A. M.; Lotz, J. M.; Mozena, M.

Comments: 15 pages, 14 figures, accepted for publication in MNRAS

Submitted: 2013-06-05

Testing theories of hierarchical structure formation requires estimating the distribution of galaxy morphologies and its change with redshift. One aspect of this investigation involves identifying galaxies with disturbed morphologies (e.g., merging galaxies). This is often done by summarizing galaxy images using, e.g., the CAS and Gini-M20 statistics of Conselice (2003) and Lotz et al. (2004), respectively, and associating particular statistic values with disturbance. We introduce three statistics that enhance detection of disturbed morphologies at high-redshift (z ~ 2): the multi-mode (M), intensity (I), and deviation (D) statistics. We show their effectiveness by training a machine-learning classifier, random forest, using 1,639 galaxies observed in the H band by the Hubble Space Telescope WFC3, galaxies that had been previously classified by eye by the CANDELS collaboration (Grogin et al. 2011, Koekemoer et al. 2011). We find that the MID statistics (and the A statistic of Conselice 2003) are the most useful for identifying disturbed morphologies. We also explore whether human annotators are useful for identifying disturbed morphologies. We demonstrate that they show limited ability to detect disturbance at high redshift, and that increasing their number beyond approximately 10 does not provably yield better classification performance. We propose a simulation-based model-fitting algorithm that mitigates these issues by bypassing annotation.