Normalized to: Freeman, P.
[1]
oai:arXiv.org:2001.03621 [pdf] - 2029805
Evaluation of probabilistic photometric redshift estimation approaches
for LSST
Schmidt, S. J.;
Malz, A. I.;
Soo, J. Y. H.;
Almosallam, I. A.;
Brescia, M.;
Cavuoti, S.;
Cohen-Tanugi, J.;
Connolly, A. J.;
DeRose, J.;
Freeman, P. E.;
Graham, M. L.;
Iyer, K. G.;
Jarvis, M. J.;
Kalmbach, J. B.;
Kovacs, E.;
Lee, A. B.;
Longo, G.;
Morrison, C. B.;
Newman, J. A.;
Nourbakhsh, E.;
Nuss, E.;
Pospisil, T.;
Tranin, H.;
Wechsler, R. H.;
Zhou, R.;
Izbicki, R.;
Collaboration, The LSST Dark Energy Science
Submitted: 2020-01-10
Many scientific investigations of photometric galaxy surveys require redshift
estimates, whose uncertainty properties are best encapsulated by photometric
redshift (photo-z) posterior probability density functions (PDFs). A plethora
of photo-z PDF estimation methodologies abound, producing discrepant results
with no consensus on a preferred approach. We present the results of a
comprehensive experiment comparing twelve photo-z algorithms applied to mock
data produced for the Large Synoptic Survey Telescope (LSST) Dark Energy
Science Collaboration (DESC). By supplying perfect prior information, in the
form of the complete template library and a representative training set as
inputs to each code, we demonstrate the impact of the assumptions underlying
each technique on the output photo-z PDFs. In the absence of a notion of true,
unbiased photo-z PDFs, we evaluate and interpret multiple metrics of the
ensemble properties of the derived photo-z PDFs as well as traditional
reductions to photo-z point estimates. We report systematic biases and overall
over/under-breadth of the photo-z PDFs of many popular codes, which may
indicate avenues for improvement in the algorithms or implementations.
Furthermore, we raise attention to the limitations of established metrics for
assessing photo-z PDF accuracy; though we identify the conditional density
estimate (CDE) loss as a promising metric of photo-z PDF performance in the
case where true redshifts are available but true photo-z PDFs are not, we
emphasize the need for science-specific performance metrics.
[2]
oai:arXiv.org:1908.11523 [pdf] - 2031949
Conditional Density Estimation Tools in Python and R with Applications
to Photometric Redshifts and Likelihood-Free Cosmological Inference
Submitted: 2019-08-29, last modified: 2019-12-20
It is well known in astronomy that propagating non-Gaussian prediction
uncertainty in photometric redshift estimates is key to reducing bias in
downstream cosmological analyses. Similarly, likelihood-free inference
approaches, which are beginning to emerge as a tool for cosmological analysis,
require a characterization of the full uncertainty landscape of the parameters
of interest given observed data. However, most machine learning (ML) or
training-based methods with open-source software target point prediction or
classification, and hence fall short in quantifying uncertainty in complex
regression and parameter inference settings. As an alternative to methods that
focus on predicting the response (or parameters) $\mathbf{y}$ from features
$\mathbf{x}$, we provide nonparametric conditional density estimation (CDE)
tools for approximating and validating the entire probability density function
(PDF) $\mathrm{p}(\mathbf{y}|\mathbf{x})$ of $\mathbf{y}$ given (i.e.,
conditional on) $\mathbf{x}$. As there is no one-size-fits-all CDE method, the
goal of this work is to provide a comprehensive range of statistical tools and
open-source software for nonparametric CDE and method assessment which can
accommodate different types of settings and be easily fit to the problem at
hand. Specifically, we introduce four CDE software packages in
$\texttt{Python}$ and $\texttt{R}$ based on ML prediction methods adapted and
optimized for CDE: $\texttt{NNKCDE}$, $\texttt{RFCDE}$, $\texttt{FlexCode}$,
and $\texttt{DeepCDE}$. Furthermore, we present the $\texttt{cdetools}$
package, which includes functions for computing a CDE loss function for tuning
and assessing the quality of individual PDFs, along with diagnostic functions.
We provide sample code in $\texttt{Python}$ and $\texttt{R}$ as well as
examples of applications to photometric redshift estimation and likelihood-free
cosmological inference via CDE.
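
As a rough illustration of the CDE loss described above, the sketch below evaluates it for densities tabulated on a grid, using the empirical form $\frac{1}{n}\sum_i \int \hat{p}(y|x_i)^2\,dy - \frac{2}{n}\sum_i \hat{p}(y_i|x_i)$, which is the loss up to an additive constant that does not depend on the estimate. The function names, the trapezoidal integration, and the nearest-grid-point lookup are assumptions made here for illustration; this is not the $\texttt{cdetools}$ API.

```python
def trapezoid(values, grid):
    """Trapezoidal-rule integral of tabulated values over a (possibly uneven) grid."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )


def cde_loss_sketch(densities, grid, y_true):
    """Empirical CDE loss (up to a constant) for densities tabulated on `grid`.

    densities: one list per object, holding p_hat(y | x_i) at each grid point
    grid:      increasing list of y values shared by all objects
    y_true:    the observed response y_i for each object
    """
    n = len(densities)
    squared_term = 0.0
    point_term = 0.0
    for dens, y in zip(densities, y_true):
        # First term: integral of the squared density estimate.
        squared_term += trapezoid([d * d for d in dens], grid)
        # Second term: the estimate evaluated at the true response,
        # approximated here by the nearest grid point (an assumption).
        nearest = min(range(len(grid)), key=lambda k: abs(grid[k] - y))
        point_term += dens[nearest]
    return squared_term / n - 2.0 * point_term / n


if __name__ == "__main__":
    grid = [0.0, 0.5, 1.0]
    flat = [1.0, 1.0, 1.0]     # uniform density on [0, 1]
    peaked = [0.0, 2.0, 0.0]   # triangular density centred on 0.5
    print(cde_loss_sketch([flat], grid, [0.5]))
    print(cde_loss_sketch([peaked], grid, [0.5]))
```

Lower values indicate a better estimate: a density concentrated near the true response scores below a flat one, while a density concentrated in the wrong place is penalized.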
[3]
oai:arXiv.org:1809.02136 [pdf] - 1871405
Automated Distant Galaxy Merger Classifications from Space Telescope
Images using the Illustris Simulation
Submitted: 2018-09-06, last modified: 2019-04-12
We present image-based evolution of galaxy mergers from the Illustris
cosmological simulation at 12 time-steps over 0.5 < z < 5. To do so, we created
approximately one million synthetic deep Hubble Space Telescope and James Webb
Space Telescope images and measured common morphological indicators. Using the
merger tree, we assess methods to observationally select mergers with stellar
mass ratios as low as 10:1 completing within +/- 250 Myr of the mock
observation. We confirm that common one- or two-dimensional statistics select
mergers so defined with low purity and completeness, leading to high
statistical errors. As an alternative, we train redshift-dependent random
forests (RFs) based on 5-10 inputs. Cross-validation shows the RFs yield
superior, yet still imperfect, measurements of the late-stage merger fraction,
and they select more mergers in bulge-dominated galaxies. When applied to
CANDELS morphology catalogs, the RFs estimate a merger rate increasing to at
least z = 3, albeit two times higher than expected by theory. This suggests
possible mismatches in the feedback-determined morphologies, but affirms the
basic understanding of galaxy merger evolution. The RFs achieve completeness of
roughly 70% at 0.5 < z < 3, and purity increasing from 10% at z = 0.5 to 60% at
z = 3. At earlier times, the training sets are insufficient, motivating larger
simulations and smaller time sampling. By blending large surveys and large
simulations, such machine learning techniques offer a promising opportunity to
teach us the strengths and weaknesses of inferences about galaxy evolution.
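
The purity and completeness quoted above follow the usual definitions for a selected sample: purity is the fraction of selected objects that are true mergers, and completeness is the fraction of true mergers that are selected. A minimal sketch, with a hypothetical function name and toy labels that are not taken from the paper:

```python
def purity_completeness(is_true_merger, is_selected):
    """Return (purity, completeness) for boolean truth and selection flags."""
    pairs = list(zip(is_true_merger, is_selected))
    true_pos = sum(1 for truth, sel in pairs if truth and sel)
    false_pos = sum(1 for truth, sel in pairs if not truth and sel)
    false_neg = sum(1 for truth, sel in pairs if truth and not sel)
    # Guard against empty denominators (nothing selected, or no true mergers).
    purity = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    completeness = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    return purity, completeness


if __name__ == "__main__":
    # Toy example: 3 true mergers among 10 galaxies, 5 galaxies selected.
    truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    selected = [1, 1, 0, 1, 1, 1, 0, 0, 0, 0]
    print(purity_completeness(truth, selected))
```

Because true mergers are rare, a classifier can recover most of them (high completeness) while the selected sample remains dominated by non-mergers (low purity).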
[4]
oai:arXiv.org:1903.06796 [pdf] - 1850859
Astro2020 Science White Paper: The Next Decade of Astroinformatics and
Astrostatistics
Siemiginowska, A.;
Eadie, G.;
Czekala, I.;
Feigelson, E.;
Ford, E. B.;
Kashyap, V.;
Kuhn, M.;
Loredo, T.;
Ntampaka, M.;
Stevens, A.;
Avelino, A.;
Borne, K.;
Budavari, T.;
Burkhart, B.;
Cisewski-Kehe, J.;
Civano, F.;
Chilingarian, I.;
van Dyk, D. A.;
Fabbiano, G.;
Finkbeiner, D. P.;
Foreman-Mackey, D.;
Freeman, P.;
Fruscione, A.;
Goodman, A. A.;
Graham, M.;
Guenther, H. M.;
Hakkila, J.;
Hernquist, L.;
Huppenkothen, D.;
James, D. J.;
Law, C.;
Lazio, J.;
Lee, T.;
López-Morales, M.;
Mahabal, A. A.;
Mandel, K.;
Meng, X. L.;
Moustakas, J.;
Muna, D.;
Peek, J. E. G.;
Richards, G.;
Portillo, S. K. N.;
Scargle, J.;
de Souza, R. S.;
Speagle, J. S.;
Stassun, K. G.;
Stenning, D. C.;
Taylor, S. R.;
Tremblay, G. R.;
Trimble, V.;
Yanamandra-Fisher, P. A.;
Young, C. A.
Submitted: 2019-03-15
Over the past century, major advances in astronomy and astrophysics have been
largely driven by improvements in instrumentation and data collection. With the
amassing of high quality data from new telescopes, and especially with the
advent of deep and large astronomical surveys, it is becoming clear that future
advances will also rely heavily on how those data are analyzed and interpreted.
New methodologies derived from advances in statistics, computer science, and
machine learning are beginning to be employed in sophisticated investigations
that are not only bringing forth new discoveries, but are placing them on a
solid footing. Progress in wide-field sky surveys, interferometric imaging,
precision cosmology, exoplanet detection and characterization, and many
subfields of stellar, Galactic and extragalactic astronomy, has resulted in
complex data analysis challenges that must be solved to perform scientific
inference. Research in astrostatistics and astroinformatics will be necessary
to develop the state-of-the-art methodology needed in astronomy. Overcoming
these challenges requires dedicated, interdisciplinary research. We recommend:
(1) increasing funding for interdisciplinary projects in astrostatistics and
astroinformatics; (2) dedicating space and time at conferences for
interdisciplinary research and promotion; (3) developing sustainable funding
for long-term astrostatistics appointments; and (4) funding infrastructure
development for data archives and archive support, state-of-the-art algorithms,
and efficient computing.
[5]
oai:arXiv.org:1704.06273 [pdf] - 1621253
Intrinsic Alignment in redMaPPer clusters -- II. Radial alignment of
satellites toward cluster centers
Submitted: 2017-04-20, last modified: 2018-01-20
We study the orientations of satellite galaxies in redMaPPer clusters
constructed from the Sloan Digital Sky Survey at $0.1<z<0.35$ to determine
whether there is any preferential tendency for satellites to point radially
toward cluster centers. We analyze the satellite alignment (SA) signal based on
three shape measurement methods (re-Gaussianization, de Vaucouleurs, and
isophotal shapes), which trace galaxy light profiles at different radii. The
measured SA signal depends on these shape measurement methods. We detect the
strongest SA signal in isophotal shapes, followed by de Vaucouleurs shapes.
While no net SA signal is detected using re-Gaussianization shapes across the
entire sample, the observed SA signal reaches a statistically significant level
when limiting to a subsample of higher luminosity satellites. We further
investigate the impact of noise, systematics, and real physical isophotal
twisting effects in the comparison between the SA signal detected via different
shape measurement methods. Unlike previous studies, which only consider the
dependence of SA on a few parameters, here we explore a total of 17 galaxy and
cluster properties, using a statistical model averaging technique to naturally
account for parameter correlations and identify significant SA predictors. We
find that the measured SA signal is strongest for satellites with the following
characteristics: higher luminosity, smaller distance to the cluster center,
rounder in shape, higher bulge fraction, and distributed preferentially along
the major axis directions of their centrals. Finally, we provide physical
explanations for the identified dependences, and discuss the connection to
theories of SA.
[6]
oai:arXiv.org:1711.00660 [pdf] - 1641321
Stellar Multiplicity Meets Stellar Evolution And Metallicity: The APOGEE
View
Badenes, Carles;
Mazzola, Christine;
Thompson, Todd A.;
Covey, Kevin;
Freeman, Peter E.;
Walker, Matthew G.;
Moe, Maxwell;
Troup, Nicholas;
Nidever, David;
Prieto, Carlos Allende;
Andrews, Brett;
Barbá, Rodolfo H.;
Beers, Timothy C.;
Bovy, Jo;
Carlberg, Joleen K.;
De Lee, Nathan;
Johnson, Jennifer;
Lewis, Hannah;
Majewski, Steven R.;
Pinsonneault, Marc;
Sobeck, Jennifer;
Stassun, Keivan G.;
Stringfellow, Guy;
Zasowski, Gail
Submitted: 2017-11-02, last modified: 2018-01-15
We use the multi-epoch radial velocities acquired by the APOGEE survey to
perform a large scale statistical study of stellar multiplicity for field stars
in the Milky Way, spanning the evolutionary phases between the main sequence
and the red clump. We show that the distribution of maximum radial velocity
shifts ($\Delta RV_{\rm max}$) for APOGEE targets is a strong function of $\log g$, with main
sequence stars showing $\Delta RV_{\rm max}$ as high as $\sim$300 km s$^{-1}$, and steadily dropping
down to $\sim$30 km s$^{-1}$ for $\log g\sim0$, as stars climb up the Red Giant Branch
(RGB). Red clump stars show a distribution of $\Delta RV_{\rm max}$ values comparable to that
of stars at the tip of the RGB, implying they have similar multiplicity
characteristics. The observed attrition of high $\Delta RV_{\rm max}$ systems in the RGB is
consistent with a lognormal period distribution in the main sequence and a
multiplicity fraction of 0.35, which is truncated at an increasing period as
stars become physically larger and undergo mass transfer after Roche Lobe
Overflow during H shell burning. The $\Delta RV_{\rm max}$ distributions also show that the
multiplicity characteristics of field stars are metallicity dependent, with
metal-poor ([Fe/H]$\lesssim-0.5$) stars having a multiplicity fraction a factor
2-3 higher than metal-rich ([Fe/H]$\gtrsim0.0$) stars. This has profound
implications for the formation rates of interacting binaries observed by
astronomical transient surveys and gravitational wave detectors, as well as the
habitability of circumbinary planets.
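
The statistic at the centre of this analysis, the maximum radial velocity shift of a star, can be sketched as the difference between the largest and smallest radial velocities measured across its epochs. That definition, the function name, and the toy velocities below are assumptions made for illustration; the abstract does not spell out the exact recipe.

```python
def max_rv_shift(rv_epochs_kms):
    """Maximum radial velocity shift (km/s) across a star's epochs.

    Assumed definition: max(RV) - min(RV) over all visits.
    At least two epochs are required for a shift to be defined.
    """
    if len(rv_epochs_kms) < 2:
        raise ValueError("need at least two radial velocity epochs")
    return max(rv_epochs_kms) - min(rv_epochs_kms)


if __name__ == "__main__":
    # Toy visits for a single star (values are illustrative only).
    print(max_rv_shift([10.0, -5.0, 3.0]))
```

A large shift relative to the measurement error points to orbital motion in a close binary, which is why the distribution of this quantity traces the multiplicity of the sample.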
[7]
oai:arXiv.org:1712.00432 [pdf] - 1626415
Mapping Jet-ISM Interactions in X-ray Binaries with ALMA: A GRS
1915$+$105 Case Study
Submitted: 2017-12-01
We present Atacama Large Millimetre/Sub-Millimetre Array (ALMA) observations
of IRAS 19132+1035, a candidate jet-ISM interaction zone near the black hole
X-ray binary (BHXB) GRS 1915+105. With these ALMA observations (combining data
from the 12 m array and the Atacama Compact Array), we map the molecular line
emission across the IRAS 19132+1035 region. We detect emission from the
$^{12}$CO [$J=2-1$], $^{13}$CO [$\nu=0$, $J=2-1$], C$^{18}$O [$J=2-1$], ${\rm
H}_{2}{\rm CO}$ [$J=3_{0,3}-2_{0,2}$], ${\rm H}_{2}{\rm CO}$
[$J=3_{2,2}-2_{2,1}$], ${\rm H}_{2}{\rm CO}$ [$J=3_{2,1}-2_{2,0}$], SiO
[$\nu=0$, $J=5-4$], CH$_3$OH [$J=4_{2,2}-3_{1,2}$], and CS [$\nu=0$, $J=5-4$]
transitions. Given the morphological, spectral, and kinematic properties of
this molecular emission, we present several lines of evidence that support the
presence of a jet-ISM interaction at this site, including a jet-blown cavity in
the molecular gas. This compelling new evidence identifies this site as a
jet-ISM interaction zone, making GRS 1915$+$105 the third Galactic BHXB with at
least one conclusive jet-ISM interaction zone. However, we find that this
interaction occurs on much smaller scales than was postulated by previous work,
where the BHXB jet does not appear to be dominantly powering the entire IRAS
19132+1035 region. Using estimates of the ISM conditions in the region, we
utilize the detected cavity as a calorimeter to estimate the time-averaged
power carried in the GRS 1915+105 jets of
$(8.4^{+7.7}_{-8.1})\times10^{32}\,{\rm erg\,s}^{-1}$. Overall, our analysis
demonstrates that molecular lines are excellent diagnostic tools to identify
and probe jet-ISM interaction zones near Galactic BHXBs.
[8]
oai:arXiv.org:1707.04592 [pdf] - 1585956
Local Two-Sample Testing: A New Tool for Analysing High-Dimensional
Astronomical Data
Submitted: 2017-07-14
Modern surveys have provided the astronomical community with a flood of
high-dimensional data, but analyses of these data often occur after their
projection to lower-dimensional spaces. In this work, we introduce a local
two-sample hypothesis test framework that an analyst may directly apply to data
in their native space. In this framework, the analyst defines two classes based
on a response variable of interest (e.g. higher-mass galaxies versus lower-mass
galaxies) and determines at arbitrary points in predictor space whether the
local proportions of objects that belong to the two classes significantly
differ from the global proportion.
Our framework has myriad potential uses throughout astronomy; here, we
demonstrate its efficacy by applying it to a sample of 2487 i-band-selected
galaxies observed by the HST ACS in four of the CANDELS program fields. For
each galaxy, we have seven morphological summary statistics along with an
estimated stellar mass and star-formation rate. We perform two studies: one in
which we determine regions of the seven-dimensional space of morphological
statistics where high-mass galaxies are significantly more numerous than
low-mass galaxies, and vice-versa, and another study where we use SFR in place
of mass. We find that we are able to identify such regions, and show how
high-mass/low-SFR regions are associated with concentrated and undisturbed
galaxies while galaxies in low-mass/high-SFR regions appear more extended
and/or disturbed than their high-mass/low-SFR counterparts.
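
The idea of comparing a local class proportion with the global one can be sketched with a simple nearest-neighbour stand-in. The version below is an assumption made for illustration, not the test statistic used in the paper: it takes the k nearest objects to a query point, computes the local fraction belonging to class 1, and reports a normal-approximation z-score against the global fraction. The function name and the toy data are hypothetical.

```python
import math


def local_proportion_z(points, labels, query, k):
    """Compare the class-1 fraction among the k nearest neighbours of `query`
    with the global class-1 fraction.

    Returns (local fraction, global fraction, z-score). The z-score uses the
    binomial standard error under the global fraction, and so assumes both
    classes are present in the sample.
    """
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    neighbours = order[:k]
    p_global = sum(labels) / len(labels)
    p_local = sum(labels[i] for i in neighbours) / k
    std_err = math.sqrt(p_global * (1.0 - p_global) / k)
    return p_local, p_global, (p_local - p_global) / std_err


if __name__ == "__main__":
    # Toy predictor space: class 1 clusters near the origin, class 0 near (5, 5).
    pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
           (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
    cls = [1, 1, 1, 1, 0, 0, 0, 0]
    print(local_proportion_z(pts, cls, (0.0, 0.0), k=4))
```

Evaluating such a statistic at many points in the native predictor space, with an appropriate correction for multiple testing, is what turns a single comparison into a map of regions where one class is over-represented.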
[9]
oai:arXiv.org:1703.09242 [pdf] - 1582160
A Unified Framework for Constructing, Tuning and Assessing Photometric
Redshift Density Estimates in a Selection Bias Setting
Submitted: 2017-03-27
Photometric redshift estimation is an indispensable tool of precision
cosmology. One problem that plagues the use of this tool in the era of
large-scale sky surveys is that the bright galaxies that are selected for
spectroscopic observation do not have properties that match those of (far more
numerous) dimmer galaxies; thus, ill-designed empirical methods that produce
accurate and precise redshift estimates for the former generally will not
produce good estimates for the latter. In this paper, we provide a principled
framework for generating conditional density estimates (i.e. photometric
redshift PDFs) that takes into account selection bias and the covariate shift
that this bias induces. We base our approach on the assumption that the
probability that astronomers label a galaxy (i.e. determine its spectroscopic
redshift) depends only on its measured (photometric and perhaps other)
properties x and not on its true redshift. With this assumption, we can
explicitly write down risk functions that allow us to both tune and compare
methods for estimating importance weights (i.e. the ratio of densities of
unlabeled and labeled galaxies for different values of x) and conditional
densities. We also provide a method for combining multiple conditional density
estimates for the same galaxy into a single estimate with better properties. We
apply our risk functions to an analysis of approximately one million galaxies,
mostly observed by SDSS, and demonstrate through multiple diagnostic tests that
our method achieves good conditional density estimates for the unlabeled
galaxies.
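
The importance weights described above are ratios of the unlabeled to the labeled density at a given x. A deliberately crude way to see what they do is a one-dimensional histogram estimate, sketched below. The histogram estimator, the function name, and the toy magnitudes are assumptions made for illustration; the paper tunes and compares its own estimators through risk functions.

```python
def histogram_importance_weights(x_labeled, x_unlabeled, edges):
    """Per-bin estimate of f_unlabeled(x) / f_labeled(x) in one dimension.

    Within a bin the width cancels, so the ratio of sample fractions equals
    the ratio of histogram density estimates. Bins containing no labeled
    objects get NaN, since the ratio is undefined there. Values outside
    [edges[0], edges[-1]) are not assigned to any bin.
    """
    def bin_fractions(values):
        counts = [0] * (len(edges) - 1)
        for v in values:
            for b in range(len(edges) - 1):
                if edges[b] <= v < edges[b + 1]:
                    counts[b] += 1
                    break
        return [c / len(values) for c in counts]

    frac_labeled = bin_fractions(x_labeled)
    frac_unlabeled = bin_fractions(x_unlabeled)
    return [
        u / l if l > 0 else float("nan")
        for u, l in zip(frac_unlabeled, frac_labeled)
    ]


if __name__ == "__main__":
    # Toy magnitudes: the labeled (spectroscopic) sample skews bright.
    labeled = [18.0, 18.0, 18.0, 19.0]
    unlabeled = [18.0, 19.0, 19.0, 19.0]
    print(histogram_importance_weights(labeled, unlabeled, [17.5, 18.5, 19.5]))
```

Labeled galaxies in the faint bin receive weights above one, so that losses computed on the labeled sample approximate the losses that would be incurred on the unlabeled population.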
[10]
oai:arXiv.org:1702.07728 [pdf] - 1564077
The Varying Mass Distribution of Molecular Clouds Across M83
Submitted: 2017-02-24
The work of Adamo et al. (2015) showed that the mass distributions of young
massive stellar clusters were truncated above a maximum-mass scale in the
nearby galaxy M83 and that this truncation mass varies with galactocentric
radius. Here, we present a cloud-based analysis of ALMA CO($1\to 0$)
observations of M83 to search for such a truncation mass in the molecular cloud
population. We identify a population of 873 molecular clouds in M83 that is
largely similar to those found in the Milky Way and Local Group galaxies,
though clouds in the centre of the galaxy show high surface densities and
enhanced turbulence, as is common for clouds in high-density nuclear
environments. Like the young massive clusters, we find a maximum-mass scale for
the molecular clouds that decreases radially in the galaxy. We find the most
massive young massive cluster tracks the most massive molecular cloud with the
cluster mass being $10^{-2}$ times that of the most massive molecular cloud.
Outside the nuclear region of M83 ($R_{g}>0.5$ kpc), there is no evidence for
changing internal conditions in the population of molecular clouds, with the
average internal pressures, densities, and free-fall times remaining constant
for the cloud population over the galaxy. This result is consistent with the
bound cluster formation efficiency depending only on the large-scale properties
of the ISM, rather than the internal conditions of individual clouds.
[11]
oai:arXiv.org:1509.06376 [pdf] - 1530313
Detecting Effects of Filaments on Galaxy Properties in the Sloan Digital
Sky Survey III
Submitted: 2015-09-21, last modified: 2017-01-12
We study the effects of filaments on galaxy properties in the Sloan Digital
Sky Survey (SDSS) Data Release 12 using filaments from the `Cosmic Web
Reconstruction' catalogue (Chen et al. 2016), a publicly available filament
catalogue for SDSS. Since filaments are tracers of medium-to-high density
regions, we expect that galaxy properties associated with the environment are
dependent on the distance to the nearest filament. Our analysis demonstrates
that a red galaxy or a high-mass galaxy tends to reside closer to filaments than
a blue or low-mass galaxy. After adjusting for the effect of stellar mass, on
average, early-forming galaxies or large galaxies have a shorter distance to
filaments than late-forming galaxies or small galaxies. For the Main galaxy
sample (MGS), all signals are very significant ($>6\sigma$). For the LOWZ and
CMASS samples, the stellar mass and size signals are significant ($>2 \sigma$). The
filament effects we observe persist until $z = 0.7$ (the edge of the CMASS
sample). Comparing our results to those using the galaxy distances from
redMaPPer galaxy clusters as a reference, we find a similar result between
filaments and clusters. Moreover, we find that the effect of clusters on the
stellar mass of nearby galaxies depends on the galaxy's filamentary
environment. Our findings illustrate the strong correlation of galaxy
properties with proximity to density ridges, strongly supporting the claim that
density ridges are good tracers of filaments.
[12]
oai:arXiv.org:1605.01065 [pdf] - 1457265
Intrinsic alignments in redMaPPer clusters -- I. Central galaxy
alignments and angular segregation of satellites
Submitted: 2016-05-03, last modified: 2016-08-04
The shapes of cluster central galaxies are not randomly oriented, but rather
exhibit coherent alignments with the shapes of their parent clusters as well as
with the surrounding large-scale structures. In this work, we aim to identify
the galaxy and cluster quantities that most strongly predict the central galaxy
alignment phenomenon among a large parameter space with a sample of 8237
clusters and 94817 members within 0.1<z<0.35, based on the redMaPPer cluster
catalog constructed from the Sloan Digital Sky Survey. We first quantify the
alignment between the projected central galaxy shapes and the distribution of
member satellites, to understand what central galaxy and cluster properties
most strongly correlate with these alignments. Next, we investigate the angular
segregation of satellites with respect to their central galaxy major axis
directions, to identify the satellite properties that most strongly predict
their angular segregation. We find that central galaxies are more aligned with
their member galaxy distributions in clusters that are more elongated and have
higher richness, and for central galaxies with larger physical size, higher
luminosity and centering probability, and redder color. Satellites with redder
color, higher luminosity, located closer to the central galaxy, and with
smaller ellipticity show a stronger angular segregation toward their central
galaxy major axes. Finally, we provide physical explanations for some of the
identified correlations, and discuss the connection to theories of central
galaxy alignments, the impact of primordial alignments with tidal fields, and
the importance of anisotropic accretion.
[13]
oai:arXiv.org:1604.01339 [pdf] - 1386744
Photo-z Estimation: An Example of Nonparametric Conditional Density
Estimation under Selection Bias
Submitted: 2016-04-05
Redshift is a key quantity for inferring cosmological model parameters. In
photometric redshift estimation, cosmologists use the coarse data collected
from the vast majority of galaxies to predict the redshift of individual
galaxies. To properly quantify the uncertainty in the predictions, however, one
needs to go beyond standard regression and instead estimate the full
conditional density f(z|x) of a galaxy's redshift z given its photometric
covariates x. The problem is further complicated by selection bias: usually
only the rarest and brightest galaxies have known redshifts, and these galaxies
have characteristics and measured covariates that do not necessarily match
those of more numerous and dimmer galaxies of unknown redshift. Unfortunately,
there is not much research on how to best estimate complex multivariate
densities in such settings. Here we describe a general framework for properly
constructing and assessing nonparametric conditional density estimators under
selection bias, and for combining two or more estimators for optimal
performance. We propose new improved photo-z estimators and illustrate our
methods on data from the Sloan Digital Sky Survey and an application to
galaxy-galaxy lensing. Although our main application is photo-z estimation, our
methods are relevant to any high-dimensional regression setting with
complicated asymmetric and multimodal distributions in the response variable.
[14]
oai:arXiv.org:1504.01751 [pdf] - 1358764
Beyond Spheroids and Discs: Classifications of CANDELS Galaxy Structure
at 1.4 < z < 2 via Principal Component Analysis
Peth, Michael A.;
Lotz, Jennifer M.;
Freeman, Peter E.;
McPartland, Conor;
Mortazavi, S. Alireza;
Snyder, Gregory F.;
Barro, Guillermo;
Grogin, Norman A.;
Guo, Yicheng;
Hemmati, Shoubaneh;
Kartaltepe, Jeyhan S.;
Kocevski, Dale D.;
Koekemoer, Anton M.;
McIntosh, Daniel H.;
Nayyeri, Hooshang;
Papovich, Casey;
Primack, Joel R.;
Simons, Raymond C.
Submitted: 2015-04-07, last modified: 2016-02-08
Important but rare and subtle processes driving galaxy morphology and
star-formation may be missed by traditional spiral, elliptical, irregular or
S\'ersic bulge/disk classifications. To overcome this limitation, we use a
principal component analysis of non-parametric morphological indicators
(concentration, asymmetry, Gini coefficient, $M_{20}$, multi-mode, intensity
and deviation) measured at rest-frame $B$-band (corresponding to HST/WFC3 F125W
at 1.4 $< z <$ 2) to trace the natural distribution of massive ($>10^{10}
M_{\odot}$) galaxy morphologies. Principal component analysis (PCA) quantifies
the correlations between these morphological indicators and determines the
relative importance of each. The first three principal components (PCs) capture
$\sim$75 per cent of the variance inherent to our sample. We interpret the
first principal component (PC) as bulge strength, the second PC as dominated by
concentration and the third PC as dominated by asymmetry. Both PC1 and PC2
correlate with the visual appearance of a central bulge and predict galaxy
quiescence. PC1 is a better predictor of quenching than stellar mass, and as
good as other structural indicators (S\'ersic-n or compactness). We divide the
PCA results into groups using an agglomerative hierarchical clustering method.
Unlike S\'ersic, this classification scheme separates compact galaxies from
larger, smooth proto-elliptical systems, and star-forming disk-dominated clumpy
galaxies from star-forming bulge-dominated asymmetric galaxies. Distinguishing
between these galaxy structural types in a quantitative manner is an important
step towards understanding the connections between morphology, galaxy assembly
and star-formation.
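
The statement that the first few principal components capture a given share of the variance refers to the explained-variance ratio. The sketch below computes it with NumPy for a small feature matrix. Standardizing each indicator before the decomposition, the function name, and the toy matrix are assumptions made here; the abstract does not describe the preprocessing.

```python
import numpy as np


def pca_explained_variance(features):
    """Return (explained-variance ratios, principal axes) for a feature matrix.

    Rows are objects and columns are indicators. Each column is standardized
    to zero mean and unit sample variance (an assumption), so the analysis
    operates on the correlation matrix.
    """
    X = np.asarray(features, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Singular values of the standardized data give the component variances.
    _, singular_values, axes = np.linalg.svd(Z, full_matrices=False)
    variances = singular_values**2 / (len(Z) - 1)
    return variances / variances.sum(), axes


if __name__ == "__main__":
    # Toy data: columns 0 and 1 are perfectly correlated, column 2 is
    # uncorrelated with both.
    toy = [[1, 2, 1], [2, 4, -1], [3, 6, -1], [4, 8, 1]]
    ratios, _ = pca_explained_variance(toy)
    print(np.cumsum(ratios))
```

The cumulative sum of the ratios is the quantity to compare with a statement such as "the first three components capture about 75 per cent of the variance".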
[15]
oai:arXiv.org:1509.06443 [pdf] - 1447640
Cosmic Web Reconstruction through Density Ridges: Catalogue
Submitted: 2015-09-21
We construct a catalogue for filaments using a novel approach called SCMS
(subspace constrained mean shift; Ozertem & Erdogmus 2011; Chen et al. 2015).
SCMS is a gradient-based method that detects filaments through density ridges
(smooth curves tracing high-density regions). A great advantage of SCMS is its
uncertainty measure, which allows an evaluation of the errors for the detected
filaments. To detect filaments, we use data from the Sloan Digital Sky Survey,
which consist of three galaxy samples: the NYU main galaxy sample (MGS), the
LOWZ sample and the CMASS sample. Each of the three datasets covers a different
redshift region, so that the combined sample allows detection of filaments up
to z = 0.7. Our filament catalogue consists of a sequence of two-dimensional
filament maps at different redshifts that provide several useful statistics on
the evolution of the cosmic web. To construct the maps, we select spectroscopically
confirmed galaxies within 0.050 < z < 0.700 and partition them into 130 bins.
For each bin, we ignore the redshift, treating the galaxy observations as 2-D
data, and detect filaments using SCMS. The filament catalogue consists of 130
individual 2-D filament maps, and each map comprises points on the detected
filaments that describe the filamentary structures at a particular redshift. We
also apply our filament catalogue to investigate galaxy luminosity and its
relation with distance to filament. Using a volume-limited sample, we find
strong evidence (6.1$\sigma$ - 12.3$\sigma$) that galaxies close to filaments
are generally brighter than those at significant distance from filaments.
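
The slicing described above can be sketched as follows. Dividing 0.050 < z < 0.700 into 130 bins gives a width of 0.005 if the bins are equal in width, which is an assumption here since the abstract does not state the binning scheme. The function name is hypothetical.

```python
def redshift_slice_index(z, z_min=0.050, z_max=0.700, n_bins=130):
    """Index (0 .. n_bins - 1) of the redshift slice containing z.

    Assumes equal-width bins; returns None for galaxies outside the
    open interval (z_min, z_max).
    """
    if not (z_min < z < z_max):
        return None
    width = (z_max - z_min) / n_bins
    # Clamp to the last bin to guard against floating-point round-off.
    return min(int((z - z_min) / width), n_bins - 1)


if __name__ == "__main__":
    for z in (0.0525, 0.301, 0.6975, 0.8):
        print(z, redshift_slice_index(z))
```

Galaxies sharing an index are then treated as a single 2-D point set on the sky, and filaments are detected separately in each slice.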
[16]
oai:arXiv.org:1509.05619 [pdf] - 1279380
Prediction of galaxy ellipticities and reduction of shape noise in
cosmic shear measurements
Submitted: 2015-09-18
The intrinsic scatter in the ellipticities of galaxies about the mean shape,
known as "shape noise," is the most important source of noise in weak lensing
shear measurements. Several approaches to reducing shape noise have recently
been put forward, using information beyond photometry, such as radio
polarization and optical spectroscopy. Here we investigate how well the
intrinsic ellipticities of galaxies can be predicted using other, exclusively
photometric parameters. These parameters (such as galaxy colours) are already
available in the data and do not necessitate additional, often expensive
observations. We apply two regression techniques, generalized additive models
(GAM) and projection pursuit regression (PPR) to the publicly released data
catalog of galaxy properties from CFHTLenS. In our simple analysis we find that
the individual galaxy ellipticities can indeed be predicted from other
photometric parameters to better precision than the scatter about the mean
ellipticity. This means that without additional observations beyond photometry
the ellipticity contribution to the shear can be measured to higher precision,
comparable to using a larger sample of galaxies. Our best-fit model, achieved
using PPR, yields a gain equivalent to having 114.3% more galaxies. Using only
parameters unaffected by lensing (e.g.~surface brightness, colour), the gain is
only ~12%.
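
One way to read a gain "equivalent to having more galaxies" is through the usual scaling of the shear error with $\sigma/\sqrt{N}$: reducing the ellipticity scatter from $\sigma_e$ to a residual scatter $\sigma_{\rm res}$ acts like enlarging the sample to $N_{\rm eff} = N\,(\sigma_e/\sigma_{\rm res})^2$. The sketch below applies that relation. The scaling is the standard one, but treating it as the paper's exact definition of gain is an assumption, and the function names are hypothetical.

```python
import math


def equivalent_sample_gain(sigma_before, sigma_after):
    """Fractional increase in effective sample size from reduced scatter,
    assuming the error on the mean scales as sigma / sqrt(N)."""
    return (sigma_before / sigma_after) ** 2 - 1.0


def scatter_ratio_for_gain(gain):
    """Ratio sigma_after / sigma_before needed to reach a fractional gain."""
    return 1.0 / math.sqrt(1.0 + gain)


if __name__ == "__main__":
    # Scatter ratios implied by the two gains quoted in the abstract.
    print(scatter_ratio_for_gain(1.143))
    print(scatter_ratio_for_gain(0.12))
```

Under this reading, a gain of 114.3% corresponds to a residual scatter of roughly two-thirds of the original, while a gain of about 12% corresponds to a reduction in scatter of only a few per cent.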
[17]
oai:arXiv.org:1501.05303 [pdf] - 1288321
Cosmic Web Reconstruction through Density Ridges: Method and Algorithm
Submitted: 2015-01-21, last modified: 2015-08-27
The detection and characterization of filamentary structures in the cosmic
web allows cosmologists to constrain parameters that dictate the evolution of
the Universe. While many filament estimators have been proposed, they generally
lack estimates of uncertainty, reducing their inferential power. In this paper,
we demonstrate how one may apply the Subspace Constrained Mean Shift (SCMS)
algorithm (Ozertem and Erdogmus (2011); Genovese et al. (2012)) to uncover
filamentary structure in galaxy data. The SCMS algorithm is a gradient ascent
method that models filaments as density ridges, one-dimensional smooth curves
that trace high-density regions within the point cloud. We also demonstrate how
augmenting the SCMS algorithm with bootstrap-based methods of uncertainty
estimation allows one to place uncertainty bands around putative filaments. We
apply the SCMS method to datasets sampled from the P3M N-body simulation, with
galaxy number densities consistent with SDSS and WFIRST-AFTA, and to LOWZ and
CMASS data from the Baryon Oscillation Spectroscopic Survey (BOSS). To further
assess the efficacy of SCMS, we compare the relative locations of BOSS
filaments with galaxy clusters in the redMaPPer catalog, and find that
redMaPPer clusters are significantly closer (with p-values $< 10^{-9}$) to
SCMS-detected filaments than to randomly selected galaxies.
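
The gradient-ascent idea behind SCMS can be sketched for a Gaussian kernel density estimate as follows. At each step the ordinary mean-shift vector is projected onto the eigenvectors of the density Hessian with the smallest eigenvalues, so that a point climbs toward a ridge rather than toward a mode. This is a minimal sketch assuming a fixed bandwidth `h`, no thresholding of low-density points, and no bootstrap uncertainty bands; the function names and the toy data are hypothetical and do not reproduce the paper's implementation.

```python
import numpy as np


def scms_step(x, data, h):
    """One subspace-constrained mean-shift update for a Gaussian KDE."""
    diff = data - x                                   # (n, d) offsets to each datum
    weights = np.exp(-0.5 * np.sum(diff**2, axis=1) / h**2)
    mean_shift = weights @ data / weights.sum() - x   # ordinary mean-shift vector
    d = data.shape[1]
    # Hessian of the KDE, up to a positive normalizing constant.
    hessian = (diff.T * weights) @ diff / h**4 - weights.sum() * np.eye(d) / h**2
    # eigh returns eigenvalues in ascending order; the first d - 1 eigenvectors
    # span the directions normal to a one-dimensional ridge.
    _, eigvecs = np.linalg.eigh(hessian)
    normal = eigvecs[:, : d - 1]
    return x + normal @ (normal.T @ mean_shift)


def scms_project(start, data, h, n_iter=200, tol=1e-8):
    """Iterate scms_step from `start` until the update is smaller than `tol`."""
    x = np.asarray(start, dtype=float)
    for _ in range(n_iter):
        x_new = scms_step(x, data, h)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x


if __name__ == "__main__":
    # Toy "filament": points scattered tightly about the line y = 0.
    xs = np.linspace(-3.0, 3.0, 61)
    ys = np.where(np.arange(61) % 2 == 0, 0.1, -0.1)
    galaxies = np.column_stack([xs, ys])
    print(scms_project([0.2, 0.3], galaxies, h=0.5))
```

Starting the iteration from every point of a mesh, or from the data themselves, and collecting the converged positions traces out the density ridges; repeating the procedure on bootstrap resamples is one route to the uncertainty bands mentioned above.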
[18]
oai:arXiv.org:1508.04149 [pdf] - 1300265
Investigating Galaxy-Filament Alignments in Hydrodynamic Simulations
using Density Ridges
Submitted: 2015-08-17
In this paper, we study the filamentary structures and the galaxy alignment
along filaments at redshift $z=0.06$ in the MassiveBlack-II simulation, a
state-of-the-art, high-resolution hydrodynamical cosmological simulation which
includes stellar and AGN feedback in a volume of (100 Mpc$/h$)$^3$. The
filaments are constructed using the subspace constrained mean shift (SCMS;
Ozertem & Erdogmus (2011) and Chen et al. (2015a)). First, we show that
reconstructed filaments using galaxies and reconstructed filaments using dark
matter particles are similar to each other; over $50\%$ of the points on the
galaxy filaments have a corresponding point on the dark matter filaments within
distance $0.13$ Mpc$/h$ (and vice versa) and this distance is even smaller at
high-density regions. Second, we observe the alignment of the major principal
axis of a galaxy with respect to the orientation of its nearest filament and
detect a $2.5$ Mpc$/h$ critical radius for the filament's influence on the
alignment when the subhalo mass of this galaxy is between $10^9M_\odot/h$ and
$10^{12}M_\odot/h$. Moreover, we find the alignment signal to increase
significantly with the subhalo mass. Third, when a galaxy is close to filaments
(less than $0.25$ Mpc$/h$), the galaxy alignment toward the nearest galaxy
group depends on the galaxy subhalo mass. Finally, we find that galaxies close
to filaments or groups tend to be rounder than those away from filaments or
groups.
[19]
oai:arXiv.org:1409.1583 [pdf] - 1240868
Diverse Structural Evolution at z > 1 in Cosmologically Simulated
Galaxies
Submitted: 2014-09-04, last modified: 2015-07-02
From mock Hubble Space Telescope images, we quantify non-parametric
statistics of galaxy morphology, thereby predicting the emergence of
relationships among stellar mass, star formation, and observed rest-frame
optical structure at 1 < z < 3. We measure automated diagnostics of galaxy
morphology in cosmological simulations of the formation of 22 central galaxies
with 9.3 < log10 M_*/M_sun < 10.7. These high-spatial-resolution zoom-in
calculations enable accurate modeling of the rest-frame UV and optical
morphology. Even with small numbers of galaxies, we find that structural
evolution is neither universal nor monotonic: galaxy interactions can trigger
either bulge or disc formation, and optically bulge-dominated galaxies at this
mass may not remain so forever. Simulated galaxies with M_* > 10^10 M_sun
contain relatively more disc-dominated light profiles than those with lower
mass, reflecting significant disc brightening in some haloes at 1 < z < 2. By
this epoch, simulated galaxies with specific star formation rates below 10^-9.7
yr^-1 are more likely than normal star-formers to have a broader mix of
structural types, especially at M_* > 10^10 M_sun. We analyze a cosmological
major merger at z ~ 1.5 and find that the newly proposed MID morphology
diagnostics trace later merger stages while G-M20 trace earlier ones. MID is
sensitive also to clumpy star-forming discs. The observability time of typical
MID-enhanced events in our simulation sample is less than 100 Myr. A larger
sample of cosmological assembly histories may be required to calibrate such
diagnostics in the face of their sensitivity to viewing angle, segmentation
algorithm, and various phenomena such as clumpy star formation and minor
mergers.
[20]
oai:arXiv.org:1406.7536 [pdf] - 844312
Estimating the distribution of Galaxy Morphologies on a continuous space
Submitted: 2014-06-29
The incredible variety of galaxy shapes cannot be summarized by human-defined
discrete classes of shapes without causing a possibly large loss of
information. Dictionary learning and sparse coding allow us to reduce the high
dimensional space of shapes into a manageable low dimensional continuous vector
space. Statistical inference can be done in the reduced space via probability
distribution estimation and manifold estimation.
[21]
oai:arXiv.org:1404.3168 [pdf] - 809422
Functional Regression for Quasar Spectra
Submitted: 2014-04-11
The Lyman-alpha forest is a portion of the observed light spectrum of distant
galactic nuclei which allows us to probe remote regions of the Universe that
are otherwise inaccessible. The observed Lyman-alpha forest of a quasar light
spectrum can be modeled as a noisy realization of a smooth curve that is
affected by a `damping effect' which occurs whenever the light emitted by the
quasar travels through regions of the Universe with higher matter
concentration. To decode the information conveyed by the Lyman-alpha forest
about the matter distribution, we must be able to separate the smooth
`continuum' from the noise and the contribution of the damping effect in the
quasar light spectra. To predict the continuum in the Lyman-alpha forest, we
use a nonparametric functional regression model in which both the response and
the predictor variable (the smooth part of the damping-free portion of the
spectrum) are function-valued random variables. We demonstrate that the
proposed method accurately predicts the unobservable continuum in the
Lyman-alpha forest both on simulated spectra and real spectra. Also, we
introduce distribution-free prediction bands for the nonparametric functional
regression model that have finite sample guarantees. These prediction bands,
together with bootstrap-based confidence bands for the projection of the mean
continuum on a fixed number of principal components, allow us to assess the
degree of uncertainty in the model predictions.
[22]
oai:arXiv.org:1401.1867 [pdf] - 1202636
Nonparametric 3D map of the IGM using the Lyman-alpha forest
Submitted: 2014-01-08
Visualizing the high-redshift Universe is difficult due to the dearth of
available data; however, the Lyman-alpha forest provides a means to map the
intergalactic medium at redshifts not accessible to large galaxy surveys.
Large-scale structure surveys, such as the Baryon Oscillation Spectroscopic
Survey (BOSS), have collected quasar (QSO) spectra that enable the
reconstruction of HI density fluctuations. The data fall on a collection of
lines defined by the lines-of-sight (LOS) of the QSO, and a major issue with
producing a 3D reconstruction is determining how to model the regions between
the LOS. We present a method that produces a 3D map of this relatively
uncharted portion of the Universe by employing local polynomial smoothing, a
nonparametric methodology. We analyze the performance of the method on
simulated data that mimic the varying number of LOS expected in real data, and
then apply the method to a sample region selected from BOSS. We evaluate the
reconstruction by considering various features of the predicted 3D
maps including visual comparison of slices, PDFs, counts of local minima and
maxima, and standardized correlation functions. This 3D reconstruction allows
for an initial investigation of the topology of this portion of the Universe
using persistent homology.
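
The local polynomial smoothing named in the abstract above can be illustrated with a minimal one-dimensional sketch. This is not the authors' 3D implementation: the Gaussian kernel, the degree-one (local linear) fit, and the names `local_linear`, `xs`, `ys`, and `bandwidth` are all assumptions made for illustration.

```python
import math

def local_linear(x0, xs, ys, bandwidth):
    """Local linear (degree-1 local polynomial) estimate of y at x0.

    Fits a weighted least-squares line centred on x0, with Gaussian
    kernel weights (an illustrative choice), and returns its intercept.
    """
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    # Weighted moments of the centred predictor and the response.
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    # Solve the 2x2 normal equations for the intercept at x0.
    det = s0 * s2 - s1 * s1
    return (s2 * t0 - s1 * t1) / det

# Toy data on a line: a local linear fit reproduces a straight line
# exactly, up to floating-point rounding.
xs = [i / 10 for i in range(11)]
ys = [3.0 * x + 1.0 for x in xs]
estimate = local_linear(0.37, xs, ys, bandwidth=0.2)
```

In this sketch the bandwidth controls how far neighbouring points influence the estimate; how the paper chooses its smoothing parameters between lines of sight is not stated in the abstract.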
[23]
oai:arXiv.org:1306.1238 [pdf] - 1171844
New Image Statistics for Detecting Disturbed Galaxy Morphologies at High
Redshift
Submitted: 2013-06-05
Testing theories of hierarchical structure formation requires estimating the
distribution of galaxy morphologies and its change with redshift. One aspect of
this investigation involves identifying galaxies with disturbed morphologies
(e.g., merging galaxies). This is often done by summarizing galaxy images
using, e.g., the CAS and Gini-M20 statistics of Conselice (2003) and Lotz et
al. (2004), respectively, and associating particular statistic values with
disturbance. We introduce three statistics that enhance detection of disturbed
morphologies at high-redshift (z ~ 2): the multi-mode (M), intensity (I), and
deviation (D) statistics. We show their effectiveness by training a
machine-learning classifier, random forest, using 1,639 galaxies observed in
the H band by the Hubble Space Telescope WFC3, galaxies that had been
previously classified by eye by the CANDELS collaboration (Grogin et al. 2011,
Koekemoer et al. 2011). We find that the MID statistics (and the A statistic of
Conselice 2003) are the most useful for identifying disturbed morphologies.
We also explore whether human annotators are useful for identifying disturbed
morphologies. We demonstrate that they show limited ability to detect
disturbance at high redshift, and that increasing their number beyond
approximately 10 does not provably yield better classification performance. We
propose a simulation-based model-fitting algorithm that mitigates these issues
by bypassing annotation.
[24]
oai:arXiv.org:1105.6344 [pdf] - 489781
Prototype selection for parameter estimation in complex models
Submitted: 2011-05-31, last modified: 2012-03-20
Parameter estimation in astrophysics often requires the use of complex
physical models. In this paper we study the problem of estimating the
parameters that describe star formation history (SFH) in galaxies. Here,
high-dimensional spectral data from galaxies are appropriately modeled as
linear combinations of physical components, called simple stellar populations
(SSPs), plus some nonlinear distortions. Theoretical data for each SSP is
produced for a fixed parameter vector via computer modeling. Though the
parameters that define each SSP are continuous, optimizing the signal model
over a large set of SSPs on a fine parameter grid is computationally infeasible
and inefficient. The goal of this study is to estimate the set of parameters
that describes the SFH of each galaxy. These target parameters, such as the
average ages and chemical compositions of the galaxy's stellar populations, are
derived from the SSP parameters and the component weights in the signal model.
Here, we introduce a principled approach of choosing a small basis of SSP
prototypes for SFH parameter estimation. The basic idea is to quantize the
vector space and effective support of the model components. In addition to
greater computational efficiency, we achieve better estimates of the SFH target
parameters. In simulations, our proposed quantization method obtains a
substantial improvement in estimating the target parameters over the common
method of employing a parameter grid. Sparse coding techniques are not
appropriate for this problem without proper constraints, while constrained
sparse coding methods perform poorly for parameter estimation because their
objective is signal reconstruction, not estimation of the target parameters.
[25]
oai:arXiv.org:1111.0911 [pdf] - 434117
Exploiting Non-Linear Structure in Astronomical Data for Improved
Statistical Inference
Submitted: 2011-11-03
Many estimation problems in astrophysics are highly complex, with
high-dimensional, non-standard data objects (e.g., images, spectra, entire
distributions, etc.) that are not amenable to formal statistical analysis. To
utilize such data and make accurate inferences, it is crucial to transform the
data into a simpler, reduced form. Spectral kernel methods are non-linear data
transformation methods that efficiently reveal the underlying geometry of
observable data. Here we focus on one particular technique: diffusion maps or
more generally spectral connectivity analysis (SCA). We give examples of
applications in astronomy; e.g., photometric redshift estimation, prototype
selection for estimation of star formation history, and supernova light curve
classification. We outline some computational and statistical challenges that
remain, and we discuss some promising future directions for astronomy and data
mining.
[26]
oai:arXiv.org:1103.6034 [pdf] - 1053061
Semi-supervised Learning for Photometric Supernova Classification
Submitted: 2011-03-30, last modified: 2011-09-27
We present a semi-supervised method for photometric supernova typing. Our
approach is to first use the nonlinear dimension reduction technique diffusion
map to detect structure in a database of supernova light curves and
subsequently employ random forest classification on a spectroscopically
confirmed training set to learn a model that can predict the type of each newly
observed supernova. We demonstrate that this is an effective method for
supernova typing. As supernova numbers increase, our semi-supervised method
efficiently utilizes this information to improve classification, a property not
enjoyed by template-based methods. Applied to supernova data simulated by
Kessler et al. (2010b) to mimic those of the Dark Energy Survey, our methods
achieve (cross-validated) 95% Type Ia purity and 87% Type Ia efficiency on the
spectroscopic sample, but only 50% Type Ia purity and 50% efficiency on the
photometric sample due to their spectroscopic follow-up strategy. To improve
the performance on the photometric sample, we search for better spectroscopic
follow-up procedures by studying the sensitivity of our machine-learned
supernova classification to the specific strategy used to obtain training sets.
With a fixed amount of spectroscopic follow-up time, we find that deeper
magnitude-limited spectroscopic surveys are better for producing training sets.
For supernova Ia (II-P) typing, we obtain a 44% (1%) increase in purity to 72%
(87%) and 30% (162%) increase in efficiency to 65% (84%) of the sample using a
25th (24.5th) magnitude-limited survey instead of the shallower spectroscopic
sample used in the original simulations. When redshift information is
available, we incorporate it into our analysis using a novel method of altering
the diffusion map representation of the supernovae. Incorporating host
redshifts leads to a 5% improvement in Type Ia purity and 13% improvement in
Type Ia efficiency.
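
The purity and efficiency figures quoted above can be made concrete with a short sketch. It assumes the usual definitions (purity is the fraction of objects classified as the target type that truly are that type; efficiency is the fraction of true target-type objects that are recovered), which the abstract does not spell out. The function names and the toy labels are hypothetical.

```python
def purity_and_efficiency(true_types, predicted_types, target="Ia"):
    """Return (purity, efficiency) for one target class."""
    pairs = list(zip(true_types, predicted_types))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    purity = tp / (tp + fp) if (tp + fp) else float("nan")
    efficiency = tp / (tp + fn) if (tp + fn) else float("nan")
    return purity, efficiency

def relative_gain(before, after):
    """Fractional increase of `after` over `before`."""
    return (after - before) / before

# Toy labels, invented purely for illustration.
truth = ["Ia", "Ia", "Ia", "II-P", "II-P", "Ibc", "Ia", "II-P"]
pred = ["Ia", "Ia", "II-P", "Ia", "II-P", "Ibc", "Ia", "II-P"]
purity, efficiency = purity_and_efficiency(truth, pred)

# The quoted "44% increase in purity to 72%" is consistent with a
# relative gain over the 50% photometric-sample baseline.
gain = relative_gain(0.50, 0.72)
```

Read this way, the percentage increases in the abstract are relative gains rather than percentage-point differences.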
[27]
oai:arXiv.org:1010.0677 [pdf] - 1041050
The XMM Cluster Survey: X-ray analysis methodology
Lloyd-Davies, E. J.;
Romer, A. Kathy;
Mehrtens, Nicola;
Hosmer, Mark;
Davidson, Michael;
Sabirli, Kivanc;
Mann, Robert G.;
Hilton, Matt;
Liddle, Andrew R.;
Viana, Pedro T. P.;
Campbell, Heather C.;
Collins, Chris A.;
Dubois, E. Naomi;
Freeman, Peter;
Harrison, Craig D.;
Hoyle, Ben;
Kay, Scott T.;
Kuwertz, Emma;
Miller, Christopher J.;
Nichol, Robert C.;
Sahlen, Martin;
Stanford, S. A.;
Stott, John P.
Submitted: 2010-10-04, last modified: 2011-06-15
The XMM Cluster Survey (XCS) is a serendipitous search for galaxy clusters
using all publicly available data in the XMM-Newton Science Archive. Its main
aims are to measure cosmological parameters and trace the evolution of X-ray
scaling relations. In this paper we describe the data processing methodology
applied to the 5,776 XMM observations used to construct the current XCS source
catalogue. A total of 3,675 > 4-sigma cluster candidates with > 50
background-subtracted X-ray counts are extracted from a total non-overlapping
area suitable for cluster searching of 410 deg^2. Of these, 993 candidates are
detected with > 300 background-subtracted X-ray photon counts, and we
demonstrate that robust temperature measurements can be obtained down to this
count limit. We describe in detail the automated pipelines used to perform the
spectral and surface brightness fitting for these candidates, as well as to
estimate redshifts from the X-ray data alone. A total of 587 (122) X-ray
temperatures to a typical accuracy of < 40 (< 10) per cent have been measured
to date. We also present the methodology adopted for determining the selection
function of the survey, and show that the extended source detection algorithm
is robust to a range of cluster morphologies by inserting mock clusters derived
from hydrodynamical simulations into real XMM images. These tests show that the
simple isothermal beta-profile is sufficient to capture the essential details
of the cluster population detected in the archival XMM observations. The
redshift follow-up of the XCS cluster sample is presented in a companion paper,
together with a first data release of 503 optically-confirmed clusters.
[28]
oai:arXiv.org:1103.1603 [pdf] - 1052567
An Unbiased Method of Modeling the Local Peculiar Velocity Field with
Type Ia Supernovae
Submitted: 2011-03-08
We apply statistically rigorous methods of nonparametric risk estimation to
the problem of inferring the local peculiar velocity field from nearby
supernovae (SNIa). We use two nonparametric methods - Weighted Least Squares
(WLS) and Coefficient Unbiased (CU) - both of which employ spherical harmonics
to model the field and use the estimated risk to determine at which multipole
to truncate the series. We show that if the data are not drawn from a uniform
distribution or if there is power beyond the maximum multipole in the
regression, a bias is introduced on the coefficients using WLS. CU estimates
the coefficients without this bias by including the sampling density making the
coefficients more accurate but not necessarily modeling the velocity field more
accurately. After applying nonparametric risk estimation to SNIa data, we find
that there are not enough data at this time to measure power beyond the dipole.
The WLS Local Group bulk flow is moving at 538 +- 86 km/s towards (l,b) = (258
+- 10 deg, 36 +- 11 deg) and the CU bulk flow is moving at 446 +- 101 km/s
towards (l,b) = (273 +- 11 deg, 46 +- 8 deg). We find that the magnitude and
direction of these measurements are in agreement with each other and previous
results in the literature.
[29]
oai:arXiv.org:1006.4334 [pdf] - 1360716
On Computing Upper Limits to Source Intensities
Submitted: 2010-06-22
A common problem in astrophysics is determining how bright a source could be
and still not be detected. Despite the simplicity with which the problem can be
stated, the solution involves complex statistical issues that require careful
analysis. In contrast to the confidence bound, this concept has never been
formally analyzed, leading to a great variety of often ad hoc solutions. Here
we formulate and describe the problem in a self-consistent manner. Detection
significance is usually defined by the acceptable proportion of false positives
(the Type I error), and we invoke the complementary concept of false negatives
(the Type II error), based on the statistical power of a test, to compute an
upper limit to the detectable source intensity. To determine the minimum
intensity that a source must have for it to be detected, we first define a
detection threshold, and then compute the probabilities of detecting sources of
various intensities at the given threshold. The intensity that corresponds to
the specified Type II error probability defines that minimum intensity, and is
identified as the upper limit. Thus, an upper limit is a characteristic of the
detection procedure rather than the strength of any particular source and
should not be confused with confidence intervals or other estimates of source
intensity. This is particularly important given the large number of catalogs
that are being generated from increasingly sensitive surveys. We discuss the
differences between these upper limits and confidence bounds. Both measures are
useful quantities that should be reported in order to extract the most science
from catalogs, though they answer different statistical questions: an upper
bound describes an inference range on the source intensity, while an upper
limit calibrates the detection process. We provide a recipe for computing upper
limits that applies to all detection algorithms.
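
The two-step recipe described above (fix a detection threshold from the Type I error, then find the intensity detected with the required power) can be sketched for the simplest case. The sketch assumes a Poisson counting experiment with a known background, which is an illustrative choice and not the general setting treated in the paper. The names `poisson_sf`, `detection_threshold`, and `upper_limit` are hypothetical.

```python
import math

def poisson_sf(k, mu):
    """P(N >= k) for N ~ Poisson(mu)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)
    cdf = term
    for n in range(1, k):
        term *= mu / n
        cdf += term
    return max(0.0, 1.0 - cdf)

def detection_threshold(background, alpha):
    """Smallest count threshold with false-positive rate <= alpha."""
    k = 0
    while poisson_sf(k, background) > alpha:
        k += 1
    return k

def upper_limit(background, alpha, beta, tol=1e-6):
    """Smallest source intensity detected with probability >= 1 - beta.

    Returns (threshold, intensity). Detection means observing at least
    `threshold` counts from background plus source.
    """
    k = detection_threshold(background, alpha)
    lo, hi = 0.0, 1.0
    # Expand the bracket until the required power is reached.
    while poisson_sf(k, background + hi) < 1.0 - beta:
        hi *= 2.0
    # Bisect: detection probability increases with source intensity.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_sf(k, background + mid) >= 1.0 - beta:
            hi = mid
        else:
            lo = mid
    return k, hi

threshold, limit = upper_limit(background=3.0, alpha=0.01, beta=0.5)
```

The returned intensity depends only on the background, the threshold, and the two error rates, not on any observed source counts. This matches the abstract's point that an upper limit characterizes the detection procedure rather than a particular source.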
[30]
oai:arXiv.org:0906.0995 [pdf] - 1002464
Photometric Redshift Estimation Using Spectral Connectivity Analysis
Submitted: 2009-06-04
The development of fast and accurate methods of photometric redshift
estimation is a vital step towards being able to fully utilize the data of
next-generation surveys within precision cosmology. In this paper we apply a
specific approach to spectral connectivity analysis (SCA; Lee & Wasserman 2009)
called diffusion map. SCA is a class of non-linear techniques for transforming
observed data (e.g., photometric colours for each galaxy, where the data lie on
a complex subset of p-dimensional space) to a simpler, more natural coordinate
system wherein we apply regression to make redshift predictions. As SCA relies
upon eigen-decomposition, our training set size is limited to ~ 10,000
galaxies; we use the Nystrom extension to quickly estimate diffusion
coordinates for objects not in the training set. We apply our method to 350,738
SDSS main sample galaxies, 29,816 SDSS luminous red galaxies, and 5,223
galaxies from DEEP2 with CFHTLS ugriz photometry. For all three datasets, we
achieve prediction accuracies on par with previous analyses, and find that use
of the Nystrom extension leads to a negligible loss of prediction accuracy
relative to that achieved with the training sets. As in some previous analyses
(e.g., Collister & Lahav 2004, Ball et al. 2008), we observe that our
predictions are generally too high (low) in the low (high) redshift regimes. We
demonstrate that this is a manifestation of attenuation bias, wherein
measurement error (i.e., uncertainty in diffusion coordinates due to
uncertainty in the measured fluxes/magnitudes) reduces the slope of the
best-fit regression line. Mitigation of this bias is necessary if we are to use
photometric redshift estimates produced by computationally efficient empirical
methods in precision cosmology.
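
The attenuation bias described above can be reproduced with a small simulation. This is a generic errors-in-variables sketch, not the paper's diffusion-map regression: the Gaussian predictor, the noise levels, the true slope of 2, and all variable names are assumptions for illustration.

```python
import random

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 20000
true_slope = 2.0

# Error-free predictor and a response with small intrinsic scatter.
true_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [true_slope * t + rng.gauss(0.0, 0.1) for t in true_x]

# The same predictor observed with unit-variance measurement error.
noisy_x = [t + rng.gauss(0.0, 1.0) for t in true_x]

slope_clean = ols_slope(true_x, y)
slope_noisy = ols_slope(noisy_x, y)

# Classical result: the fitted slope shrinks by
# var(x) / (var(x) + var(error)), which is 1/2 here.
expected_noisy_slope = true_slope * 1.0 / (1.0 + 1.0)
```

A flattened regression line over-predicts at the low end and under-predicts at the high end, which is the pattern of too-high (too-low) estimates in the low (high) redshift regimes that the abstract reports.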
[31]
oai:arXiv.org:0905.4683 [pdf] - 1002383
Accurate parameter estimation for star formation history in galaxies
using SDSS spectra
Submitted: 2009-05-28
To further our knowledge of the complex physical process of galaxy formation,
it is essential that we characterize the formation and evolution of large
databases of galaxies. The spectral synthesis STARLIGHT code of Cid Fernandes
et al. (2004) was designed for this purpose. Results of STARLIGHT are highly
dependent on the choice of input basis of simple stellar population (SSP)
spectra. The run time of the code, which uses random walks through the parameter
space, scales as the square of the number of basis spectra, making it
computationally necessary to choose a small number of SSPs that are coarsely
sampled in age and metallicity. In this paper, we develop methods based on
diffusion map (Lafon & Lee, 2006) that, for the first time, choose appropriate
bases of prototype SSP spectra from a large set of SSP spectra designed to
approximate the continuous grid of age and metallicity of SSPs of which
galaxies are truly composed. We show that our techniques achieve better
accuracy of physical parameter estimation for simulated galaxies. Specifically,
we show that our methods significantly decrease the age-metallicity degeneracy
that is common in galaxy population synthesis methods. We analyze a sample of
3046 galaxies in SDSS DR6 and compare the parameter estimates obtained from
different basis choices.
[32]
oai:arXiv.org:0805.4136 [pdf] - 12977
Inference for the dark energy equation of state using Type IA supernova
data
Submitted: 2008-05-27, last modified: 2009-05-18
The surprising discovery of an accelerating universe led cosmologists to
posit the existence of "dark energy"--a mysterious energy field that permeates
the universe. Understanding dark energy has become the central problem of
modern cosmology. After describing the scientific background in depth, we
formulate the task as a nonlinear inverse problem that expresses the comoving
distance function in terms of the dark-energy equation of state. We present two
classes of methods for making sharp statistical inferences about the equation
of state from observations of Type Ia Supernovae (SNe). First, we derive a
technique for testing hypotheses about the equation of state that requires no
assumptions about its form and can distinguish among competing theories.
Second, we present a framework for computing parametric and nonparametric
estimators of the equation of state, with an associated assessment of
uncertainty. Using our approach, we evaluate the strength of statistical
evidence for various competing models of dark energy. Consistent with current
studies, we find that with the available Type Ia SNe data, it is not possible
to distinguish statistically among popular dark-energy models, and that, in
particular, there is no support in the data for rejecting a cosmological
constant. With much more supernova data likely to be available in coming years
(e.g., from the DOE/NASA Joint Dark Energy Mission), we address the more
interesting question of whether future data sets will have sufficient
resolution to distinguish among competing theories.
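
The dependence of comoving distance on the dark-energy equation of state, which the abstract above frames as an inverse problem, can be sketched in the forward direction. The sketch assumes a spatially flat universe with matter plus dark energy of constant equation of state w, and works in units of c/H0. These are simplifying assumptions for illustration; the paper treats the equation of state without assuming a form. The function names and the default `omega_m=0.3` are hypothetical.

```python
import math

def dimensionless_hubble(z, omega_m, w):
    """E(z) = H(z)/H0 for a flat universe with constant w."""
    matter = omega_m * (1.0 + z) ** 3
    dark_energy = (1.0 - omega_m) * (1.0 + z) ** (3.0 * (1.0 + w))
    return math.sqrt(matter + dark_energy)

def comoving_distance(z, omega_m=0.3, w=-1.0, steps=1000):
    """Comoving distance in units of c/H0: integral of dz'/E(z') from 0 to z.

    Uses composite Simpson's rule, which needs an even number of steps.
    """
    if steps % 2:
        steps += 1
    h = z / steps
    total = (1.0 / dimensionless_hubble(0.0, omega_m, w)
             + 1.0 / dimensionless_hubble(z, omega_m, w))
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / dimensionless_hubble(i * h, omega_m, w)
    return total * h / 3.0

# Cosmological constant (w = -1) versus a less negative constant w.
d_lambda = comoving_distance(1.0, omega_m=0.3, w=-1.0)
d_alt = comoving_distance(1.0, omega_m=0.3, w=-0.8)
```

With omega_m = 1 the integral has the closed form 2(1 - 1/sqrt(1+z)), which gives a convenient check of the numerical integration. The difference between `d_lambda` and `d_alt` at fixed redshift is the kind of signal that supernova distance measurements would need to resolve.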
[33]
oai:arXiv.org:0802.4462 [pdf] - 1934218
The XMM Cluster Survey: Forecasting cosmological and cluster
scaling-relation parameter constraints
Sahlén, Martin;
Viana, Pedro T. P.;
Liddle, Andrew R.;
Romer, A. Kathy;
Davidson, Michael;
Hosmer, Mark;
Lloyd-Davies, Ed;
Sabirli, Kivanc;
Collins, Chris A.;
Freeman, Peter E.;
Hilton, Matt;
Hoyle, Ben;
Kay, Scott T.;
Mann, Robert G.;
Mehrtens, Nicola;
Miller, Christopher J.;
Nichol, Robert C.;
Stanford, S. Adam;
West, Michael J.
Submitted: 2008-02-29, last modified: 2009-04-26
We forecast the constraints on the values of sigma_8, Omega_m, and cluster
scaling relation parameters which we expect to obtain from the XMM Cluster
Survey (XCS). We assume a flat Lambda-CDM Universe and perform a Monte Carlo
Markov Chain analysis of the evolution of the number density of galaxy clusters
that takes into account a detailed simulated selection function. Our
current observed number of clusters agrees well with predictions. We
determine the expected degradation of the constraints as a result of
self-calibrating the luminosity-temperature relation (with scatter), including
temperature measurement errors, and relying on photometric methods for the
estimation of galaxy cluster redshifts. We examine the effects of systematic
errors in scaling relation and measurement error assumptions. Using only (T,z)
self-calibration, we expect to measure Omega_m to +-0.03 (and Omega_Lambda to
the same accuracy assuming flatness), and sigma_8 to +-0.05, also constraining
the normalization and slope of the luminosity-temperature relation to +-6 and
+-13 per cent (at 1sigma) respectively in the process. Self-calibration fails
to jointly constrain the scatter and redshift evolution of the
luminosity-temperature relation significantly. Additional archival and/or
follow-up data will improve on this. We do not expect measurement errors or
imperfect knowledge of their distribution to degrade constraints significantly.
Scaling-relation systematics can easily lead to cosmological constraints 2sigma
or more away from the fiducial model. Our treatment is the first exact
treatment to this level of detail, and introduces a new `smoothed ML' estimate
of expected constraints.
[34]
oai:arXiv.org:0809.2800 [pdf] - 16406
Revealing components of the galaxy population through nonparametric
techniques
Submitted: 2008-09-16
The distributions of galaxy properties vary with environment, and are often
multimodal, suggesting that the galaxy population may be a combination of
multiple components. The behaviour of these components versus environment holds
details about the processes of galaxy development. To release this information
we apply a novel, nonparametric statistical technique, identifying four
components present in the distribution of galaxy H$\alpha$ emission-line
equivalent-widths. We interpret these components as passive, star-forming, and
two varieties of active galactic nuclei. Independent of this interpretation,
the properties of each component are remarkably constant as a function of
environment. Only their relative proportions display substantial variation. The
galaxy population thus appears to comprise distinct components which are
individually independent of environment, with galaxies rapidly transitioning
between components as they move into denser environments.
[35]
oai:arXiv.org:0807.2900 [pdf] - 314999
Exploiting Low-Dimensional Structure in Astronomical Spectra
Submitted: 2008-07-18
Dimension-reduction techniques can greatly improve statistical inference in
astronomy. A standard approach is to use Principal Components Analysis (PCA).
In this work we apply a recently-developed technique, diffusion maps, to
astronomical spectra for data parameterization and dimensionality reduction,
and develop a robust, eigenmode-based framework for regression. We show how our
framework provides a computationally efficient means by which to predict
redshifts of galaxies, and thus could inform more expensive redshift estimators
such as template cross-correlation. It also provides a natural means by which
to identify outliers (e.g., misclassified spectra, spectra with anomalous
features). We analyze 3835 SDSS spectra and show how our framework yields a
more than 95% reduction in dimensionality. Finally, we show that the prediction
error of the diffusion map-based regression approach is markedly smaller than
that of a similar approach based on PCA, clearly demonstrating the superiority
of diffusion maps over PCA for this regression task.
[36]
oai:arXiv.org:astro-ph/0510844 [pdf] - 77340
Massive Science with VO and Grids
Nichol, Robert;
Smith, Garry;
Miller, Christopher;
Freeman, Peter;
Genovese, Chris;
Wasserman, Larry;
Bryan, Brent;
Gray, Alexander;
Schneider, Jeff;
Moore, Andrew
Submitted: 2005-10-31
There is a growing need for massive computational resources for the analysis
of new astronomical datasets. To tackle this problem, we present here our first
steps towards marrying two new and emerging technologies: the Virtual
Observatory (e.g., AstroGrid) and the computational grid (e.g., TeraGrid, COSMOS
etc.). We discuss the construction of VOTechBroker, which is a modular software
tool designed to abstract the tasks of submission and management of a large
number of computational jobs to a distributed computer system. The broker will
also interact with the AstroGrid workflow and MySpace environments. We discuss
our planned usages of the VOTechBroker in computing a huge number of n-point
correlation functions from the SDSS data and massive model-fitting of millions
of CMBfast models to WMAP data. We also discuss other applications including
the determination of the XMM Cluster Survey selection function and the
construction of new WMAP maps.
[37]
oai:arXiv.org:astro-ph/0510406 [pdf] - 260658
Examining the Effect of the Map-Making Algorithm on Observed Power
Asymmetry in WMAP Data
Submitted: 2005-10-13
We analyze first-year data of WMAP to determine the significance of asymmetry
in summed power between arbitrarily defined opposite hemispheres, using maps
that we create ourselves with software developed independently of the WMAP
team. We find that over the multipole range l=[2,64], the significance of
asymmetry is ~ 10^-4, a value insensitive to both frequency and power spectrum.
We determine the smallest multipole ranges exhibiting significant asymmetry,
and find twelve, including l=[2,3] and [6,7], for which the significance -> 0.
In these ranges there is an improbable association between the direction of
maximum significance and the ecliptic plane (p ~ 0.01). Also, contours of least
significance follow great circles inclined relative to the ecliptic at the
largest scales. The great circle for l=[2,3] passes over previously reported
preferred axes and is insensitive to frequency, while the great circle for
l=[6,7] is aligned with the ecliptic poles. We examine how changing map-making
parameters affects asymmetry, and find that at large scales, it is rendered
insignificant if the magnitude of the WMAP dipole vector is increased by
approximately 1-3 sigma (or 2-6 km/s). While confirmation of this result would
require data recalibration, such a systematic change would be consistent with
observations of frequency-independent asymmetry. We conclude that the use of an
incorrect dipole vector, in combination with a systematic or foreground process
associated with the ecliptic, may help to explain the observed asymmetry.
[38]
oai:arXiv.org:astro-ph/0501056 [pdf] - 70168
Chandra Observations of MBM12 and Models of the Local Bubble
Submitted: 2005-01-04
Chandra observations toward the nearby molecular cloud MBM12 show
unexpectedly strong and nearly equal foreground O VIII and O VII emission. As
the observed portion of MBM12 is optically thick at these energies, the
emission lines must be formed nearby, coming either from the Local Bubble (LB)
or charge exchange with ions from the Sun. Equilibrium models for the LB
predict stronger O VII than O VIII, so these results suggest that the LB is far
from equilibrium or a substantial portion of O VIII is from another source,
such as charge exchange within the Solar system. Despite the likely
contamination, we can combine our results with other EUV and X-ray observations
to reject LB models which posit a cool recombining plasma as the source of LB
X-rays.
[39]
oai:arXiv.org:astro-ph/0308493 [pdf] - 554543
Chandra Multi-wavelength Project (ChaMP). II. First Results of X-ray
Source Properties
Kim, D. -W.;
Wilkes, B. J.;
Green, P. J.;
Cameron, R. A.;
Drake, J. J.;
Evans, N. R.;
Freeman, P.;
Gaetz, T. J.;
Ghosh, H.;
Harnden, F. R.;
Karovska, M.;
Kashyap, V.;
Maksym, P. W.;
Ratzlaff, P. W.;
Schlegel, E. M.;
Silverman, J. D.;
Tananbaum, H. D.;
Vikhlinin, A. A.
Submitted: 2003-08-27
We present the first results of ChaMP X-ray source properties obtained from
the initial sample of 62 observations. The data have been uniformly reduced and
analyzed with techniques specifically developed for the ChaMP and then
validated by visual examination. Utilizing only near on-axis, bright X-ray
sources (to avoid problems caused by incompleteness and the Eddington bias), we
derive the Log(N)-Log(S) relation in soft (0.5-2 keV) and hard (2-8 keV) energy
bands. The ChaMP data are consistent with previous results of ROSAT, ASCA and
Chandra deep surveys. In particular, our data nicely fill in the flux gap in
the hard band between the Chandra Deep Field data and the previous ASCA data.
We check whether there is any systematic difference in the source density
between cluster and non-cluster fields and also search for field-to-field
variations, both of which have been previously reported. We found no
significant field-to-field cosmic variation in either test within the
statistics (~1 sigma) across the flux levels included in our sample. In the
X-ray color-color plot, most sources fall in the location characterized by
photon index = 1.5-2 and N_H = a few x 10^20 cm^-2, suggesting that they are
typical broad-line AGNs. There also exist a considerable number of sources with
peculiar X-ray colors (e.g., highly absorbed, very hard, very soft). We confirm
a trend that on average the X-ray color hardens as the count rate decreases.
Since the hardening is confined to the softest energy band (0.3-0.9 keV), we
conclude it is most likely due to absorption. We cross-correlate the X-ray
sources with other catalogs and describe their properties in terms of optical
color, X-ray-to-optical luminosity ratio and X-ray colors.
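The Log(N)-Log(S) relation mentioned above is the cumulative number of sources brighter than a flux S per unit sky area. A minimal sketch follows, assuming a single uniform survey area (a real analysis would need flux-dependent sky coverage and the completeness corrections the abstract alludes to). The function name and the flux values are hypothetical and are not taken from the ChaMP catalog.

```python
from bisect import bisect_right


def log_n_log_s(fluxes, area_deg2, thresholds):
    """Return N(>S) per square degree for each flux threshold S.

    Assumes every source was detectable over the same area, which is a
    simplification made only for this illustration.
    """
    ordered = sorted(fluxes)
    total = len(ordered)
    # bisect_right counts sources with flux <= S; the rest are brighter than S.
    return [(total - bisect_right(ordered, s)) / area_deg2 for s in thresholds]


# Hypothetical fluxes in erg s^-1 cm^-2 (illustrative values only).
fluxes = [2e-15, 5e-15, 1e-14, 3e-14, 8e-14]
counts = log_n_log_s(fluxes, area_deg2=2.0, thresholds=[1e-15, 1e-14, 1e-13])
```

Plotting log10 of these densities against log10 of the thresholds gives the usual curve.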
[40]
oai:arXiv.org:astro-ph/0308492 [pdf] - 554542
Chandra Multi-wavelength Project (ChaMP). I. First X-ray Source Catalog
Kim, D. -W.;
Cameron, R. A.;
Drake, J. J.;
Evans, N. R.;
Freeman, P.;
Gaetz, T. J.;
Ghosh, H.;
Green, P. J.;
Harnden, F. R.;
Karovska, M.;
Kashyap, V.;
Maksym, P. W.;
Ratzlaff, P. W.;
Schlegel, E. M.;
Silverman, J. D.;
Tananbaum, H. D.;
Vikhlinin, A. A.;
Wilkes, B. J.;
Grimes, J. P.
Submitted: 2003-08-27
The Chandra Multi-wavelength Project (ChaMP) is a wide-area (~14 deg^2)
survey of serendipitous Chandra X-ray sources, aiming to establish fair
statistical samples covering a wide range of characteristics (such as absorbed
AGNs, high z clusters of galaxies) at flux levels (f_X ~ 10^-15 - 10^-14 erg
s^-1 cm^-2) intermediate between the Chandra Deep surveys and previous
missions. We present the first ChaMP catalog, which consists of 991 near
on-axis, bright X-ray sources obtained from the initial sample of 62
observations. The data have been uniformly reduced and analyzed with techniques
specifically developed for the ChaMP and then validated by visual examination.
To assess source reliability and positional uncertainty, we perform a series of
simulations and also use Chandra data to complement the simulation study. The
false source detection rate is found to be as good as or better than expected
for a given limiting threshold. On the other hand, the chance of missing a real
source is rather complex, depending on the source counts, off-axis distance (or
PSF), and background rate. The positional error (95% confidence level) is
usually < 1" for a bright source, regardless of its off-axis distance, while it
can be as large as 4" for a weak source (~20 counts) at a large off-axis
distance (D_off-axis > 8'). We have also developed new methods to find spatially
extended or temporally variable sources, and those sources are listed in the
catalog.
[41]
oai:arXiv.org:astro-ph/0204159 [pdf] - 48704
Is RX J185635-375 a Quark Star?
Drake, J. J.;
Marshall, H. L.;
Dreizler, S.;
Freeman, P. E.;
Fruscione, A.;
Juda, M.;
Kashyap, V.;
Nicastro, F.;
Pease, D. O.;
Wargelin, B. J.;
Werner, K.
Submitted: 2002-04-09
Deep Chandra LETG+HRC-S observations of the isolated neutron star candidate
RX J1856.5-3754 have been analysed to search for metallic and resonance
cyclotron spectral features and for pulsation behaviour. As found from earlier
observations, the X-ray spectrum is well-represented by a ~ 60 eV (7e5 K)
blackbody. No unequivocal evidence of spectral line or edge features has been
found, arguing against metal-dominated models. The data contain no evidence for
pulsation and we place a 99% confidence upper limit of 2.7% on the
unaccelerated pulse fraction over a wide frequency range from 1e-4 to 100 Hz.
We argue that the derived interstellar medium neutral hydrogen column density
of 8e19 <= N_H <= 1.1e20 per sq. cm favours the larger distance from two recent
HST parallax analyses, placing RX J1856.5-3754 at ~ 140 pc instead of ~ 60 pc,
and in the outskirts of the R CrA dark molecular cloud. That such a
comparatively rare region of high ISM density is precisely where an isolated
neutron star re-heated by accretion of interstellar matter would be expected is
either entirely coincidental, or current theoretical arguments excluding this
scenario for RX J1856.5-3754 are premature. Taken at face value, the combined
observational evidence -- a lack of spectral and temporal features and an
implied radius at infinity of 3.8-8.2 km that is too small for current neutron
star models -- points to a more compact object, such as allowed for quark
matter equations of state.
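The correspondence between the quoted blackbody energy (~ 60 eV) and temperature (7e5 K) follows from T = kT / k_B. A short check is sketched below; the constant is the CODATA value of the Boltzmann constant in eV/K, and the function name is illustrative.

```python
# Boltzmann constant in eV per kelvin (CODATA 2018 value).
BOLTZMANN_EV_PER_K = 8.617333262e-5


def kt_ev_to_kelvin(kt_ev):
    """Convert a thermal energy kT in eV to a temperature in kelvin."""
    return kt_ev / BOLTZMANN_EV_PER_K


# The abstract quotes kT ~ 60 eV; this gives roughly 7e5 K.
temperature_k = kt_ev_to_kelvin(60.0)
```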
[42]
oai:arXiv.org:astro-ph/0108429 [pdf] - 44405
A Wavelet-Based Algorithm for the Spatial Analysis of Poisson Data
Submitted: 2001-08-27
Wavelets are scalable, oscillatory functions that deviate from zero only
within a limited spatial regime and have average value zero. In addition to
their use as source characterizers, wavelet functions are rapidly gaining
currency within the source detection field. Wavelet-based source detection
involves the correlation of scaled wavelet functions with binned,
two-dimensional image data. If the chosen wavelet function exhibits the
property of vanishing moments, significantly non-zero correlation coefficients
will be observed only where there are high-order variations in the data; e.g.,
they will be observed in the vicinity of sources.
In this paper, we describe the mission-independent, wavelet-based source
detection algorithm WAVDETECT, part of the CIAO software package. Aspects of
our algorithm include: (1) the computation of local, exposure-corrected
normalized (i.e. flat-fielded) background maps; (2) the correction for exposure
variations within the field-of-view; (3) its applicability within the
low-counts regime, as it does not require a minimum number of background counts
per pixel for the accurate computation of source detection thresholds; (4) the
generation of a source list in a manner that does not depend upon a detailed
knowledge of the point spread function (PSF) shape; and (5) error analysis.
These features make our algorithm considerably more general than previous
methods developed for the analysis of X-ray image data, especially in the low
count regime. We demonstrate the algorithm's robustness by applying it to
various images.
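The role of the zero-mean property described above can be illustrated with a one-dimensional toy example. This is only a sketch of wavelet correlation in general, not the WAVDETECT algorithm: it assumes a 1-D Mexican-hat (Ricker) profile, subtracts the mean of the sampled kernel so that the discrete kernel sums to zero, and replicates edge values at the boundaries. All names and the test data are hypothetical, and no detection threshold is computed.

```python
import math


def mexican_hat_kernel(sigma, half_width):
    """Sample a 1-D Mexican-hat profile on integer pixels.

    The sampled values are shifted by their mean so the discrete kernel
    sums to zero (an assumption of this sketch, made to mimic the
    zero-average property of the continuous wavelet).
    """
    raw = [(1.0 - (x / sigma) ** 2) * math.exp(-0.5 * (x / sigma) ** 2)
           for x in range(-half_width, half_width + 1)]
    mean = sum(raw) / len(raw)
    return [value - mean for value in raw]


def correlate(data, kernel):
    """Correlate binned data with the kernel, replicating edge pixels."""
    half = len(kernel) // 2
    last = len(data) - 1
    out = []
    for i in range(len(data)):
        total = 0.0
        for j, weight in enumerate(kernel):
            k = min(max(i + j - half, 0), last)  # clamp index at the edges
            total += weight * data[k]
        out.append(total)
    return out


# Hypothetical data: flat background of 3 counts per bin plus one source.
counts = [3] * 41
counts[20] += 30

kernel = mexican_hat_kernel(sigma=2.0, half_width=8)
coeffs = correlate(counts, kernel)
peak_index = max(range(len(coeffs)), key=coeffs.__getitem__)
```

Because the kernel sums to zero, the flat background contributes essentially nothing to the coefficients, and the largest coefficient falls on the bin containing the source.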
[43]
oai:arXiv.org:astro-ph/0108426 [pdf] - 44402
Sherpa: a Mission-Independent Data Analysis Application
Submitted: 2001-08-27
The ever-increasing quality and complexity of astronomical data underscore
the need for new and powerful data analysis applications. This need has led to
the development of Sherpa, a modeling and fitting program in the CIAO software
package that enables the analysis of multi-dimensional, multi-wavelength data.
In this paper, we present an overview of Sherpa's features, which include:
support for a wide variety of input and output data formats, including the new
Model Descriptor List (MDL) format; a model language which permits the
construction of arbitrarily complex model expressions, including ones
representing instrument characteristics; a wide variety of fit statistics and
methods of optimization, model comparison, and parameter estimation;
multi-dimensional visualization, provided by ChIPS; and new interactive
analysis capabilities provided by embedding the S-Lang interpreted scripting
language. We conclude by showing example Sherpa analysis sessions.
[44]
oai:arXiv.org:astro-ph/9906395 [pdf] - 107116
Resonant Cyclotron Radiation Transfer Model Fits to Spectra from
Gamma-Ray Burst GRB870303
Submitted: 1999-06-24
We demonstrate that models of resonant cyclotron radiation transfer in a
strong field (i.e. cyclotron scattering) can account for spectral lines seen at
two epochs, denoted S1 and S2, in the Ginga data for GRB870303. Using a
generalized version of the Monte Carlo code of Wang et al. (1988,1989b), we
model line formation by injecting continuum photons into a static
plane-parallel slab of electrons threaded by a strong neutron star magnetic
field (~ 10^12 G) which may be oriented at an arbitrary angle relative to the
slab normal. We examine two source geometries, which we denote "1-0" and "1-1,"
with the numbers representing the relative electron column densities above and
below the continuum photon source plane. We compare azimuthally symmetric
models, i.e. models in which the magnetic field is parallel to the slab normal,
with models having more general magnetic field orientations. If the bursting
source has a simple dipole field, these two model classes represent line
formation at the magnetic pole, or elsewhere on the stellar surface. We find
that the data of S1 and S2, considered individually, are consistent with both
geometries, and with all magnetic field orientations, with the exception that
the S1 data clearly favor line formation away from a polar cap in the 1-1
geometry, with the best-fit model placing the line-forming region at the
magnetic equator. Within both geometries, fits to the combined (S1+S2) data
marginally favor models which feature equatorial line formation, and in which
the observer's orientation with respect to the slab changes between the two
epochs. We interpret this change as being due to neutron star rotation, and we
place limits on the rotation period.
[45]
oai:arXiv.org:astro-ph/9906394 [pdf] - 107115
Statistical Analysis of Spectral Line Candidates in Gamma-Ray Burst
GRB870303
Submitted: 1999-06-24
The Ginga data for the gamma-ray burst GRB870303 exhibit low-energy dips in
two temporally distinct spectra, denoted S1 and S2. S1, spanning 4 s, exhibits
a single line candidate at ~ 20 keV, while S2, spanning 9 s, exhibits
apparently harmonically spaced line candidates at ~ 20 and 40 keV. We evaluate
the statistical evidence for these lines, using phenomenological continuum and
line models which in their details are independent of the distance scale to
gamma-ray bursts. We employ the methodologies based on both frequentist and
Bayesian statistical inference that we develop in Freeman et al. (1999b). These
methodologies utilize the information present in the data to select the
simplest model that adequately describes the data from among a wide range of
continuum and continuum-plus-line(s) models. This ensures that the chosen model
does not include free parameters that the data deem unnecessary and that would
act to reduce the frequentist significance and Bayesian odds of the
continuum-plus-line(s) model. We calculate the significance of the
continuum-plus-line(s) models using the Chi-Square Maximum Likelihood Ratio
test. We describe a parametrization of the exponentiated Gaussian absorption
line shape that makes the probability surface in parameter space
better-behaved, allowing us to estimate analytically the Bayesian odds. The
significances of the continuum-plus-line models requested by the S1 and S2 data
are 3.6 x 10^-5 and 1.7 x 10^-4, respectively, with the odds favoring them being
114:1 and 7:1. We also apply our methodology to the combined (S1+S2) data. The
significance of the continuum-plus-lines model requested by the combined data
is 4.2 x 10^-8, with the odds favoring it being 40,300:1.
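As a rough illustration of how a likelihood ratio test significance is obtained, the sketch below evaluates the chi-square tail probability for a given drop in chi-square when extra line parameters are added, and converts odds into a probability. It assumes nested models and the usual regularity conditions, and it treats the odds as being between two exhaustive alternatives. The delta chi-square and parameter count are hypothetical (the abstract does not quote them), and the function names are illustrative.

```python
import math


def chi2_survival(x, dof):
    """Return P(chi^2 with `dof` degrees of freedom > x).

    Starts from the closed form for dof = 1 or 2 and steps up using
    Q(k+2, x) = Q(k, x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    """
    if x <= 0.0:
        return 1.0
    half_x = 0.5 * x
    if dof % 2 == 0:
        k, q = 2, math.exp(-half_x)
    else:
        k, q = 1, math.erfc(math.sqrt(half_x))
    while k < dof:
        q += half_x ** (0.5 * k) * math.exp(-half_x) / math.gamma(0.5 * k + 1.0)
        k += 2
    return q


def odds_to_probability(odds):
    """Convert odds of O:1 in favor of a model into a probability."""
    return odds / (1.0 + odds)


# Hypothetical inputs: chi-square drops by 20 when 3 line parameters are added.
significance = chi2_survival(20.0, 3)

# Odds quoted in the abstract for the S1 data (114:1).
probability_s1 = odds_to_probability(114.0)
```

A smaller significance means the improvement in fit is less likely to arise by chance under the continuum-only model.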
[46]
oai:arXiv.org:astro-ph/9601167 [pdf] - 94022
BATSE SD Observations of Hercules X-1
Submitted: 1996-01-29
The cyclotron line in the spectrum of the accretion-powered pulsar Her X-1
offers an opportunity to assess the ability of the BATSE Spectroscopy Detectors
(SDs) to detect lines like those seen in some GRBs. Preliminary analysis of an
initial SD pulsar mode observation of Her X-1 indicated a cyclotron line at an
energy of approximately 44 keV, rather than at the expected energy of
approximately 36 keV. Our analysis of four SD pulsar mode observations of Her
X-1 made during high states of its 35 day cycle confirms this result. We
consider a number of phenomenological models for the continuum spectrum and the
cyclotron line. This ensures that we use the simplest models that adequately
describe the data, and that our results are robust. We find modest evidence
(significance Q ~ 10^-4 to 10^-2) for a line at approximately 44 keV in the data
of the first observation. Joint fits to the four observations provide stronger
evidence (Q ~ 10^-7 to 10^-4) for the line. Such a shift in the cyclotron line
energy of an accretion-powered pulsar is unprecedented.