Normalized to: Cisewski-Kehe, J.
[1]
oai:arXiv.org:2005.14083 [pdf] - 2103539
A Hermite-Gaussian Based Radial Velocity Estimation Method
Submitted: 2020-05-28
As the first successful technique used to detect exoplanets orbiting distant
stars, the Radial Velocity Method aims to detect a periodic Doppler shift in a
star's spectrum. We introduce a new, mathematically rigorous approach to
detect such a signal that accounts for functional relationships of neighboring
wavelengths, minimizes the role of wavelength interpolation, accounts for
heteroskedastic noise, and easily allows for statistical inference. Using
Hermite-Gaussian functions, we show that the problem of detecting a Doppler
shift in the spectrum can be reduced to linear regression in many settings. A
simulation study demonstrates that the proposed method is able to accurately
estimate an individual spectrum's radial velocity with precision below 0.3 m/s.
Furthermore, the new method outperforms the traditional Cross-Correlation
Function approach by reducing the root mean squared error by up to 15 cm/s. The
proposed method is also demonstrated on a new set of observations from the
EXtreme PREcision Spectrometer (EXPRES) for the star 51 Pegasi, and
successfully recovers estimates that agree well with previous studies of this
planetary system. Data and Python3 code associated with this work can be found
at https://github.com/parkerholzer/hgrv_method. The method is also implemented
in the open source R package rvmethod.
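The reduction to linear regression can be illustrated with a first-order Taylor argument: for a small velocity v, a template f observed at wavelength lam is approximately f(lam) - (v/c) * lam * f'(lam), and for a Gaussian absorption line f' is proportional to the first-degree Hermite-Gaussian function. The sketch below is a minimal illustration of that idea only, not the code from the repository or the rvmethod package; the line parameters, the function names, and the optional inverse-variance weights are assumptions made for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s

# Hypothetical single-line template; all parameter values are illustrative.
def line_profile(lam, center=5000.0, depth=0.6, width=0.05):
    """Normalized flux of one Gaussian absorption line."""
    return 1.0 - depth * math.exp(-0.5 * ((lam - center) / width) ** 2)

def line_derivative(lam, center=5000.0, depth=0.6, width=0.05):
    """Analytic d(flux)/d(lam); proportional to a first-degree Hermite-Gaussian."""
    z = (lam - center) / width
    return depth * z / width * math.exp(-0.5 * z * z)

def estimate_rv(wavelengths, observed, weights=None):
    """Weighted least-squares slope through the origin.

    Response:  observed flux minus template flux.
    Predictor: -lam * f'(lam) / c, so the fitted slope is the velocity in m/s.
    Weights (for example 1 / variance) allow for heteroskedastic noise.
    """
    if weights is None:
        weights = [1.0] * len(wavelengths)
    num = den = 0.0
    for lam, flux, w in zip(wavelengths, observed, weights):
        x = -lam * line_derivative(lam) / C
        y = flux - line_profile(lam)
        num += w * x * y
        den += w * x * x
    return num / den

# Noise-free check: shift the template by a known velocity and recover it.
grid = [4999.5 + 0.005 * i for i in range(201)]
true_v = 10.0  # m/s
shifted = [line_profile(lam / (1.0 + true_v / C)) for lam in grid]
v_hat = estimate_rv(grid, shifted)
```

Because the model is linear in v, standard regression output (standard errors, confidence intervals) would be available directly, which is the sense in which statistical inference becomes easy.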
[2]
oai:arXiv.org:1908.07151 [pdf] - 2034397
Trend Filtering -- I. A Modern Statistical Tool for Time-Domain
Astronomy and Astronomical Spectroscopy
Submitted: 2019-08-19, last modified: 2020-01-10
The problem of denoising a one-dimensional signal possessing varying degrees
of smoothness is ubiquitous in time-domain astronomy and astronomical
spectroscopy. For example, in the time domain, an astronomical object may
exhibit a smoothly varying intensity that is occasionally interrupted by abrupt
dips or spikes. Likewise, in the spectroscopic setting, a noiseless spectrum
typically contains intervals of relative smoothness mixed with localized higher
frequency components such as emission peaks and absorption lines. In this work,
we present trend filtering, a modern nonparametric statistical tool that yields
significant improvements in this broad problem space of denoising spatially
heterogeneous signals. When the underlying signal is spatially heterogeneous,
trend filtering is superior to any statistical estimator that is a linear
combination of the observed data---including kernel smoothers, LOESS, smoothing
splines, Gaussian process regression, and many other popular methods.
Furthermore, the trend filtering estimate can be computed with practical and
scalable efficiency via a specialized convex optimization algorithm, e.g.
handling sample sizes of $n\gtrsim10^7$ within a few minutes. In a companion
paper, we explicitly demonstrate the broad utility of trend filtering to
observational astronomy by carrying out a diverse set of spectroscopic and
time-domain analyses.
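For orientation, the trend filtering estimate is usually written as the solution of a penalized least-squares problem. The display below uses the standard formulation for equally spaced inputs; the symbols (y for the observed signal, \beta for the fitted values, \gamma for the tuning parameter, D^{(k+1)} for the discrete difference operator of order k+1) are notation assumed here and do not appear in the abstract.

```latex
\hat{\beta} \;=\; \operatorname*{arg\,min}_{\beta \in \mathbb{R}^n}
  \; \frac{1}{2} \lVert y - \beta \rVert_2^2
  \; + \; \gamma \, \lVert D^{(k+1)} \beta \rVert_1 ,
  \qquad \gamma \ge 0 .
```

The \ell_1 penalty makes D^{(k+1)} \hat{\beta} sparse, so the fit is a piecewise polynomial of degree k (k = 0 piecewise constant, k = 1 piecewise linear, k = 2 piecewise quadratic) with knots chosen adaptively. Because of that penalty the estimate is not a linear function of y, which is what separates it from the linear smoothers listed in the abstract.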
[3]
oai:arXiv.org:2001.03552 [pdf] - 2034594
Trend Filtering -- II. Denoising Astronomical Signals with Varying
Degrees of Smoothness
Submitted: 2020-01-10
Trend filtering---first introduced into the astronomical literature in Paper
I of this series---is a state-of-the-art statistical tool for denoising
one-dimensional signals that possess varying degrees of smoothness. In this
work, we demonstrate the broad utility of trend filtering to observational
astronomy by discussing how it can contribute to a variety of spectroscopic and
time-domain studies. The observations we discuss are (1) the Lyman-$\alpha$
forest of quasar spectra; (2) more general spectroscopy of quasars, galaxies,
and stars; (3) stellar light curves with planetary transits; (4) eclipsing
binary light curves; and (5) supernova light curves. We study the
Lyman-$\alpha$ forest in the greatest detail---using trend filtering to map the
large-scale structure of the intergalactic medium along quasar-observer lines
of sight. The remaining studies share broad themes of: (1) estimating
observable parameters of light curves and spectra; and (2) constructing
observational spectral/light-curve templates. We also briefly discuss the
utility of trend filtering as a tool for one-dimensional data reduction and
compression.
[4]
oai:arXiv.org:1909.11714 [pdf] - 1969065
Realizing the potential of astrostatistics and astroinformatics
Eadie, Gwendolyn;
Loredo, Thomas J.;
Mahabal, Ashish A.;
Siemiginowska, Aneta;
Feigelson, Eric;
Ford, Eric B.;
Djorgovski, S. G.;
Graham, Matthew;
Ivezic, Zeljko;
Borne, Kirk;
Cisewski-Kehe, Jessi;
Peek, J. E. G.;
Schafer, Chad;
Yanamandra-Fisher, Padma A.;
Young, C. Alex
Submitted: 2019-09-25
This Astro2020 State of the Profession Consideration White Paper highlights
the growth of astrostatistics and astroinformatics in astronomy, identifies key
issues hampering the maturation of these new subfields, and makes
recommendations for structural improvements at different levels that, if acted
upon, will make significant positive impacts across astronomy.
[5]
oai:arXiv.org:1904.11306 [pdf] - 1873325
A Preferential Attachment Model for the Stellar Initial Mass Function
Submitted: 2019-04-25
Accurate specification of a likelihood function is becoming increasingly
difficult in many inference problems in astronomy. As sample sizes resulting
from astronomical surveys continue to grow, deficiencies in the likelihood
function lead to larger biases in key parameter estimates. These deficiencies
result from the oversimplification of the physical processes that generated the
data, and from the failure to account for observational limitations.
Unfortunately, realistic models often do not yield an analytical form for the
likelihood. The estimation of a stellar initial mass function (IMF) is an
important example. The stellar IMF is the mass distribution of stars initially
formed in a given cluster of stars, a population which is not directly
observable due to stellar evolution and other disruptions and observational
limitations of the cluster. There are several difficulties with specifying a
likelihood in this setting since the physical processes and observational
challenges result in measurable masses that cannot legitimately be considered
independent draws from an IMF. This work improves inference of the IMF by using
an approximate Bayesian computation approach that both accounts for
observational and astrophysical effects and incorporates a physically-motivated
model for star cluster formation. The methodology is illustrated via a
simulation study, demonstrating that the proposed approach can recover the true
posterior in realistic situations, and applied to observations from
astrophysical simulation data.
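The general idea of approximate Bayesian computation can be sketched with a basic rejection sampler: draw a parameter from the prior, simulate data from the forward model, and keep the draw when a summary of the simulated data is close to the observed summary. The example below is a generic illustration under strong simplifying assumptions (independent draws from a truncated power law, a uniform prior on the slope, and the mean log-mass as the only summary statistic). It does not implement the preferential attachment model or the observational effects described in the abstract, and every name in it is hypothetical.

```python
import math
import random

def sample_masses(alpha, n, m_min=0.1, m_max=100.0, rng=random):
    """Draw n masses from p(m) proportional to m**(-alpha) on [m_min, m_max].

    Uses the inverse CDF; requires alpha != 1.
    """
    a = 1.0 - alpha
    lo, hi = m_min ** a, m_max ** a
    return [(lo + rng.random() * (hi - lo)) ** (1.0 / a) for _ in range(n)]

def summary(masses):
    """Summary statistic: mean of log-mass."""
    return sum(math.log(m) for m in masses) / len(masses)

def abc_rejection(observed, n_draws=2000, eps=0.05, seed=0):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        alpha = rng.uniform(1.5, 3.5)                      # assumed uniform prior
        simulated = sample_masses(alpha, len(observed), rng=rng)
        if abs(summary(simulated) - s_obs) < eps:
            accepted.append(alpha)
    return accepted

# Synthetic "observations" generated with a known slope of 2.35.
observed = sample_masses(2.35, 500, rng=random.Random(1))
posterior_draws = abc_rejection(observed)
posterior_mean = sum(posterior_draws) / len(posterior_draws)
```

The accepted values approximate the posterior of the slope. The appeal of the approach is that only the simulator is needed, so effects that make the likelihood intractable can be added to the forward model without changing the sampler.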
[6]
oai:arXiv.org:1904.10065 [pdf] - 1894324
Modeling the Echelle Spectra Continuum with Alpha Shapes and Local
Regression Fitting
Submitted: 2019-04-22
Continuum normalization of echelle spectra is an important data analysis step
that is difficult to automate. Polynomial fitting requires a reasonably high
order model to follow the steep slope of the blaze function. However, in the
presence of deep spectral lines, a high order polynomial fit can result in
ripples in the normalized continuum that increase errors in spectral analysis.
Here, we present two algorithms for flattening the spectrum continuum. The
Alpha-shape Fitting to Spectrum algorithm (AFS) is completely data-driven,
using an alpha shape to obtain an initial estimate of the blaze function. The
Alpha-shape and Lab Source Fitting to Spectrum algorithm (ALSFS) incorporates a
continuum constraint from a lab source reference spectrum for the blaze
function estimation. These algorithms are tested on a simulated spectrum, where
we demonstrate improved normalization compared to polynomial regression for
continuum fitting. We show an additional application, using the algorithms for
mitigation of spatially correlated quantum efficiency variations and fringing
in the CCD detector of the EXtreme PREcision Spectrometer (EXPRES).
[7]
oai:arXiv.org:1811.08450 [pdf] - 1850614
Finding cosmic voids and filament loops using topological data analysis
Submitted: 2018-11-20, last modified: 2019-03-16
(abridged) We present the Significant Cosmic Holes in Universe (SCHU) method
for identifying cosmic voids and loops of filaments in cosmological datasets
and assigning their statistical significance using techniques from topological
data analysis. Persistent homology is used to find holes of different dimensions.
For dark matter halo catalogs and galaxy surveys, the 0-, 1-, and 2-dimensional
holes can be identified with clusters, loops of filaments, and voids. The
procedure overlays halos/galaxies on a 3D grid, and a distance-to-measure (DTM)
function is calculated at each point of the grid. A filtration is generated
over the lower-level sets of the DTM across increasing threshold values. The
filtered simplicial complex can be used to summarize the birth/death times of
the different dimension homology group generators (i.e., the holes).
Persistence diagrams are produced from the dimension and birth/death times of
each homology group generator. Using the persistence diagrams and bootstrap
sampling, we explain how $p$-values can be assigned to each homology group
generator. The homology group generators on a persistence diagram are not, in
general, uniquely located back in the original dataset volume so we propose a
method for finding a representation of the homology group generators. This
method provides a novel, statistically rigorous approach for locating
informative generators in cosmological datasets, which may be useful for
providing complementary cosmological constraints on the effects of, for
example, the sum of the neutrino masses. The method is tested on a Voronoi foam
simulation, and then applied to a subset of the SDSS galaxy survey and a
cosmological simulation. Lastly, we calculate Betti functions for two of the
MassiveNuS simulations and discuss implications for using the persistent
homology of the density field to help break degeneracy in the cosmological
parameters.
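The distance-to-measure step can be sketched as follows, using the common empirical definition: at a location x, the DTM with mass parameter m is the root mean squared distance from x to its k = ceil(m * n) nearest data points. The example is a two-dimensional toy with points on a ring, standing in for a loop of filaments; the function name, the value of m, and the toy data are assumptions, and the SCHU pipeline itself works on a 3D grid.

```python
import math

def dtm(point, data, m=0.1):
    """Empirical distance-to-measure of `point` with respect to `data`.

    Root mean squared distance to the k = ceil(m * n) nearest data points.
    """
    k = max(1, math.ceil(m * len(data)))
    squared = sorted(
        sum((p - q) ** 2 for p, q in zip(point, x)) for x in data
    )
    return math.sqrt(sum(squared[:k]) / k)

# Toy data: 40 points on the unit circle (a single loop).
ring = [
    (math.cos(2 * math.pi * i / 40), math.sin(2 * math.pi * i / 40))
    for i in range(40)
]

dtm_center = dtm((0.0, 0.0), ring)    # inside the hole: far from all points
dtm_on_ring = dtm((1.0, 0.0), ring)   # on the loop: close to several points

# Evaluate on a coarse grid; thresholding these values at increasing levels
# gives the lower-level sets that a filtration would be built from.
axis = [-1.5 + 0.25 * i for i in range(13)]
dtm_grid = [[dtm((x, y), ring) for x in axis] for y in axis]
```

At low thresholds only the neighbourhood of the ring is included, and the hole in the middle fills in once the threshold reaches the DTM value at the center. The gap between those two levels is what appears as a long-lived 1-dimensional feature in a persistence diagram.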
[8]
oai:arXiv.org:1903.06796 [pdf] - 1850859
Astro2020 Science White Paper: The Next Decade of Astroinformatics and
Astrostatistics
Siemiginowska, A.;
Eadie, G.;
Czekala, I.;
Feigelson, E.;
Ford, E. B.;
Kashyap, V.;
Kuhn, M.;
Loredo, T.;
Ntampaka, M.;
Stevens, A.;
Avelino, A.;
Borne, K.;
Budavari, T.;
Burkhart, B.;
Cisewski-Kehe, J.;
Civano, F.;
Chilingarian, I.;
van Dyk, D. A.;
Fabbiano, G.;
Finkbeiner, D. P.;
Foreman-Mackey, D.;
Freeman, P.;
Fruscione, A.;
Goodman, A. A.;
Graham, M.;
Guenther, H. M.;
Hakkila, J.;
Hernquist, L.;
Huppenkothen, D.;
James, D. J.;
Law, C.;
Lazio, J.;
Lee, T.;
López-Morales, M.;
Mahabal, A. A.;
Mandel, K.;
Meng, X. L.;
Moustakas, J.;
Muna, D.;
Peek, J. E. G.;
Richards, G.;
Portillo, S. K. N.;
Scargle, J.;
de Souza, R. S.;
Speagle, J. S.;
Stassun, K. G.;
Stenning, D. C.;
Taylor, S. R.;
Tremblay, G. R.;
Trimble, V.;
Yanamandra-Fisher, P. A.;
Young, C. A.
Submitted: 2019-03-15
Over the past century, major advances in astronomy and astrophysics have been
largely driven by improvements in instrumentation and data collection. With the
amassing of high quality data from new telescopes, and especially with the
advent of deep and large astronomical surveys, it is becoming clear that future
advances will also rely heavily on how those data are analyzed and interpreted.
New methodologies derived from advances in statistics, computer science, and
machine learning are beginning to be employed in sophisticated investigations
that are not only bringing forth new discoveries, but are placing them on a
solid footing. Progress in wide-field sky surveys, interferometric imaging,
precision cosmology, exoplanet detection and characterization, and many
subfields of stellar, Galactic and extragalactic astronomy, has resulted in
complex data analysis challenges that must be solved to perform scientific
inference. Research in astrostatistics and astroinformatics will be necessary
to develop the state-of-the-art methodology needed in astronomy. Overcoming
these challenges requires dedicated, interdisciplinary research. We recommend:
(1) increasing funding for interdisciplinary projects in astrostatistics and
astroinformatics; (2) dedicating space and time at conferences for
interdisciplinary research and promotion; (3) developing sustainable funding
for long-term astrostatistics appointments; and (4) funding infrastructure
development for data archives and archive support, state-of-the-art algorithms,
and efficient computing.
[9]
oai:arXiv.org:1902.10159 [pdf] - 1840221
The Role of Machine Learning in the Next Decade of Cosmology
Ntampaka, Michelle;
Avestruz, Camille;
Boada, Steven;
Caldeira, Joao;
Cisewski-Kehe, Jessi;
Di Stefano, Rosanne;
Dvorkin, Cora;
Evrard, August E.;
Farahi, Arya;
Finkbeiner, Doug;
Genel, Shy;
Goodman, Alyssa;
Goulding, Andy;
Ho, Shirley;
Kosowsky, Arthur;
La Plante, Paul;
Lanusse, Francois;
Lochner, Michelle;
Mandelbaum, Rachel;
Nagai, Daisuke;
Newman, Jeffrey A.;
Nord, Brian;
Peek, J. E. G.;
Peel, Austin;
Poczos, Barnabas;
Rau, Markus Michael;
Siemiginowska, Aneta;
Sutherland, Dougal J.;
Trac, Hy;
Wandelt, Benjamin
Submitted: 2019-02-26
In recent years, machine learning (ML) methods have remarkably improved how
cosmologists can interpret data. The next decade will bring new opportunities
for data-driven cosmological discovery, but will also present new challenges
for adopting ML methodologies and understanding the results. ML could transform
our field, but this transformation will require the astronomy community to both
foster and promote interdisciplinary research endeavors.
[10]
oai:arXiv.org:1811.12718 [pdf] - 1830537
Measuring precise radial velocities and cross-correlation function
line-profile variations using a Skew Normal density
Submitted: 2018-11-30
Stellar activity is one of the primary limitations to the detection of
low-mass exoplanets using the radial-velocity (RV) technique. We propose to
estimate the variations in shape of the CCF by fitting a Skew Normal (SN)
density which, unlike the commonly employed Normal density, includes a skewness
parameter to capture the asymmetry of the CCF induced by stellar activity and
the convective blueshift. The performance of the proposed method is compared
with that of the commonly employed Normal density using both simulations and real
observations, with different levels of activity and signal-to-noise ratio. When
considering real observations, the correlations between the RV and the asymmetry
of the CCF and between the RV and the width of the CCF are stronger when using
the parameters estimated with the SN density rather than the ones obtained with
the commonly employed Normal density. Using the proposed SN approach, the
uncertainties estimated on the RV defined as the median of the SN are on
average 10% smaller than the uncertainties calculated on the mean of the
Normal. The uncertainties estimated on the asymmetry parameter of the SN are on
average 15% smaller than the uncertainties measured on the Bisector Inverse
Slope Span (BIS SPAN), which is the commonly used parameter to evaluate the
asymmetry of the CCF. We also propose a new model to account for stellar
activity when fitting a planetary signal to RV data. Based on simple
simulations, we were able to demonstrate that this new model improves the
planetary detection limits by 12% compared to the model commonly used to
account for stellar activity. The SN density is a better model than the Normal
density for characterizing the CCF since the correlations used to probe stellar
activity are stronger and the uncertainties of the RV estimate and the
asymmetry of the CCF are both smaller.
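For reference, the Skew Normal density with location xi, scale omega, and skewness alpha is (2 / omega) * phi(z) * Phi(alpha * z) with z = (x - xi) / omega, where phi and Phi are the standard Normal density and distribution function. Setting alpha = 0 recovers the Normal density. The sketch below evaluates the density and finds its median numerically, since the abstract defines the RV as the median of the SN. It is an illustration only: the function names and the numerical integration scheme are assumptions, and fitting the density to an actual CCF is not shown.

```python
import math

def skew_normal_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    """Skew Normal density; alpha = 0 gives the Normal(xi, omega**2) density."""
    z = (x - xi) / omega
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / omega * phi * Phi

def skew_normal_median(xi=0.0, omega=1.0, alpha=0.0, half_width=10.0, n=20001):
    """Median found by trapezoidal accumulation of the density on a fine grid."""
    lo = xi - half_width * omega
    step = 2.0 * half_width * omega / (n - 1)
    total = 0.0
    prev = skew_normal_pdf(lo, xi, omega, alpha)
    for i in range(1, n):
        x = lo + i * step
        cur = skew_normal_pdf(x, xi, omega, alpha)
        new_total = total + 0.5 * (prev + cur) * step
        if new_total >= 0.5:
            # Interpolate linearly inside the cell where the CDF crosses 0.5.
            frac = (0.5 - total) / (new_total - total)
            return (x - step) + frac * step
        total, prev = new_total, cur
    raise ValueError("CDF did not reach 0.5; widen the integration range")

def skew_normal_mean(xi=0.0, omega=1.0, alpha=0.0):
    """Closed-form mean: xi + omega * delta * sqrt(2 / pi)."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    return xi + omega * delta * math.sqrt(2.0 / math.pi)

median_symmetric = skew_normal_median(alpha=0.0)   # Normal case
median_skewed = skew_normal_median(alpha=3.0)      # right-skewed case
mean_skewed = skew_normal_mean(alpha=3.0)
```

With alpha = 0 the median coincides with the location parameter, as for a Normal fit. With nonzero alpha the median, mean, and location separate, which is why the choice of RV summary matters once skewness is modeled.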
[11]
oai:arXiv.org:1809.06173 [pdf] - 1775689
Incorporating Uncertainties in Atomic Data Into the Analysis of Solar
and Stellar Observations: A Case Study in Fe XIII
Submitted: 2018-09-17
Information about the physical properties of astrophysical objects cannot be
measured directly but is inferred by interpreting spectroscopic observations in
the context of atomic physics calculations. Ratios of emission lines, for
example, can be used to infer the electron density of the emitting plasma.
Similarly, the relative intensities of emission lines formed over a wide range
of temperatures yield information on the temperature structure. A critical
component of this analysis is understanding how uncertainties in the underlying
atomic physics propagate to the uncertainties in the inferred plasma
parameters. At present, however, atomic physics databases do not include
uncertainties on the atomic parameters and there is no established methodology
for using them even if they did. In this paper we develop simple models for the
uncertainties in the collision strengths and decay rates for Fe XIII and apply
them to the interpretation of density sensitive lines observed with the EUV
Imaging Spectrometer (EIS) on Hinode. We incorporate these uncertainties in a
Bayesian framework. We consider both a pragmatic Bayesian method where the
atomic physics information is unaffected by the observed data, and a fully
Bayesian method where the data can be used to probe the physics. The former
generally increases the uncertainty in the inferred density by about a factor
of 5 compared with models that incorporate only statistical uncertainties. The
latter reduces the uncertainties on the inferred densities, but identifies
areas of possible systematic problems with either the atomic physics or the
observed intensities.
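The distinction between the two approaches can be written compactly. In the display below, \theta denotes the plasma parameters (such as the density), A the uncertain atomic data, and D the observed line intensities. This notation is assumed here rather than taken from the abstract, and the final proportionality additionally assumes independent priors on \theta and A.

```latex
% Pragmatic Bayesian: the atomic data keep their prior; D does not update A.
p_{\mathrm{prag}}(\theta, A \mid D) \;=\; p(\theta \mid A, D)\, p(A)

% Fully Bayesian: the data inform the atomic physics as well.
p_{\mathrm{full}}(\theta, A \mid D) \;=\; p(\theta \mid A, D)\, p(A \mid D)
  \;\propto\; p(D \mid \theta, A)\, p(\theta)\, p(A)
```

In the pragmatic case the spread of p(A) passes straight through to \theta, which is consistent with the larger uncertainties reported. In the fully Bayesian case p(A \mid D) can concentrate on the atomic data most compatible with the observations, which narrows the uncertainties and can also expose tension between the atomic physics and the measured intensities.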
[12]
oai:arXiv.org:1807.09273 [pdf] - 1722505
Statistical challenges in the search for dark matter
Algeri, Sara;
van Beekveld, Melissa;
Bozorgnia, Nassim;
Brooks, Alyson;
Casas, J. Alberto;
Cisewski-Kehe, Jessi;
Cyr-Racine, Francis-Yan;
Edwards, Thomas D. P.;
Iocco, Fabio;
Kavanagh, Bradley J.;
Mamužić, Judita;
Mishra-Sharma, Siddharth;
Rau, Wolfgang;
de Austri, Roberto Ruiz;
Safdi, Benjamin R.;
Scott, Pat;
Slatyer, Tracy R.;
Tsai, Yue-Lin Sming;
Vincent, Aaron C.;
Weniger, Christoph;
West, Jennifer Rittenhouse;
Wolpert, Robert L.
Submitted: 2018-07-24
The search for the particle nature of dark matter has given rise to a number
of experimental, theoretical and statistical challenges. Here, we report on a
number of these statistical challenges and new techniques to address them, as
discussed in the DMStat workshop held Feb 26 - Mar 3 2018 at the Banff
International Research Station for Mathematical Innovation and Discovery (BIRS)
in Banff, Alberta.