Normalized to: Babu, G.
[1]
oai:arXiv.org:1712.00356 [pdf] - 2103936
Some Optimizations on Detecting Gravitational Wave Using Convolutional
Neural Network
Submitted: 2017-12-01, last modified: 2020-05-29
This work investigates the problem of detecting gravitational wave (GW)
events based on simulated damped sinusoid signals contaminated with white
Gaussian noise. It is treated as a classification problem with one class for
the interesting events. The proposed scheme consists of the following two
successive steps: decomposing the data using a wavelet packet, representing the
GW signal and noise using the derived decomposition coefficients; and
determining the existence of any GW event using a convolutional neural network
(CNN) with a logistic regression output layer. The distinguishing feature of
this work is its comprehensive investigation of CNN structure, detection window
width, data resolution, wavelet packet decomposition and detection window
overlap scheme. Extensive simulation experiments show excellent performance for
reliable detection of signals with a range of GW model parameters and
signal-to-noise ratios. While we use a simple waveform model in this study, we
expect the method to be particularly valuable when the potential GW shapes are
too complex to be characterized with a template bank.
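As a rough illustration of the simulated data and overlapping detection windows described in this abstract, the Python sketch below generates a damped sinusoid, adds white Gaussian noise, and cuts the series into overlapping windows. The function names and all numerical values (sampling step, frequency, decay time, noise level, window width and step) are illustrative assumptions, not taken from the paper. The wavelet packet decomposition and the CNN are not shown.

```python
import math
import random


def damped_sinusoid(n, dt, freq_hz, tau_s, amplitude=1.0):
    """Return n samples of A * exp(-t / tau) * sin(2 * pi * f * t)."""
    samples = []
    for i in range(n):
        t = i * dt
        samples.append(
            amplitude * math.exp(-t / tau_s) * math.sin(2 * math.pi * freq_hz * t)
        )
    return samples


def add_white_noise(signal, sigma, rng):
    """Add independent zero-mean Gaussian noise with standard deviation sigma."""
    return [s + rng.gauss(0.0, sigma) for s in signal]


def sliding_windows(series, width, step):
    """Cut a series into windows of `width` samples, advancing by `step`.

    Choosing step < width makes consecutive windows overlap.
    """
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
clean = damped_sinusoid(n=1000, dt=1e-3, freq_hz=50.0, tau_s=0.1)
noisy = add_white_noise(clean, sigma=0.5, rng=rng)
windows = sliding_windows(noisy, width=256, step=128)  # 50% overlap
```

Each window would then be passed to the decomposition and classification stages. A step equal to half the window width is only one possible overlap scheme.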
[2]
oai:arXiv.org:2005.13025 [pdf] - 2102989
21st Century Statistical and Computational Challenges in Astrophysics
Submitted: 2020-05-26
Modern astronomy has been rapidly increasing our ability to see deeper into
the universe, acquiring enormous samples of cosmic populations. Gaining
astrophysical insights from these datasets requires a wide range of
sophisticated statistical and machine learning methods. Long-standing problems
in cosmology include characterization of galaxy clustering and estimation of
galaxy distances from photometric colors. Bayesian inference, central to
linking astronomical data to nonlinear astrophysical models, addresses problems
in solar physics, properties of star clusters, and exoplanet systems.
Likelihood-free methods are growing in importance. Detection of faint signals
in complicated noise is needed to find periodic behaviors in stars and detect
explosive gravitational wave events. Open issues concern treatment of
heteroscedastic measurement errors and understanding probability distributions
characterizing astrophysical systems. The field of astrostatistics needs
increased collaboration with statisticians in the design and analysis stages of
research projects, and to jointly develop new statistical methodologies.
Together, they will draw more astrophysical insights into astronomical
populations and the cosmos itself.
[3]
oai:arXiv.org:1911.02479 [pdf] - 1994455
Algorithms and Statistical Models for Scientific Discovery in the
Petabyte Era
Nord, Brian;
Connolly, Andrew J.;
Kinney, Jamie;
Kubica, Jeremy;
Narayan, Gautaum;
Peek, Joshua E. G.;
Schafer, Chad;
Tollerud, Erik J.;
Avestruz, Camille;
Babu, G. Jogesh;
Birrer, Simon;
Burke, Douglas;
Caldeira, João;
Caldwell, Douglas A.;
Carlberg, Joleen K.;
Chen, Yen-Chi;
Dong, Chuanfei;
Feigelson, Eric D.;
Golkhou, V. Zach;
Kashyap, Vinay;
Li, T. S.;
Loredo, Thomas;
Lucie-Smith, Luisa;
Mandel, Kaisey S.;
Martínez-Galarza, J. R.;
Miller, Adam A.;
Natarajan, Priyamvada;
Ntampaka, Michelle;
Ptak, Andy;
Rapetti, David;
Shamir, Lior;
Siemiginowska, Aneta;
Sipőcz, Brigitta M.;
Smith, Arfon M.;
Tran, Nhan;
Vilalta, Ricardo;
Walkowicz, Lucianne M.;
ZuHone, John
Submitted: 2019-11-04
The field of astronomy has arrived at a turning point in terms of size and
complexity of both datasets and scientific collaboration. Commensurately,
algorithms and statistical models have begun to adapt --- e.g., via the onset
of artificial intelligence --- which itself presents new challenges and
opportunities for growth. This white paper aims to offer guidance and ideas for
how we can evolve our technical and collaborative frameworks to promote
efficient algorithmic development and take advantage of opportunities for
scientific discovery in the petabyte era. We discuss challenges for discovery
in large and complex data sets; challenges and requirements for the next stage
of development of statistical methodologies and algorithmic tool sets; how we
might change our paradigms of collaboration and education; and the ethical
implications of scientists' contributions to widely applicable algorithms and
computational modeling. We start with six distinct recommendations that are
supported by the commentary following them. This white paper is related to a
larger corpus of effort that has taken place within and around the Petabytes to
Science Workshops (https://petabytestoscience.github.io/).
[4]
oai:arXiv.org:1905.09852 [pdf] - 1920949
AutoRegressive Planet Search: Application to the Kepler Mission
Submitted: 2019-05-23
The 4-year light curves of 156,717 stars observed with NASA's Kepler mission
are analyzed using the AutoRegressive Planet Search (ARPS) methodology
described by Caceres et al. (2019). The three stages of processing are: maximum
likelihood ARIMA modeling of the light curves to reduce stellar brightness
variations; constructing the Transit Comb Filter periodogram to identify
transit-like periodic dips in the ARIMA residuals; Random Forest classification
trained on Kepler Team confirmed planets using several dozen features from the
analysis. Orbital periods between 0.2 and 100 days are examined. The result is
a recovery of 76% of confirmed planets, 97% when period and transit depth
constraints are added. The classifier is then applied to the full Kepler
dataset; 1,004 previously noticed and 97 new stars have light curve criteria
consistent with the confirmed planets, after subjective vetting removes clear
False Alarms and False Positive cases. The 97 Kepler ARPS Candidate Transits
mostly have periods $P<10$ days; many are UltraShort Period hot planets with
radii $<1$% of the host star. Extensive tabular and graphical output from the
ARPS time series analysis is provided to assist in other research relating to
the Kepler sample.
[5]
oai:arXiv.org:1901.05116 [pdf] - 1920819
AutoRegressive Planet Search: Methodology
Submitted: 2019-01-15, last modified: 2019-05-14
The detection of periodic signals from transiting exoplanets is often impeded
by extraneous aperiodic photometric variability, either intrinsic to the star
or arising from the measurement process. Frequently, these variations are
autocorrelated wherein later flux values are correlated with previous ones. In
this work, we present the methodology of the AutoRegressive Planet Search (ARPS)
project which uses Autoregressive Integrated Moving Average (ARIMA) and related
statistical models that treat a wide variety of stochastic processes, as well
as nonstationarity, to improve detection of new planetary transits. Provided a
time series is evenly spaced or can be placed on an evenly spaced grid with
missing values, these low-dimensional parametric models can prove very
effective. We introduce a planet-search algorithm to detect periodic transits
in the residuals after the application of ARIMA models. Our matched-filter
algorithm, the Transit Comb Filter (TCF), is closely related to the traditional
Box-fitting Least Squares and provides an analogous periodogram. Finally, if a
previously identified or simulated sample of planets is available, selected
scalar features from different stages of the analysis -- the original light
curves, ARIMA fits, TCF periodograms, and folded light curves -- can be
collectively used with a multivariate classifier to identify promising
candidates while efficiently rejecting false alarms. We use Random Forests for
this task, in conjunction with Receiver Operating Characteristic (ROC) curves,
to define discovery criteria for new, high fidelity planetary candidates. The
ARPS methodology can be applied to both evenly spaced satellite light curves
and densely cadenced ground-based photometric surveys.
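The idea of searching model residuals for periodic dips can be sketched with a simplified phase-folding search. This is neither the Transit Comb Filter nor Box-fitting Least Squares. It is a minimal stand-in that folds the residuals at each trial period (in samples) and scores the period by how far the lowest phase bin falls below the overall mean. The function names and the synthetic test series are assumptions made for illustration.

```python
import random


def fold_score(residuals, period):
    """Fold at `period` samples and return (overall mean - lowest phase-bin mean)."""
    sums = [0.0] * period
    counts = [0] * period
    for i, r in enumerate(residuals):
        sums[i % period] += r
        counts[i % period] += 1
    overall = sum(residuals) / len(residuals)
    bin_means = [s / c for s, c in zip(sums, counts) if c > 0]
    return overall - min(bin_means)


def best_period(residuals, periods):
    """Return the trial period with the deepest folded dip."""
    return max(periods, key=lambda p: fold_score(residuals, p))


# Synthetic residuals: white noise plus a one-sample dip every 50 samples.
rng = random.Random(1)
series = [rng.gauss(0.0, 0.1) for _ in range(1000)]
for i in range(0, 1000, 50):
    series[i] -= 1.0

period = best_period(series, range(20, 81))
```

The trial range here deliberately excludes integer multiples of the true period, which would fold the dips equally well. A practical search would need to handle such harmonics as well as transits lasting more than one sample.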
[6]
oai:arXiv.org:1901.08003 [pdf] - 1820264
Autoregressive Times Series Methods for Time Domain Astronomy
Submitted: 2019-01-23
Celestial objects exhibit a wide range of variability in brightness at
different wavebands. Surprisingly, the most common method for characterizing
time series in statistics -- parametric autoregressive modeling -- is rarely
used to interpret astronomical light curves. We review standard ARMA, ARIMA and
ARFIMA (autoregressive fractionally integrated moving average) models that
treat short-memory autocorrelation, long-memory $1/f^\alpha$ `red noise', and
nonstationary trends. Though designed for evenly spaced time series, moderately
irregular cadences can be treated as evenly-spaced time series with missing
data. Fitting algorithms are efficient and software implementations are widely
available. We apply ARIMA models to light curves of four variable stars,
discussing their effectiveness for different temporal characteristics. A
variety of extensions to ARIMA are outlined, with emphasis on recently
developed continuous-time models like CARMA and CARFIMA designed for
irregularly spaced time series. Strengths and weaknesses of ARIMA-type modeling
for astronomical data analysis and astrophysical insights are reviewed.
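To make the notion of short-memory autocorrelation concrete, the sketch below simulates the simplest autoregressive model, AR(1), and recovers its coefficient from the lag-1 sample autocorrelation (the Yule-Walker estimate for this model). It is a toy example under assumed parameter values, not the fitting procedure used in the paper. A real analysis would use maximum likelihood ARIMA fitting from an established statistics package.

```python
import random


def simulate_ar1(n, phi, sigma, rng):
    """Simulate x[t] = phi * x[t-1] + e[t] with Gaussian innovations e[t]."""
    x = [0.0] * n
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.gauss(0.0, sigma)
    return x


def lag1_autocorr(x):
    """Lag-1 sample autocorrelation; for AR(1) this estimates phi."""
    mean = sum(x) / len(x)
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, len(x)))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def ar1_residuals(x, phi):
    """Residuals after removing the AR(1) prediction from each value."""
    mean = sum(x) / len(x)
    return [(x[t] - mean) - phi * (x[t - 1] - mean) for t in range(1, len(x))]


# Hypothetical values chosen for illustration.
rng = random.Random(42)
x = simulate_ar1(n=5000, phi=0.8, sigma=1.0, rng=rng)
phi_hat = lag1_autocorr(x)
residuals = ar1_residuals(x, phi_hat)
```

If the model is adequate, the residuals should be close to white noise, so their own lag-1 autocorrelation should be near zero.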
[7]
oai:arXiv.org:1302.0387 [pdf] - 1159458
VOStat: A Statistical Web Service for Astronomers
Submitted: 2013-02-02
VOStat is a Web service providing interactive statistical analysis of
astronomical tabular datasets. It is integrated into the suite of analysis and
visualization tools associated with the international Virtual Observatory (VO)
through the SAMP communication system. A user supplies VOStat with a dataset
extracted from the VO, or otherwise acquired, and chooses among $\sim 60$
statistical functions. These include data transformations, plots and summaries,
density estimation, one- and two-sample hypothesis tests, global and local
regressions, multivariate analysis and clustering, spatial analysis,
directional statistics, survival analysis (for censored data like upper
limits), and time series analysis. The statistical operations are performed
using the public domain R statistical software environment, including a
small fraction of its $>4000$ CRAN add-on packages. The purpose of VOStat
is to facilitate a wider range of statistical analyses than are commonly used
in astronomy, and to promote use of more advanced methodology in R and
CRAN.
[8]
oai:arXiv.org:1211.5602 [pdf] - 1158005
The Astrophysical Multimessenger Observatory Network (AMON)
Smith, M. W. E.;
Fox, D. B.;
Cowen, D. F.;
Mészáros, P.;
Tešić, G.;
Fixelle, J.;
Bartos, I.;
Sommers, P.;
Ashtekar, Abhay;
Babu, G. Jogesh;
Barthelmy, S. D.;
Coutu, S.;
DeYoung, T.;
Falcone, A. D.;
Finn, L. S.;
Gao, Shan;
Hashemi, B.;
Homeier, A.;
Márka, S.;
Owen, B. J.;
Taboada, I.
Submitted: 2012-11-23
We summarize the science opportunity, design elements, current and projected
partner observatories, and anticipated science returns of the Astrophysical
Multimessenger Observatory Network (AMON). AMON will link multiple current and
future high-energy, multimessenger, and follow-up observatories together into a
single network, enabling near real-time coincidence searches for multimessenger
astrophysical transients and their electromagnetic counterparts. Candidate and
high-confidence multimessenger transient events will be identified,
characterized, and distributed as AMON alerts within the network and to
interested external observers, leading to follow-up observations across the
electromagnetic spectrum. In this way, AMON aims to evoke the discovery of
multimessenger transients from within observatory subthreshold data streams and
facilitate the exploitation of these transients for purposes of astronomy and
fundamental physics. As a central hub of global multimessenger science, AMON
will also enable cross-collaboration analyses of archival datasets in search of
rare or exotic astrophysical phenomena.
[9]
oai:arXiv.org:1205.2064 [pdf] - 510411
Statistical Methods for Astronomy
Submitted: 2012-05-09
This review outlines concepts of mathematical statistics, elements of
probability theory, hypothesis tests and point estimation for use in the
analysis of modern astronomical data. Least squares, maximum likelihood, and
Bayesian approaches to statistical inference are treated. Resampling methods,
particularly the bootstrap, provide valuable procedures when distribution
functions of statistics are not known. Several approaches to model selection
and goodness of fit are considered. Applied statistics relevant to
astronomical research are briefly discussed: nonparametric methods for use when
little is known about the behavior of the astronomical populations or
processes; data smoothing with kernel density estimation and nonparametric
regression; unsupervised clustering and supervised classification procedures
for multivariate problems; survival analysis for astronomical datasets with
nondetections; time- and frequency-domain time series analysis for light
curves; and spatial statistics to interpret the spatial distributions of points
in low dimensions. Two types of resources are presented: about 40 recommended
texts and monographs in various fields of statistics, and the public domain R
software system for statistical analysis. Together with its $\sim 3500$ (and
growing) add-on CRAN packages, R implements a vast range of statistical
procedures in a coherent high-level language with advanced graphics.
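The bootstrap mentioned in this abstract can be illustrated with a short sketch: resample the data with replacement many times, recompute the statistic on each resample, and use the spread of those values as an estimate of the statistic's standard error. The review itself points to R. Python is used here only for illustration, and the function name and sample values are assumptions.

```python
import random
import statistics


def bootstrap_se(sample, statistic, n_resamples, rng):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=len(sample))
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


# Hypothetical data: 100 draws from a Gaussian with mean 10 and sigma 2.
rng = random.Random(7)
data = [rng.gauss(10.0, 2.0) for _ in range(100)]
se_median = bootstrap_se(data, statistics.median, n_resamples=500, rng=rng)
```

The median is a convenient example because its sampling distribution has no simple closed form for an unknown parent distribution, which is the situation where resampling is most useful.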
[10]
oai:arXiv.org:0908.4056 [pdf] - 27697
A statistical model for the relation between exoplanets and their host
stars
Submitted: 2009-08-27
A general model is proposed to explain the relation between the extrasolar
planets (or exoplanets) detected until June 2008 and the main characteristics
of their host stars through statistical techniques. The main goal is to
establish a mathematical relation among the set of variables that best
describe the physical characteristics of the host star and the planet itself.
The host star is characterized by its distance, age, effective temperature,
mass, metallicity, radius and magnitude. The exoplanet is described through its
physical parameters (radius and mass) and its orbital parameters (distance,
period, eccentricity, inclination and semi-major axis). As a first approach we
consider only the mass of the exoplanet to be determined by the
physical properties of its host star. The proposed model is then validated
through statistical analysis. Finally we discuss the categorical behavior of
the dependent variable through binary models.
[11]
oai:arXiv.org:astro-ph/0612707 [pdf] - 88045
Object detection in multi-epoch data
Submitted: 2006-12-22
In astronomy multiple images are frequently obtained at the same position of
the sky for follow-up co-addition as it helps one go deeper and look for
fainter objects. With large scale panchromatic synoptic surveys becoming more
common, image co-addition has become even more necessary as new observations
are compared with a co-added fiducial sky in real time. The standard
co-addition techniques have included straight averages, variance weighted
averages, medians etc. A more sophisticated nonlinear response chi-square
method is also used when it is known that the data are background noise limited
and the point spread function is homogenized in all channels. A more robust
object detection technique capable of detecting faint sources, even those not
seen at all epochs which will normally be smoothed out in traditional methods,
is described. The analysis at each pixel level is based on a formula similar to
Mahalanobis distance. The method does not depend on the point spread function.
[12]
oai:arXiv.org:astro-ph/0401404 [pdf] - 880699
Statistical Challenges in Modern Astronomy
Submitted: 2004-01-20
Despite centuries of close association, statistics and astronomy are
surprisingly distant today. Most observational astronomical research relies on
an inadequate toolbox of methodological tools. Yet the needs are substantial:
astronomy encounters sophisticated problems involving sampling theory, survival
analysis, multivariate classification and analysis, time series analysis,
wavelet analysis, spatial point processes, nonlinear regression, bootstrap
resampling and model selection. We review the recent resurgence of
astrostatistical research, and outline new challenges raised by the emerging
Virtual Observatory. Our essay ends with a list of research challenges and
infrastructure for astrostatistics in the coming decade.
[13]
oai:arXiv.org:astro-ph/9802085 [pdf] - 100256
Three types of gamma-ray bursts
Submitted: 1998-02-07
A multivariate analysis of gamma-ray burst (GRB) bulk properties is presented
to discriminate between distinct classes of GRBs. Several variables
representing burst duration, fluence and spectral hardness are considered. Two
multivariate clustering procedures are used on a sample of 797 bursts from the
Third BATSE Catalog: a nonparametric average linkage hierarchical agglomerative
clustering procedure validated with Wilks' $\Lambda^*$ and other MANOVA tests;
and a parametric maximum likelihood model-based clustering procedure assuming
multinormal populations calculated with the EM Algorithm and validated with the
Bayesian Information Criterion.
The two methods yield very similar results. The BATSE GRB population consists
of three classes with the following Duration/Fluence/Spectrum bulk properties:
Class I with long/bright/intermediate bursts, Class II with short/faint/hard
bursts, and Class III with intermediate/intermediate/soft bursts. One outlier
with poor data is also present. Classes I and II correspond to those reported
by Kouveliotou et al. (1993), but Class III is clearly defined here for the
first time.