Full-text search for arXiv

Portillo, Stephen K. N.

Normalized to: Portillo, S.

11 article(s) in total. 66 co-authors, from 1 to 9 common article(s). Median position in authors list is 2.0.

[1] oai:arXiv.org:2002.10464 [pdf] - 2131872

Dimensionality Reduction of SDSS Spectra with Variational Autoencoders

Portillo, Stephen K. N.; Parejko, John K.; Vergara, Jorge R.; Connolly, Andrew J.

Comments: 22 pages, 17 figures, accepted to AJ; code available at https://github.com/stephenportillo/SDSS-VAE

Submitted: 2020-02-24, last modified: 2020-07-09

High resolution galaxy spectra contain much information about galactic physics, but the high dimensionality of these spectra makes it difficult to fully utilize the information they contain. We apply variational autoencoders (VAEs), a non-linear dimensionality reduction technique, to a sample of spectra from the Sloan Digital Sky Survey. In contrast to Principal Component Analysis (PCA), a widely used technique, VAEs can capture non-linear relationships between latent parameters and the data. We find that a VAE can reconstruct the SDSS spectra well with only six latent parameters, outperforming PCA with the same number of components. Different galaxy classes are naturally separated in this latent space, without class labels having been given to the VAE. The VAE latent space is interpretable because the VAE can be used to make synthetic spectra at any point in latent space. For example, making synthetic spectra along tracks in latent space yields sequences of realistic spectra that interpolate between two different types of galaxies. Using the latent space to find outliers may yield interesting spectra: in our small sample, we immediately find unusual data artifacts and stars misclassified as galaxies. In this exploratory work, we show that VAEs create compact, interpretable latent spaces that capture non-linear features of the data. While a VAE takes substantial time to train (~1 day for 48000 spectra), once trained, VAEs can enable the fast exploration of large astronomical data sets.
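For orientation, the PCA baseline that the VAE is compared against can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code (theirs is at the GitHub link above): the function name `pca_reconstruct` is hypothetical, and the data are random synthetic stand-ins rather than SDSS spectra.

```python
import numpy as np

def pca_reconstruct(spectra, n_components):
    """Project spectra onto the top principal components and reconstruct them."""
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]            # (n_components, n_pixels)
    latents = centered @ basis.T         # (n_spectra, n_components)
    return latents, latents @ basis + mean

# Synthetic stand-in data (NOT SDSS spectra): 200 "spectra" of 500 pixels,
# built from 6 hidden linear factors plus a little noise.
rng = np.random.default_rng(0)
true_latents = rng.normal(size=(200, 6))
mixing = rng.normal(size=(6, 500))
spectra = true_latents @ mixing + 0.1 * rng.normal(size=(200, 500))

latents, recon = pca_reconstruct(spectra, n_components=6)
mse_6 = np.mean((spectra - recon) ** 2)
mse_2 = np.mean((spectra - pca_reconstruct(spectra, 2)[1]) ** 2)
print(f"reconstruction MSE with 6 components: {mse_6:.4f}, with 2: {mse_2:.4f}")
```

Because PCA reconstructs each spectrum as a linear combination of fixed basis vectors, it can only capture linear structure; the abstract's claim is that a VAE with the same number of latent parameters does better on real spectra, where the structure is non-linear.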

[2] oai:arXiv.org:1907.04929 [pdf] - 2068905

Multiband Probabilistic Cataloging: A Joint Fitting Approach to Point Source Detection and Deblending

Feder, Richard M.; Portillo, Stephen K. N.; Daylan, Tansu; Finkbeiner, Douglas

Comments: 25 pages, 15 figures. Incorporated comments from referee process

Submitted: 2019-07-10, last modified: 2020-03-20

Probabilistic cataloging (PCAT) outperforms traditional cataloging methods on single-band optical data in crowded fields (Portillo et al. 2017). We extend our work to multiple bands, achieving greater sensitivity ($\sim$ 0.4 mag) and greater speed (500x) compared to previous single-band results. We demonstrate the effectiveness of multiband PCAT on mock data, both in terms of recovering accurate posteriors in the catalog space, and in directly deblending sources. When applied to Sloan Digital Sky Survey (SDSS) observations of M2, taking Hubble Space Telescope data as truth, our joint fit on $r$ and $i$ band data goes $\sim0.4$ mag deeper than single-band probabilistic cataloging and has a false discovery rate less than 20\% for F606W$\leq 20$. Compared to DAOPHOT, the two-band SDSS catalog fit goes nearly 1.5 magnitudes deeper using the same data, and maintains a lower false discovery rate down to F606W$\sim 20.5$. Given recent improvements in computational speed, multiband PCAT shows promise in application to large-scale surveys and is a plausible framework for joint analysis of multi-instrument observational data.

[3] oai:arXiv.org:1902.02374 [pdf] - 2068883

Photometric Biases in Modern Surveys

Portillo, Stephen K. N.; Speagle, Joshua S.; Finkbeiner, Douglas P.

Comments: 35 pages, 13 figures, accepted to AJ; code and data available online at https://github.com/joshspeagle/phot_bias

Submitted: 2019-02-06, last modified: 2020-02-14

Many surveys use maximum-likelihood (ML) methods to fit models when extracting photometry from images. We show these ML estimators systematically overestimate the flux as a function of the signal-to-noise ratio and the number of model parameters involved in the fit. This bias is substantially worse for resolved sources: while a 1% bias is expected for a 10$\sigma$ point source, a 10$\sigma$ resolved galaxy with a simplified Gaussian profile suffers a 2.5% bias. This bias also behaves differently depending on how multiple bands are used in the fit: simultaneously fitting all bands leads the flux bias to become roughly evenly distributed between them, while fixing the position in "non-detection" bands (i.e. forced photometry) gives flux estimates in those bands that are biased low, compounding a bias in derived colors. We show that these effects are present in idealized simulations, outputs from the Hyper Suprime-Cam fake object pipeline (SynPipe), and observations from Sloan Digital Sky Survey Stripe 82. Prescriptions to correct for the ML bias in flux, and its uncertainty, are provided.
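The effect can be illustrated with a toy Monte Carlo. The sketch below is not the authors' code (theirs is at the GitHub link above) and rests on simplifying assumptions: a circular Gaussian PSF, constant Gaussian pixel noise, and a grid search over position in place of a continuous optimizer. All names and parameter values are illustrative. It compares the flux recovered when the position is fitted with the flux recovered at the known position.

```python
import numpy as np

def gaussian_psf(size, x0, y0, sigma):
    """Unit-flux circular Gaussian PSF sampled at pixel centers."""
    y, x = np.mgrid[:size, :size]
    psf = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))
    return psf / (2 * np.pi * sigma ** 2)

rng = np.random.default_rng(1)
size, sigma, center = 25, 1.5, 12.0
n_trials, noise = 2000, 1.0

# Candidate positions: a grid within +/-1 pixel of the true position.
offsets = np.linspace(-1.0, 1.0, 21)
templates = np.array([gaussian_psf(size, center + dx, center + dy, sigma).ravel()
                      for dy in offsets for dx in offsets])
norms = np.sum(templates ** 2, axis=1)
i_true = len(templates) // 2             # grid point at zero offset

# Choose the true flux so that a fit at the known position has S/N = 10.
true_flux = 10.0 * noise / np.sqrt(norms[i_true])
images = (true_flux * templates[i_true]
          + noise * rng.normal(size=(n_trials, size * size)))

# For Gaussian noise, the ML flux at a fixed position is a matched filter.
flux_at = images @ templates.T / norms
# Maximizing the likelihood over position is equivalent to maximizing
# the matched-filter S/N (for a positive-flux source).
snr_at = flux_at * np.sqrt(norms) / noise
best = np.argmax(snr_at, axis=1)

flux_free = flux_at[np.arange(n_trials), best]   # position fitted
flux_fixed = flux_at[:, i_true]                  # position known

bias_free = flux_free.mean() / true_flux - 1.0
bias_fixed = flux_fixed.mean() / true_flux - 1.0
print(f"position fitted: {bias_free:+.2%}   position fixed: {bias_fixed:+.2%}")
```

The fitted-position flux is biased high because the fit selects whichever position makes the noisy image look brightest; at a fixed position the matched-filter estimate is unbiased, so any offset there is sampling noise only.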

[4] oai:arXiv.org:1903.06796 [pdf] - 1850859

Astro2020 Science White Paper: The Next Decade of Astroinformatics and Astrostatistics

Comments: Submitted to the Astro2020 Decadal Survey call for science white papers

Submitted: 2019-03-15

Over the past century, major advances in astronomy and astrophysics have been largely driven by improvements in instrumentation and data collection. With the amassing of high quality data from new telescopes, and especially with the advent of deep and large astronomical surveys, it is becoming clear that future advances will also rely heavily on how those data are analyzed and interpreted. New methodologies derived from advances in statistics, computer science, and machine learning are beginning to be employed in sophisticated investigations that are not only bringing forth new discoveries, but are placing them on a solid footing. Progress in wide-field sky surveys, interferometric imaging, precision cosmology, exoplanet detection and characterization, and many subfields of stellar, Galactic and extragalactic astronomy, has resulted in complex data analysis challenges that must be solved to perform scientific inference. Research in astrostatistics and astroinformatics will be necessary to develop the state-of-the-art methodology needed in astronomy. Overcoming these challenges requires dedicated, interdisciplinary research. We recommend: (1) increasing funding for interdisciplinary projects in astrostatistics and astroinformatics; (2) dedicating space and time at conferences for interdisciplinary research and promotion; (3) developing sustainable funding for long-term astrostatistics appointments; and (4) funding infrastructure development for data archives and archive support, state-of-the-art algorithms, and efficient computing.

[5] oai:arXiv.org:1803.08931 [pdf] - 1805993

Mapping Distances Across the Perseus Molecular Cloud Using CO Observations, Stellar Photometry, and Gaia DR2 Parallax Measurements

Zucker, Catherine; Schlafly, Edward F.; Speagle, Joshua S.; Green, Gregory M.; Portillo, Stephen K. N.; Finkbeiner, Douglas P.; Goodman, Alyssa A.

Comments: Accepted for publication in The Astrophysical Journal

Submitted: 2018-03-23, last modified: 2018-10-17

We present a new technique to determine distances to major star-forming regions across the Perseus Molecular Cloud, using a combination of stellar photometry, astrometric data, and $\rm ^{12} CO$ spectral-line maps. Incorporating the Gaia DR2 parallax measurements when available, we start by inferring the distance and reddening to stars from their Pan-STARRS1 and 2MASS photometry, based on a technique presented in Green et al. 2014; Green et al. 2015 and implemented in their 3D "Bayestar" dust map of three-quarters of the sky. We then refine the Green et al. technique by using the velocity slices of a CO spectral cube as dust templates and modeling the cumulative distribution of dust along the line of sight towards these stars as a linear combination of the emission in the slices. Using a nested sampling algorithm, we fit these per-star distance-reddening measurements to find the distances to the CO velocity slices towards each star-forming region. This results in distance estimates explicitly tied to the velocity structure of the molecular gas. We determine distances to the B5, IC348, B1, NGC1333, L1448, and L1451 star-forming regions and find that individual clouds are located between $\approx 275-300$ pc, with typical combined uncertainties of $\approx 5\%$. We find that the velocity gradient across Perseus corresponds to a distance gradient of about 25 pc, with the eastern portion of the cloud farther away than the western portion. We determine an average distance to the complex of $294\pm 17$ pc, about 60 pc higher than the distance derived to the western portion of the cloud using parallax measurements of water masers associated with young stellar objects. The method we present is not limited to the Perseus Complex, but may be applied anywhere on the sky with adequate CO data in the pursuit of more accurate 3D maps of molecular clouds in the solar neighborhood and beyond.

[6] oai:arXiv.org:1711.09907 [pdf] - 1759253

Developing the 3-Point Correlation Function For the Turbulent Interstellar Medium

Portillo, Stephen K. N.; Slepian, Zachary; Burkhart, Blakesley; Kahraman, Sule; Finkbeiner, Douglas P.

Comments: 19 pages, 16 figures; version accepted by ApJ

Submitted: 2017-11-27, last modified: 2018-10-01

We present the first application of the angle-dependent 3-Point Correlation Function (3PCF) to the density fields of magnetohydrodynamic (MHD) turbulence simulations intended to model interstellar medium (ISM) turbulence. Previous work has demonstrated that the angle-averaged bispectrum, the 3PCF's Fourier-space analog, is sensitive to the sonic and Alfv\'enic Mach numbers of turbulence. Here we show that introducing angular information via multipole moments with respect to the triangle opening angle offers considerable additional discriminatory power on these parameters. We exploit a fast, order $N_{\rm g} \log N_{\rm g}$ ($N_{\rm g}$ the number of grid cells used for a Fourier Transform) 3PCF algorithm to study a suite of MHD turbulence simulations with 10 different combinations of sonic and Alfv\'enic Mach numbers over a range from sub to super-sonic and sub to super-Alfv\'enic. The 3PCF algorithm's speed for the first time enables full quantification of the time-variation of our signal: we study 9 timeslices for each condition, demonstrating that the 3PCF is sufficiently time-stable to be used as an ISM diagnostic. In future, applying this framework to 3-D dust maps will enable better treatment of dust as a cosmological foreground as well as reveal conditions in the ISM that shape star formation.

[7] oai:arXiv.org:1710.01785 [pdf] - 1682476

Too hot to handle? Analytic solutions for massive neutrino or warm dark matter cosmologies

Slepian, Zachary; Portillo, Stephen K. N.

Comments: 13 pages, 7 figures, MNRAS submitted

Submitted: 2017-10-04

We obtain novel closed form solutions to the Friedmann equation for cosmological models containing a component whose equation of state is that of radiation $(w=1/3)$ at early times and that of cold pressureless matter $(w=0)$ at late times. The equation of state smoothly transitions from the early to late-time behavior and exactly describes the evolution of a species with a Dirac Delta function distribution in momentum magnitudes $|\vec{p}_0|$ (i.e. all particles have the same $|\vec{p}_0|$). Such a component, here termed "hot matter", is an approximate model for both neutrinos and warm dark matter. We consider it alone and in combination with cold matter and with radiation, also obtaining closed-form solutions for the growth of super-horizon perturbations in each case. The idealized model recovers $t(a)$ to better than $1.5\%$ accuracy for all $a$ relative to a Fermi-Dirac distribution (as describes neutrinos). We conclude by adding the second moment of the distribution to our exact solution and then generalizing to include all moments of an arbitrary momentum distribution in a closed form solution.
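As a sketch of why such a component interpolates between the two regimes, assume particles of mass $m$ (a symbol the abstract does not use) that all share the momentum magnitude $p_0 \equiv |\vec{p}_0|$ at $a = 1$, so that the physical momentum redshifts as $p_0/a$, in units with $c = 1$. These conventions are assumptions made here for illustration and may differ from the paper's. Standard kinetic theory then gives, for number density $n(a) \propto a^{-3}$,

$$\rho(a) = n(a)\,\sqrt{\frac{p_0^2}{a^2} + m^2}, \qquad P(a) = n(a)\,\frac{p_0^2/a^2}{3\sqrt{p_0^2/a^2 + m^2}},$$

$$w(a) \equiv \frac{P}{\rho} = \frac{1}{3}\,\frac{1}{1 + (a\,m/p_0)^2}.$$

For $a \ll p_0/m$ this gives $w \to 1/3$ (radiation), for $a \gg p_0/m$ it gives $w \to 0$ (cold matter), and the transition is smooth around $a \sim p_0/m$, matching the limiting behavior described in the abstract.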

[8] oai:arXiv.org:1703.01303 [pdf] - 1581748

Improved Point Source Detection in Crowded Fields using Probabilistic Cataloging

Portillo, Stephen K. N.; Lee, Benjamin C. G.; Daylan, Tansu; Finkbeiner, Douglas P.

Comments: 29 pages, 27 figures; changed prior on number of sources, added discussion on use of posterior sample catalogs in Section 4.2, added discussion of convergence in Section 5.4, expanded Introduction and Discussion, added cross-checks on assumed flux function index and PSF width in Appendices D and E; version to be published in the Astrophysical Journal

Submitted: 2017-03-03, last modified: 2017-08-07

Cataloging is challenging in crowded fields because sources are extremely covariant with their neighbors and blending makes even the number of sources ambiguous. We present the first optical probabilistic catalog, cataloging a crowded (~0.1 sources per pixel brighter than 22nd magnitude in F606W) Sloan Digital Sky Survey r band image from M2. Probabilistic cataloging returns an ensemble of catalogs inferred from the image and thus can capture source-source covariance and deblending ambiguities. By comparing to a traditional catalog of the same image and a Hubble Space Telescope catalog of the same region, we show that our catalog ensemble better recovers sources from the image. It goes more than a magnitude deeper than the traditional catalog while having a lower false discovery rate brighter than 20th magnitude. We also present an algorithm for reducing this catalog ensemble to a condensed catalog that is similar to a traditional catalog, except it explicitly marginalizes over source-source covariances and nuisance parameters. We show that this condensed catalog has a similar completeness and false discovery rate to the catalog ensemble. Future telescopes will be more sensitive, and thus more of their images will be crowded. Probabilistic cataloging performs better than existing software in crowded fields and so should be considered when creating photometric pipelines in the Large Synoptic Survey Telescope era.
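Completeness and false discovery rate against a reference catalog can be computed with a simple positional cross-match. The sketch below is a simplified illustration, not the paper's matching procedure: it assumes a match means "any counterpart within a fixed radius" and ignores magnitudes. The function name `match_rates`, the radius, and the coordinates are all hypothetical.

```python
import numpy as np

def match_rates(catalog_xy, truth_xy, radius):
    """Return (completeness, false discovery rate) for a positional cross-match."""
    # Pairwise distances between every catalog source and every truth source.
    diff = catalog_xy[:, None, :] - truth_xy[None, :, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    within = dist <= radius
    # Completeness: fraction of truth sources with a catalog counterpart.
    completeness = within.any(axis=0).mean()
    # False discovery rate: fraction of catalog sources with no truth counterpart.
    false_discovery_rate = 1.0 - within.any(axis=1).mean()
    return completeness, false_discovery_rate

# Made-up pixel coordinates: 4 truth sources and 3 cataloged sources.
truth = np.array([[1.0, 1.0], [5.0, 5.0], [9.0, 2.0], [3.0, 8.0]])
catalog = np.array([[1.1, 0.9], [5.2, 5.1], [7.0, 7.0]])
completeness, fdr = match_rates(catalog, truth, radius=0.5)
print(f"completeness = {completeness:.2f}, false discovery rate = {fdr:.2f}")
```

In this toy case two of the four truth sources are recovered (completeness 0.5) and one of the three cataloged sources has no counterpart (false discovery rate 1/3). For a catalog ensemble, the same rates could be averaged over the sampled catalogs.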

[9] oai:arXiv.org:1607.04637 [pdf] - 1641209

Inference of Unresolved Point Sources At High Galactic Latitudes Using Probabilistic Catalogs

Daylan, Tansu; Portillo, Stephen K. N.; Finkbeiner, Douglas P.

Comments:

Submitted: 2016-07-15, last modified: 2017-03-09

Detection of point sources in images is a fundamental operation in astrophysics, and is crucial for constraining population models of the underlying point sources or characterizing the background emission. Standard techniques fall short in the crowded-field limit, losing sensitivity to faint sources and failing to track their covariance with close neighbors. We construct a Bayesian framework to perform inference of faint or overlapping point sources. The method involves probabilistic cataloging, where samples are taken from the posterior probability distribution of catalogs consistent with an observed photon count map. In order to validate our method we sample random catalogs of the gamma-ray sky in the direction of the North Galactic Pole (NGP) by binning the data in energy and Point Spread Function (PSF) classes. Using three energy bins spanning $0.3 - 1$, $1 - 3$ and $3 - 10$ GeV, we identify $270\substack{+30 \\ -10}$ point sources inside a $40^\circ \times 40^\circ$ region around the NGP above our point-source inclusion limit of $3 \times 10^{-11}$/cm$^2$/s/sr/GeV at the $1-3$ GeV energy bin. Modeling the flux distribution as a power law, we infer the slope to be $-1.92\substack{+0.07 \\ -0.05}$ and estimate the contribution of point sources to the total emission as $18\substack{+2 \\ -2}$\%. These uncertainties in the flux distribution are fully marginalized over the number as well as the spatial and spectral properties of the unresolved point sources. This marginalization allows a robust test of whether the apparently isotropic emission in an image is due to unresolved point sources or of truly diffuse origin.

[10] oai:arXiv.org:1402.6703 [pdf] - 1362535

The Characterization of the Gamma-Ray Signal from the Central Milky Way: A Compelling Case for Annihilating Dark Matter

Daylan, Tansu; Finkbeiner, Douglas P.; Hooper, Dan; Linden, Tim; Portillo, Stephen K. N.; Rodd, Nicholas L.; Slatyer, Tracy R.

Comments: 30 pages, 34 figures

Submitted: 2014-02-26, last modified: 2015-03-17

Past studies have identified a spatially extended excess of $\sim$1-3 GeV gamma rays from the region surrounding the Galactic Center, consistent with the emission expected from annihilating dark matter. We revisit and scrutinize this signal with the intention of further constraining its characteristics and origin. By applying cuts to the \textit{Fermi} event parameter CTBCORE, we suppress the tails of the point spread function and generate high resolution gamma-ray maps, enabling us to more easily separate the various gamma-ray components. Within these maps, we find the GeV excess to be robust and highly statistically significant, with a spectrum, angular distribution, and overall normalization that is in good agreement with that predicted by simple annihilating dark matter models. For example, the signal is very well fit by a 36-51 GeV dark matter particle annihilating to $b\bar{b}$ with an annihilation cross section of $\sigma v = (1-3)\times 10^{-26}$ cm$^3$/s (normalized to a local dark matter density of 0.4 GeV/cm$^3$). Furthermore, we confirm that the angular distribution of the excess is approximately spherically symmetric and centered around the dynamical center of the Milky Way (within $\sim$$0.05^{\circ}$ of Sgr A$^*$), showing no sign of elongation along the Galactic Plane. The signal is observed to extend to at least $\simeq10^{\circ}$ from the Galactic Center, disfavoring the possibility that this emission originates from millisecond pulsars.

[11] oai:arXiv.org:1406.0507 [pdf] - 893668

Sharper Fermi LAT Images: instrument response functions for an improved event selection

Portillo, Stephen K. N.; Finkbeiner, Douglas P.

Comments: 6 pages, 6 figures; extended PSF analysis down to 100 MeV, typos corrected; to be published in The Astrophysical Journal

Submitted: 2014-06-02, last modified: 2014-09-28

The Large Area Telescope on the Fermi Gamma-ray Space Telescope has a point spread function with large tails, consisting of events affected by tracker inefficiencies, inactive volumes, and hard scattering; these tails can make source confusion a limiting factor. The parameter CTBCORE, provided in the publicly available Extended Fermi LAT data, estimates the quality of each event's direction reconstruction; by implementing a cut in this parameter, the tails of the point spread function can be suppressed at the cost of losing effective area. We implement cuts on CTBCORE and present updated instrument response functions derived from the Fermi LAT data itself, along with all-sky maps generated with these cuts. Having shown the effectiveness of these cuts, especially at low energies, we encourage their use in analyses where angular resolution is more important than Poisson noise.