Full-text search for arXiv

Turmon, M.

Normalized to: Turmon, M.

11 article(s) in total. 31 co-authors, from 1 to 9 common article(s). Median position in authors list is 7.0.

[1] oai:arXiv.org:1601.04385 [pdf] - 1342128

Real-Time Data Mining of Massive Data Streams from Synoptic Sky Surveys

Djorgovski, S. G.; Graham, M. J.; Donalek, C.; Mahabal, A. A.; Drake, A. J.; Turmon, M.; Fuchs, T.

Comments: 14 pages, an invited paper for a special issue of Future Generation Computer Systems, Elsevier Publ. (2015). This is an expanded version of a paper arXiv:1407.3502 presented at the IEEE e-Science 2014 conf., with some new content

Submitted: 2016-01-17

The nature of scientific and technological data collection is evolving rapidly: data volumes and rates grow exponentially, with increasing complexity and information content, and there has been a transition from static data sets to data streams that must be analyzed in real time. Interesting or anomalous phenomena must be quickly characterized and followed up with additional measurements via optimal deployment of limited assets. Modern astronomy presents a variety of such phenomena in the form of transient events in digital synoptic sky surveys, including cosmic explosions (supernovae, gamma ray bursts), relativistic phenomena (black hole formation, jets), potentially hazardous asteroids, etc. We have been developing a set of machine learning tools to detect, classify and plan a response to transient events for astronomy applications, using the Catalina Real-time Transient Survey (CRTS) as a scientific and methodological testbed. The ability to respond rapidly to the potentially most interesting events is a key bottleneck that limits the scientific returns from the current and anticipated synoptic sky surveys. Similar challenges arise in other contexts, from environmental monitoring using sensor networks to autonomous spacecraft systems. Given the exponential growth of data rates, and the time-critical response, we need a fully automated and robust approach. We describe the results obtained to date, and the possible future developments.

[2] oai:arXiv.org:1407.3502 [pdf] - 1515683

Automated Real-Time Classification and Decision Making in Massive Data Streams from Synoptic Sky Surveys

Djorgovski, S. G.; Mahabal, A. A.; Donalek, C.; Graham, M. J.; Drake, A. J.; Turmon, M.; Fuchs, T.

Comments: 8 pages, IEEE conference format, to appear in the refereed proceedings of the IEEE e-Science 2014 conf., eds. C. Medeiros et al., IEEE, in press (2014). arXiv admin note: substantial text overlap with arXiv:1209.1681, arXiv:1110.4655

Submitted: 2014-07-13

[3] oai:arXiv.org:1404.1879 [pdf] - 806920

The Helioseismic and Magnetic Imager (HMI) Vector Magnetic Field Pipeline: SHARPs -- Space-weather HMI Active Region Patches

Bobra, Monica G.; Sun, Xudong; Hoeksema, J. Todd; Turmon, Michael J.; Liu, Yang; Hayashi, Keiji; Barnes, Graham; Leka, K. D.

Comments: 27 pages, 7 figures. Accepted to Solar Physics

Submitted: 2014-04-07

A new data product from the Helioseismic and Magnetic Imager (HMI) onboard the Solar Dynamics Observatory (SDO) called Space-weather HMI Active Region Patches (SHARPs) is now available. SDO/HMI is the first space-based instrument to map the full-disk photospheric vector magnetic field with high cadence and continuity. The SHARP data series provide maps in patches that encompass automatically tracked magnetic concentrations for their entire lifetime; map quantities include the photospheric vector magnetic field and its uncertainty, along with Doppler velocity, continuum intensity, and line-of-sight magnetic field. Furthermore, keywords in the SHARP data series provide several parameters that concisely characterize the magnetic-field distribution and its deviation from a potential-field configuration. These indices may be useful for active-region event forecasting and for identifying regions of interest. The indices are calculated per patch and are available on a twelve-minute cadence. Quick-look data are available within approximately three hours of observation; definitive science products are produced approximately five weeks later. SHARP data are available at http://jsoc.stanford.edu and maps are available in either of two different coordinate systems. This article describes the SHARP data products and presents examples of SHARP data and parameters.

[4] oai:arXiv.org:1404.1881 [pdf] - 806922

The Helioseismic and Magnetic Imager (HMI) Vector Magnetic Field Pipeline: Overview and Performance

Hoeksema, J. Todd; Liu, Yang; Hayashi, Keiji; Sun, Xudong; Schou, Jesper; Couvidat, Sebastien; Norton, Aimee; Bobra, Monica; Centeno, Rebecca; Leka, K. D.; Barnes, Graham; Turmon, Michael J.

Comments: 42 pages, 19 figures, accepted to Solar Physics

Submitted: 2014-04-07

The Helioseismic and Magnetic Imager (HMI) began near-continuous full-disk solar measurements on 1 May 2010 from the Solar Dynamics Observatory (SDO). An automated processing pipeline keeps pace with observations to produce observable quantities, including the photospheric vector magnetic field, from sequences of filtergrams. The primary 720s observables were released in mid 2010, including Stokes polarization parameters measured at six wavelengths as well as intensity, Doppler velocity, and the line-of-sight magnetic field. More advanced products, including the full vector magnetic field, are now available. Automatically identified HMI Active Region Patches (HARPs) track the location and shape of magnetic regions throughout their lifetime. The vector field is computed using the Very Fast Inversion of the Stokes Vector (VFISV) code optimized for the HMI pipeline; the remaining 180 degree azimuth ambiguity is resolved with the Minimum Energy (ME0) code. The Milne-Eddington inversion is performed on all full-disk HMI observations. The disambiguation, until recently run only on HARP regions, is now implemented for the full disk. Vector and scalar quantities in the patches are used to derive active region indices potentially useful for forecasting; the data maps and indices are collected in the SHARP data series, hmi.sharp_720s. Patches are provided in both CCD and heliographic coordinates. HMI provides continuous coverage of the vector field, but has modest spatial, spectral, and temporal resolution. Coupled with limitations of the analysis and interpretation techniques, effects of the orbital velocity, and instrument performance, the resulting measurements have a certain dynamic range and sensitivity and are subject to systematic errors and uncertainties that are characterized in this report.

[5] oai:arXiv.org:1310.1976 [pdf] - 1516225

Feature Selection Strategies for Classifying High Dimensional Astronomical Data Sets

Donalek, Ciro; A., Arun Kumar; Djorgovski, S. G.; Mahabal, Ashish A.; Graham, Matthew J.; Fuchs, Thomas J.; Turmon, Michael J.; Philip, N. Sajeeth; Yang, Michael Ting-Chang; Longo, Giuseppe

Comments: 7 pages, to appear in refereed proceedings of Scalable Machine Learning: Theory and Applications, IEEE BigData 2013

Submitted: 2013-10-07

The amount of collected data in many scientific fields is increasing, and all of these fields share a common task: extracting knowledge from massive, multi-parametric data sets as rapidly and efficiently as possible. This is especially true in astronomy, where synoptic sky surveys are enabling new research frontiers in time-domain astronomy and posing several new object classification challenges in multi-dimensional spaces; given the high number of parameters available for each object, feature selection is quickly becoming a crucial task in analyzing astronomical data sets. Using data sets extracted from the ongoing Catalina Real-Time Transient Survey (CRTS) and the Kepler Mission, we illustrate a variety of feature selection strategies used to identify the subsets that give the most information, and the results achieved applying these techniques to three major astronomical problems.

[6] oai:arXiv.org:1209.1681 [pdf] - 1515667

Flashes in a Star Stream: Automated Classification of Astronomical Transient Events

Djorgovski, S. G.; Mahabal, A. A.; Donalek, C.; Graham, M. J.; Drake, A. J.; Moghaddam, B.; Turmon, M.

Comments: 8 pages, to appear in refereed proceedings of the IEEE eScience 2012 conference, October 2012, IEEE Press

Submitted: 2012-09-07

An automated, rapid classification of transient events detected in the modern synoptic sky surveys is essential for their scientific utility and effective follow-up using scarce resources. This presents some unusual challenges: the data are sparse, heterogeneous and incomplete; evolving in time; and most of the relevant information comes not from the data stream itself, but from a variety of archival data and contextual information (spatial, temporal, and multi-wavelength). We are exploring a variety of novel techniques, mostly Bayesian, to respond to these challenges, using the ongoing CRTS sky survey as a testbed. The current surveys are already overwhelming our ability to effectively follow all of the potentially interesting events, and these challenges will grow by orders of magnitude over the next decade as the more ambitious sky surveys get under way. While we focus on an application in a specific domain (astrophysics), these challenges are more broadly relevant for event or anomaly detection and knowledge discovery in massive data streams.

[7] oai:arXiv.org:1111.3699 [pdf] - 1091687

Real Time Classification of Transient Events in Synoptic Sky Surveys

Mahabal, Ashish A.; Donalek, C.; Djorgovski, S. G.; Drake, A. J.; Graham, M. J.; Williams, R.; Chen, Y.; Moghaddam, B.; Turmon, M.

Comments: 3 pages, to appear in Proc. IAU 285, "New Horizons in Transient Astronomy", Oxford, Sept. 2011

Submitted: 2011-11-15

An automated, rapid classification of transient events detected in the modern synoptic sky surveys is essential for their scientific utility and effective follow-up using scarce resources. This problem will grow by orders of magnitude with the next generation of surveys. We are exploring a variety of novel automated classification techniques, mostly Bayesian, to respond to these challenges, using the ongoing CRTS sky survey as a testbed. We describe briefly some of the methods used.

[8] oai:arXiv.org:1111.0313 [pdf] - 433619

Discovery, classification, and scientific exploration of transient events from the Catalina Real-time Transient Survey

Mahabal, A. A.; Djorgovski, S. G.; Drake, A. J.; Donalek, C.; Graham, M. J.; Williams, R. D.; Chen, Y.; Moghaddam, B.; Turmon, M.; Beshore, E.; Larson, S.

Comments: 22 pages, 12 figures, invited review for the Bulletin of the Astronomical Society of India

Submitted: 2011-11-01

Exploration of the time domain - variable and transient objects and phenomena - is rapidly becoming a vibrant research frontier, touching on essentially every field of astronomy and astrophysics, from the Solar system to cosmology. Time domain astronomy is being enabled by the advent of the new generation of synoptic sky surveys that cover large areas of the sky repeatedly and generate massive data streams. Their scientific exploration poses many challenges, driven mainly by the need for real-time discovery, classification, and follow-up of the interesting events. Here we describe the Catalina Real-Time Transient Survey (CRTS), which discovers and publishes transient events at optical wavelengths in real time, thus benefiting the entire community. We describe some of the scientific results to date, and then focus on the challenges of the automated classification and prioritization of transient events. CRTS represents a scientific and a technological testbed and precursor for the larger surveys in the future, including the Large Synoptic Survey Telescope (LSST) and the Square Kilometer Array (SKA).

[9] oai:arXiv.org:1110.4655 [pdf] - 428693

Towards an Automated Classification of Transient Events in Synoptic Sky Surveys

Djorgovski, S. G.; Donalek, C.; Mahabal, A.; Moghaddam, B.; Turmon, M.; Graham, M.; Drake, A.; Sharma, N.; Chen, Y.

Comments: Invited paper, 15 pages, to appear in Statistical Analysis and Data Mining (ASA journal), ref. proc. CIDU 2011 conf., eds. A. Srivastava & N. Chawla, in press (2011)

Submitted: 2011-10-20

We describe the development of a system for an automated, iterative, real-time classification of transient events discovered in synoptic sky surveys. The system under development incorporates a number of Machine Learning techniques, mostly using Bayesian approaches, due to the sparse nature, heterogeneity, and variable incompleteness of the available data. The classifications are improved iteratively as new measurements are obtained. One novel feature is the development of an automated follow-up recommendation engine that suggests those measurements that would be the most advantageous in terms of resolving classification ambiguities and/or characterization of the astrophysically most interesting objects, given a set of available follow-up assets and their cost functions. This illustrates the symbiotic relationship of astronomy and applied computer science through the emerging discipline of AstroInformatics.

[10] oai:arXiv.org:0810.4527 [pdf] - 17807

Towards Real-time Classification of Astronomical Transients

Mahabal, A.; Djorgovski, S. G.; Williams, R.; Drake, A.; Donalek, C.; Graham, M.; Moghaddam, B.; Turmon, M.; Jewell, J.; Khosla, A.; Hensley, B.

Comments: 7 pages, 3 figures, to appear in proceedings of the Class2008 conference (Classification and Discovery in Large Astronomical Surveys, Ringberg Castle, 14-17 October 2008)

Submitted: 2008-10-24

Exploration of the time domain is now a vibrant area of research in astronomy, driven by the advent of digital synoptic sky surveys. While panoramic surveys can detect variable or transient events, typically some follow-up observations are needed; for short-lived phenomena, a rapid response is essential. The ability to automatically classify and prioritize transient events for follow-up studies becomes critical as the data rates increase. We have been developing such methods using the data streams from the Palomar-Quest survey, the Catalina Sky Survey and others, using the VOEventNet framework. The goal is to automatically classify transient events, using the new measurements, combined with archival data (previous and multi-wavelength measurements), and contextual information (e.g., Galactic or ecliptic latitude, presence of a possible host galaxy nearby, etc.); and to iterate them dynamically as the follow-up data come in (e.g., light curves or colors). We have been investigating Bayesian methodologies for classification, as well as discriminated follow-up to optimize the use of available resources, including a Naive Bayesian approach and non-parametric Gaussian process regression. We will also be deploying variants of the traditional machine learning techniques such as Neural Nets and Support Vector Machines on datasets of reliably classified transients as they build up.

[11] oai:arXiv.org:0802.3199 [pdf] - 1937497

Automated Probabilistic Classification of Transients and Variables

Mahabal, A. A.; Djorgovski, S. G.; Turmon, M.; Jewell, J.; Williams, R. R.; Drake, A. J.; Graham, M. G.; Donalek, C.; Glikman, E.

Comments: Latex, 4 pages, 3 figures, macros included. To appear in refereed proceedings of "Hotwiring the Transient Universe 2007", eds. A. Allan, R. Seaman, and J. Bloom, Astron. Nachr. vol. 329, March, 2008

Submitted: 2008-02-21

There is an increasing number of large, digital, synoptic sky surveys, in which repeated observations are obtained over large areas of the sky in multiple epochs. Likewise, there is a growth in the number of (often automated or robotic) follow-up facilities with varied capabilities in terms of instruments, depth, cadence, wavelengths, etc., most of which are geared toward some specific astrophysical phenomenon. As the number of detected transient events grows, an automated, probabilistic classification of the detected variables and transients becomes increasingly important, so that an optimal use can be made of follow-up facilities, without unnecessary duplication of effort. We describe a methodology now under development for a prototype event classification system; it involves Bayesian and Machine Learning classifiers, automated incorporation of feedback from follow-up observations, and discriminated or directed follow-up requests. This type of methodology may be essential for the massive synoptic sky surveys in the future.