Normalized to: Angryk, R.
[1]
oai:arXiv.org:2006.12224 [pdf] - 2119196
Machine Learning in Heliophysics and Space Weather Forecasting: A White
Paper of Findings and Recommendations
Nita, Gelu;
Georgoulis, Manolis;
Kitiashvili, Irina;
Sadykov, Viacheslav;
Camporeale, Enrico;
Kosovichev, Alexander;
Wang, Haimin;
Oria, Vincent;
Wang, Jason;
Angryk, Rafal;
Aydin, Berkay;
Ahmadzadeh, Azim;
Bai, Xiaoli;
Bastian, Timothy;
Boubrahimi, Soukaina Filali;
Chen, Bin;
Davey, Alisdair;
Fereira, Sheldon;
Fleishman, Gregory;
Gary, Dale;
Gerrard, Andrew;
Hellbourg, Gregory;
Herbert, Katherine;
Ireland, Jack;
Illarionov, Egor;
Kuroda, Natsuha;
Li, Qin;
Liu, Chang;
Liu, Yuexin;
Kim, Hyomin;
Kempton, Dustin;
Ma, Ruizhe;
Martens, Petrus;
McGranaghan, Ryan;
Semones, Edward;
Stefan, John;
Stejko, Andrey;
Collado-Vega, Yaireska;
Wang, Meiqi;
Xu, Yan;
Yu, Sijie
Submitted: 2020-06-22
The authors of this white paper met on 16-17 January 2020 at the New Jersey
Institute of Technology, Newark, NJ, for a 2-day workshop that brought together
a group of heliophysicists, data providers, expert modelers, and computer/data
scientists. Their objective was to discuss critical developments and prospects
of the application of machine and/or deep learning techniques for data
analysis, modeling and forecasting in Heliophysics, and to shape a strategy for
further developments in the field. The workshop combined a set of plenary
sessions featuring invited introductory talks interleaved with a set of open
discussion sessions. The outcome of the discussion is encapsulated in this
white paper, which also features a top-level list of recommendations agreed
upon by the participants.
[2]
oai:arXiv.org:1911.09061 [pdf] - 2001514
Challenges with Extreme Class-Imbalance and Temporal Coherence: A Study
on Solar Flare Data
Submitted: 2019-11-20
In analyses of rare events, regardless of the domain of application, the
class-imbalance issue is intrinsic. Although the challenges are known to data
experts, their explicit impact on the analysis and on the decisions made based
on the findings is often overlooked. This is particularly prevalent in
interdisciplinary research, where the theoretical aspects are sometimes
overshadowed by the challenges of the application. To showcase these
undesirable impacts, we conduct a series of experiments on a recently created
benchmark dataset named Space Weather ANalytics for Solar Flares (SWAN-SF). This
is a multivariate time series dataset of magnetic parameters of active regions.
As a remedy for the imbalance issue, we study the impact of data manipulation
(undersampling and oversampling) and model manipulation (using class weights).
Furthermore, we bring into focus the auto-correlation of time series that is
inherited from the use of a sliding window for monitoring flares' history.
Temporal coherence, as we call this phenomenon, invalidates the randomness
assumption, thus impacting all sampling practices including different
cross-validation techniques. We illustrate how failing to notice this concept
could give an artificial boost in the forecast performance and result in
misleading findings. Throughout this study we use a Support Vector Machine
as the classifier and the True Skill Statistic as the verification metric for
comparing experiments. We conclude our work by specifying the correct
practice in each case, and we hope that this study could benefit researchers in
other domains where time series of rare events are of interest.
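The two ideas in this abstract that are easiest to get wrong in practice are the verification metric and the sampling of temporally coherent data. The following is a minimal sketch, not the authors' code: it assumes binary labels (1 = flare, 0 = no flare) and a hypothetical per-sample group identifier (for example, an active-region number) shared by all sliding-window slices cut from the same source series. The names `true_skill_statistic`, `split_by_group`, and `group_ids` are illustrative only.

```python
def true_skill_statistic(y_true, y_pred):
    """Return TSS = TP/(TP+FN) - FP/(FP+TN) for binary labels (1 = flare)."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("y_true must contain both classes")
    return tp / (tp + fn) - fp / (fp + tn)


def split_by_group(group_ids, test_groups):
    """Return (train_idx, test_idx) so that no group straddles the split.

    group_ids[i] is a hypothetical identifier shared by all overlapping
    slices that come from the same source series. Keeping a whole group on
    one side of the split avoids leaking near-duplicate slices into the
    test set.
    """
    test_groups = set(test_groups)
    train_idx, test_idx = [], []
    for i, group in enumerate(group_ids):
        (test_idx if group in test_groups else train_idx).append(i)
    return train_idx, test_idx


# 5 flaring vs. 95 non-flaring samples, and a model that never predicts a flare.
y_true = [1] * 5 + [0] * 95
always_quiet = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, always_quiet)) / len(y_true)
tss = true_skill_statistic(y_true, always_quiet)
print(accuracy, tss)  # accuracy is 0.95, TSS is 0.0

# Slices from the same (hypothetical) active region stay together.
group_ids = ["AR1", "AR1", "AR2", "AR2", "AR3"]
train_idx, test_idx = split_by_group(group_ids, {"AR2"})
print(train_idx, test_idx)  # [0, 1, 4] [2, 3]
```

The toy example shows why accuracy is a poor metric under extreme imbalance: the trivial model scores 95% accuracy but has no skill according to TSS. The group-wise split is only one possible way to respect temporal coherence; a purely random split of overlapping slices would not.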
[3]
oai:arXiv.org:1912.02743 [pdf] - 2010164
Toward Filament Segmentation Using Deep Neural Networks
Submitted: 2019-11-20
We use a well-known deep neural network framework, called Mask R-CNN, for
identification of solar filaments in full-disk H-alpha images from Big Bear
Solar Observatory (BBSO). The image data, collected from BBSO's archive, are
integrated with the spatiotemporal metadata of filaments retrieved from the
Heliophysics Events Knowledgebase (HEK) system. This integrated data is then
treated as the ground-truth in the training process of the model. The available
spatial metadata are the output of a currently running filament-detection
module developed and maintained by the Feature Finding Team, an international
consortium selected by NASA. Despite the known challenges in the identification
and characterization of filaments by the existing module, which in turn are
inherited by any other module that intends to learn from such outputs, Mask
R-CNN shows promising results. Trained and validated on two years' worth of BBSO
data, this model is then tested on the three following years. Our case-by-case
and overall analyses show that Mask R-CNN can clearly compete with the existing
module and in some cases even perform better. Several cases of false positives
and false negatives that are correctly segmented by this model are also shown.
The overall advantages of using the proposed model are twofold. First, deep
neural networks' performance generally improves as more annotated data or
better annotations are provided. Second, such a model can be scaled up to
detect other solar events and to serve as a single multi-purpose module. The
results presented in this study provide a proof of concept of the benefits of
employing deep neural networks for detection of solar events and, in
particular, filaments.
[4]
oai:arXiv.org:1906.01062 [pdf] - 1925073
A Curated Image Parameter Dataset from Solar Dynamics Observatory
Mission
Submitted: 2019-06-03
We provide a large image parameter dataset extracted from the Solar Dynamics
Observatory (SDO) mission's AIA instrument, covering January 2011 through the
current date at a cadence of six minutes for nine wavelength channels. The
volume of the dataset for each year is just short of 1 TiB.
To achieve better results in the classification of active regions and coronal
holes, we improve upon the performance of a set of ten image parameters
through an in-depth evaluation of the assumptions that are necessary for
calculating these image parameters. Then, where possible, we devised a method
for finding appropriate settings for the parameter calculations, as well as a
validation task to show our improved results. In
addition, we include comparisons of JP2 and FITS image formats using supervised
classification models, by tuning the parameters specific to the format of the
images from which they are extracted, and specific to each wavelength. The
results of these comparisons show that utilizing JP2 images, which are
significantly smaller files, is not detrimental to the region classification
task that these parameters were originally intended for. Finally, we compute
the tuned parameters on the AIA images and provide a public API
(http://dmlab.cs.gsu.edu/dmlabapi) to access the dataset. This dataset can be
used in a range of studies on AIA images, such as content-based image retrieval
or tracking of solar events, where dimensionality reduction on the images is
necessary for feasibility of the tasks.
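To illustrate how image parameters can act as dimensionality reduction, here is a minimal sketch that splits an image into square grid cells and summarizes each cell with a few statistics. It is an assumption-laden illustration, not the dataset's actual pipeline: the abstract does not name the ten parameters, so mean, standard deviation, and entropy are chosen here only as plausible examples, and `cell_parameters`, `grid_parameters`, and the cell size are hypothetical.

```python
from math import log2, sqrt


def cell_parameters(cell):
    """Return (mean, std, entropy) for a flat list of integer pixel values."""
    n = len(cell)
    mean = sum(cell) / n
    std = sqrt(sum((p - mean) ** 2 for p in cell) / n)
    counts = {}
    for p in cell:
        counts[p] = counts.get(p, 0) + 1
    # Shannon entropy (in bits) of the cell's pixel-value histogram.
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())
    return mean, std, entropy


def grid_parameters(image, cell_size):
    """Summarize each cell_size x cell_size cell of a 2-D image.

    `image` is a list of equal-length rows. The result is a smaller 2-D grid
    whose entries are (mean, std, entropy) tuples, one per cell.
    """
    rows, cols = len(image), len(image[0])
    if rows % cell_size or cols % cell_size:
        raise ValueError("image dimensions must be multiples of cell_size")
    grid = []
    for r in range(0, rows, cell_size):
        grid_row = []
        for c in range(0, cols, cell_size):
            cell = [image[i][j]
                    for i in range(r, r + cell_size)
                    for j in range(c, c + cell_size)]
            grid_row.append(cell_parameters(cell))
        grid.append(grid_row)
    return grid


# A toy 4x4 "image" reduced to a 2x2 grid of parameter tuples.
image = [
    [0, 0, 10, 20],
    [0, 0, 30, 40],
    [5, 5, 7, 7],
    [5, 5, 9, 9],
]
params = grid_parameters(image, cell_size=2)
print(params[0][1])  # (25.0, std, 2.0) for the cell [10, 20, 30, 40]
```

Each cell collapses to a handful of numbers regardless of how many pixels it contains, which is what makes tasks such as content-based retrieval or event tracking tractable on large image archives.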
[5]
oai:arXiv.org:1810.08728 [pdf] - 1774812
Roadmap for Reliable Ensemble Forecasting of the Sun-Earth System
Nita, Gelu;
Angryk, Rafal;
Aydin, Berkay;
Banda, Juan;
Bastian, Tim;
Berger, Tom;
Bindi, Veronica;
Boucheron, Laura;
Cao, Wenda;
Christian, Eric;
de Nolfo, Georgia;
DeLuca, Edward;
DeRosa, Marc;
Downs, Cooper;
Fleishman, Gregory;
Fuentes, Olac;
Gary, Dale;
Hill, Frank;
Hoeksema, Todd;
Hu, Qiang;
Ilie, Raluca;
Ireland, Jack;
Kamalabadi, Farzad;
Korreck, Kelly;
Kosovichev, Alexander;
Lin, Jessica;
Lugaz, Noe;
Mannucci, Anthony;
Mansour, Nagi;
Martens, Petrus;
Mays, Leila;
McAteer, James;
McIntosh, Scott W.;
Oria, Vincent;
Pan, David;
Panesi, Marco;
Pesnell, W. Dean;
Pevtsov, Alexei;
Pillet, Valentin;
Rachmeler, Laurel;
Ridley, Aaron;
Scherliess, Ludger;
Toth, Gabor;
Velli, Marco;
White, Stephen;
Zhang, Jie;
Zou, Shasha
Submitted: 2018-10-19, last modified: 2018-10-29
The authors of this report met on 28-30 March 2018 at the New Jersey
Institute of Technology, Newark, New Jersey, for a 3-day workshop that brought
together a group of data providers, expert modelers, and computer and data
scientists in the solar discipline. Their objective was to identify challenges
in the path towards building an effective framework to achieve transformative
advances in the understanding and forecasting of the Sun-Earth system from the
upper convection zone of the Sun to the Earth's magnetosphere. The workshop
aimed to develop a research roadmap that targets the scientific challenge of
coupling observations and modeling with emerging data-science research to
extract knowledge from the large volumes of data (observed and simulated) while
stimulating computer science with new research applications. The desire among
the attendees was to promote future trans-disciplinary collaborations and
identify areas of convergence across disciplines. The workshop combined a set
of plenary sessions featuring invited introductory talks and workshop progress
reports, interleaved with a set of breakout sessions focused on specific topics
of interest. Each breakout group generated short documents, listing the
challenges identified during their discussions in addition to possible ways of
attacking them collectively. These documents were combined into this
report, wherein a list of prioritized activities has been collated, shared, and
endorsed.
[6]
oai:arXiv.org:1712.03998 [pdf] - 1602863
On the Prediction of >100 MeV Solar Energetic Particle Events Using GOES
Satellite Data
Submitted: 2017-12-11
Solar energetic particles are a result of intense solar events such as solar
flares and Coronal Mass Ejections (CMEs). Together, these events can cause
major disruptions to spacecraft in Earth orbit and outside the
magnetosphere. In this work we are interested in establishing the necessary
conditions for a major geo-effective solar particle storm immediately after a
major flare, namely the existence of a direct magnetic connection. To our
knowledge, this is the first work that explores not only the correlations of
GOES X-ray and proton channels, but also the correlations that happen across
all the proton channels. We found that proton-channel auto-correlations and
cross-correlations may also be precursors to the occurrence of an SEP event. In
this paper, we tackle the problem of predicting >100 MeV SEP events from a
multivariate time series perspective using easily interpretable decision tree
models.
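As a rough illustration of the kind of feature the abstract describes, the sketch below computes a lagged Pearson correlation between two time series. It is not the authors' feature-extraction code: `pearson`, `lagged_correlation`, and the toy series `xray` and `proton` are hypothetical, and real GOES channels would first need to be loaded and aligned on a common time grid.

```python
from math import sqrt


def pearson(x, y):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag] for a non-negative integer lag.

    Passing the same series as x and y gives its auto-correlation at that lag.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


# Toy series: the "proton" signal repeats the "xray" rise two steps later.
xray = [1, 2, 4, 8, 4, 2, 1, 1]
proton = [0, 0, 1, 2, 4, 8, 4, 2]

# Pick the lag (0 to 3 steps) at which the two series are most correlated.
best_lag = max(range(4), key=lambda k: lagged_correlation(xray, proton, k))
print(best_lag)  # 2
```

Correlations at several lags, computed per channel pair, could then serve as inputs to an interpretable model such as a decision tree; how the features were actually constructed is described in the paper itself.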
[7]
oai:arXiv.org:1712.01402 [pdf] - 1736244
Data Handling and Assimilation for Solar Event Prediction
Submitted: 2017-12-04
The prediction of solar flares, eruptions, and high energy particle storms is
of great societal importance. The data mining approach to forecasting has been
shown to be very promising. Benchmark datasets are a key element in the further
development of data-driven forecasting. With one or more benchmark datasets
established, judicious use of the data themselves and careful selection of
prediction algorithms are key to developing a high-quality and robust method
for the prediction of geo-effective solar activity. Here we briefly review the
process of generating benchmark datasets and developing prediction algorithms.