Groth, Paul

Normalized to: Groth, P.

4 articles in total. 22 co-authors, each sharing 1 to 3 articles. Median position in the author list is 4.0.

[1]  oai:arXiv.org:1401.2134  [pdf] - 1202657
10 Simple Rules for the Care and Feeding of Scientific Data
Comments: Accepted in PLOS Computational Biology. This paper was written collaboratively, on the web, in the open, using Authorea. The living version of this article, which includes sources and history, is available at http://www.authorea.com/3410/
Submitted: 2014-01-09
This article offers a short guide to the steps scientists can take to ensure that their data and associated analyses continue to be of value and to be recognized. In just the past few years, hundreds of scholarly papers and reports have been written on questions of data sharing, data provenance, research reproducibility, licensing, attribution, privacy, and more, but our goal here is not to review that literature. Instead, we present a short guide intended for researchers who want to know why it is important to "care for and feed" data, with some practical advice on how to do that.
[2]  oai:arXiv.org:1006.4860  [pdf] - 1033307
The Application of Cloud Computing to the Creation of Image Mosaics and Management of Their Provenance
Comments: 15 pages, 3 figures
Submitted: 2010-06-24
We have used the Montage image mosaic engine to investigate the cost and performance of processing images on the Amazon EC2 cloud, and to inform the requirements that higher-level products impose on provenance management technologies. We will present a detailed comparison of the performance of Montage on the cloud and on the Abe high performance cluster at the National Center for Supercomputing Applications (NCSA). Because Montage generates many intermediate products, we have used it to understand the science requirements that higher-level products impose on provenance management technologies. We describe experiments with provenance management technologies such as the "Provenance Aware Service Oriented Architecture" (PASOA).
[3]  oai:arXiv.org:1005.4457  [pdf] - 170900
Pipeline-Centric Provenance Model
Comments: 9 pages, 4 figures
Submitted: 2010-05-24
In this paper we propose a new provenance model which is tailored to a class of workflow-based applications. We motivate the approach with use cases from the astronomy community. We generalize the class of applications the approach is relevant to and propose a pipeline-centric provenance model. Finally, we evaluate the benefits in terms of storage needed by the approach when applied to an astronomy application.
[4]  oai:arXiv.org:1005.2643  [pdf] - 166170
Metadata and provenance management
Comments:
Submitted: 2010-05-14
Scientists today collect, analyze, and generate terabytes and petabytes of data. These data are often shared and further processed and analyzed among collaborators. To facilitate sharing and interpretation, data need to carry with them metadata about how they were collected or generated, and provenance information about how they were processed. This chapter describes metadata and provenance in the context of the data lifecycle. It also gives an overview of the approaches to metadata and provenance management, followed by examples of how applications use metadata and provenance in their scientific processes.