The P-RECS’19 workshop will be held as a full-day meeting at ACM HPDC 2019 in Phoenix, Arizona, USA, on June 24th, 2019. This year, HPDC runs under the ACM Federated Computing Research Conference (FCRC), a large federated event that brings together 11 affiliated conferences and provides excellent opportunities for interdisciplinary networking and learning.
The P-RECS workshop focuses heavily on practical, actionable aspects of reproducibility in broad areas of computational science and data exploration, with special emphasis on issues in which community collaboration can be essential for adopting novel methodologies, techniques and frameworks aimed at addressing some of the challenges we face today. The workshop brings together researchers and experts to share experiences and advance the state of the art in the reproducible evaluation of computer systems, featuring contributed papers and invited talks.
Topics
We invite submissions on topics including, but not limited to:
- Experiment dependency management.
- Software citation and persistence.
- Data versioning and preservation.
- Provenance of data-intensive experiments.
- Tools and techniques for incorporating provenance into publications.
- Automated experiment execution and validation.
- Experiment portability for code, performance, and related metrics.
- Experiment discoverability for re-use.
- Cost-benefit analysis frameworks for reproducibility.
- Usability and adaptability of reproducibility frameworks into already-established domain-specific tools.
- Long-term artifact archiving for future reproducibility.
- Frameworks for sociological constructs to incentivize paradigm shifts.
- Policies around publication of articles/software.
- Blinding and selecting artifacts for review while maintaining history.
- Reproducibility-aware computational infrastructure.
Program
09:00-09:15 | Welcome
09:15-10:15 | Keynote (Dr. Carl Kesselman)
10:15-10:45 | Coffee break
10:45-12:15 | Paper Presentations 1
12:15-13:30 | Lunch (hosted by HPDC/FCRC)
13:30-15:00 | Paper Presentations 2
15:00-15:30 | Coffee break
15:30-16:30 | Open discussion
16:30-17:00 | Closing remarks
Keynote Address
Carl Kesselman (University of Southern California)
Title: Making Lightning Strike Twice: Achieving reproducibility and impact in a data-driven scientific environment.
Abstract: A cornerstone of the scientific method is the ability of one scientist to reproduce the results of another. This requires that investigators take explicit steps, such as ensuring that protocols are well defined and that reagents and cell lines are characterized and validated. A critical aspect of this process is describing what data has been collected and how it is analyzed. While science has always been driven by the collection, analysis, and sharing of data, technology advances have shifted data processing from the role of a final analysis step to a core and integral part of the scientific method. However, with the increased complexity of computational methods and the sheer volume of data, achieving reproducibility of a data-driven scientific investigation becomes correspondingly more difficult. In my talk, I will describe the properties that data in a scientific investigation should have to promote reproducibility. Specifically, reproducibility requires that data be Findable, Accessible, Interoperable, and Reusable (FAIR). I will describe methods and tools that can help promote reproducibility in data-driven scientific research, illustrated with examples from FaceBase, an NIH-funded consortium that is generating data associated with craniofacial development and malformation.
Bio: Dr. Carl Kesselman specializes in grid computing technologies, a term he and Professor Ian Foster introduced in their book The Grid: Blueprint for a New Computing Infrastructure. He and Foster are winners of the British Computer Society’s Lovelace Medal for their grid work. He is an Institute Fellow at the University of Southern California’s Information Sciences Institute and a professor in the Epstein Department of Industrial and Systems Engineering at USC.
Papers Session 1
- Dylan Chapp, Danny Rorabaugh, Duncan Brown, Ewa Deelman, Karan Vahi, Von Welch, Michela Taufer. Applicability study of the PRIMAD model to LIGO gravitational wave search workflows.
- David Stockton, Astrid Prinz, Fidel Santamaria. Provenance and reproducibility in the automation of a standard computational neuroscience pipeline.
- Von Welch, Ewa Deelman, Victoria Stodden, Michela Taufer. Initial Thoughts on Cybersecurity And Reproducibility.
Papers Session 2
- Kyle Chard, Niall Gaffney, Matthew Jones, Kacper Kowalik, Bertram Ludascher, Jarek Nabrzyski, Victoria Stodden, Matthew Turk, Craig Willis. Implementing Computational Reproducibility in the Whole Tale Environment.
- Matthew S. Krafczyk, August Shi, Adhithya Bhaskar, Darko Marinov, Victoria Stodden. Continuous Integration Strategies in the Scientific Software Context.
- Andrea David, Mariette Souppe, Ivo Jimenez, Katia Obraczka, Sam Mansfield, Kerry Veenstra, Carlos Maltzahn. Reproducible Computer Network Experiments: A Case Study Using Popper.
Submission
Submit (single-blind) via EasyChair. We solicit two categories of submissions:
- Position papers. This category is for papers whose goal is to propose solutions (or scope the work that needs to be done) to address some of the issues outlined above. We hope that a research agenda comes out of this and that we can create a community that meets yearly to report on our status in addressing these problems.
- Experience papers. This category consists of papers reporting on the authors’ experience in automating one or more experimentation pipelines. The committee will look for answers to questions such as: What worked? What aspects of experiment automation and validation are hard in your domain? What can be done to improve the tooling for your domain? As part of the submission, authors need to provide a URL to the automation service they use (e.g., TravisCI, GitLabCI, CircleCI, Jenkins, etc.) so reviewers can verify that there are one or more automated pipelines associated with the submission; a sketch of what such a pipeline might look like is shown below.
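To make the expectation concrete, here is a minimal sketch of a self-validating pipeline entry point that a CI service such as those named above could execute on every commit. All stage scripts, file paths, and the tolerance value are hypothetical illustrations, not requirements of the workshop; the point is simply that the pipeline runs end to end and fails the build when results stop being reproducible.

```python
#!/usr/bin/env python3
"""Hypothetical experiment pipeline entry point, invoked by a CI service.
Stage names, paths, and tolerances below are illustrative only."""
import json
import subprocess
import sys

# Each stage is a script kept under version control alongside the paper.
STAGES = ["scripts/setup.sh", "scripts/run_experiment.sh", "scripts/analyze.sh"]

def main() -> int:
    # Execute each stage in order; abort on the first failure.
    for stage in STAGES:
        print(f"==> running {stage}")
        result = subprocess.run(["bash", stage])
        if result.returncode != 0:
            print(f"stage {stage} failed", file=sys.stderr)
            return result.returncode

    # Validation: compare freshly produced metrics against the values
    # reported in the paper (within a 5% tolerance), so the CI build
    # fails whenever the published results can no longer be reproduced.
    with open("results/metrics.json") as f:
        got = json.load(f)
    with open("expected/metrics.json") as f:
        want = json.load(f)
    for key, expected in want.items():
        if abs(got[key] - expected) > 0.05 * abs(expected):
            print(f"validation failed for {key}: got {got[key]}, "
                  f"expected {expected}", file=sys.stderr)
            return 1
    print("all stages and validations passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```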
Format
Authors are invited to submit manuscripts in English not exceeding 5 pages of content. The 5-page limit includes figures, tables, and appendices, but does not include references, for which there is no page limit. Submissions must use the ACM Master Template (please use the sigconf format with default options).
Proceedings
The proceedings will be archived in both the ACM Digital Library and IEEE Xplore through SIGHPC. In addition, pre-print versions of the accepted articles will be published on this website (as allowed by ACM’s publishing policy).
Tools
These tools can be used to automate your experiments (not an exhaustive list): CK, CWL, Popper, ReproZip, Sciunit, Sumatra.
Important Dates
- Submissions due: April 15, 2019 (AoE) (extended from April 9)
- Acceptance notification: April 30, 2019
- Camera-ready paper submission: May 9, 2019
- Workshop: June 24, 2019
Organizers
- Ivo Jimenez, UC Santa Cruz
- Carlos Maltzahn, UC Santa Cruz
- Jay Lofstead, Sandia National Laboratories
- Fernando Chirigati, New York University
Program Committee
- Jay Billings, Oak Ridge National Laboratory
- Ronald Boisvert, NIST
- Bruce R. Childers, University of Pittsburgh
- Neil Chue Hong, Software Sustainability Institute, EPCC, University of Edinburgh
- Robert Clay, Sandia National Labs
- Michael Crusoe, Common Workflow Language
- Dmitry Duplyakin, University of Utah
- Torsten Hoefler, ETH Zurich
- Fatma Imamoglu, University of California, Berkeley
- Daniel S. Katz, University of Illinois at Urbana-Champaign
- Arnaud Legrand, CNRS / Inria / University of Grenoble
- Tanu Malik, DePaul University
- Robert Ricci, University of Utah
- Victoria Stodden, University of Illinois at Urbana-Champaign
- Violet Syrotiuk, Arizona State University
- Michela Taufer, University of Tennessee Knoxville
Contact
Please address workshop questions to ivo@cs.ucsc.edu.