LAS 6292: Data Collection & Management

Reproducibility

           

Why should we practice ‘Reproducible Research’?

Why ‘Reproducible Research’?

  • Have you ever tried to reproduce someone else’s data analysis?

  • Have you ever tried to repeat your own (old) work?

  • What made it easy/hard to do so?

  • What would have to happen if you had to extend the analysis further?

  • If you caught a data error, how easy/hard would it be to re-create the analysis?

  • What would happen if your collaborator were no longer available to walk you through their analysis?

  • What if you couldn’t remember the steps?

A Data Sharing Snafu in 3 Acts

Why ‘Reproducible Research’?

Scientific Integrity:

  • Rigor, trustworthiness, and transparency
  • Allows others to verify analyses, find & correct mistakes

Efficient Research:

  • Helps you remember HOW and WHY you did something.

  • Speeds up this project: automation makes it faster (and easier) to redo analyses, and the time saved can go to other work.

  • Speeds up future projects: scripts written for one project can be reused in another.

Community Good:

  • Others can learn from your approach & methods

reproducible workflow: data acquisition, data processing, data analysis, data presentation.

The Reproducible Research Workflow
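
The four stages of the workflow (data acquisition, data processing, data analysis, data presentation) can be sketched as one script, so the whole analysis can be rerun after a data error is corrected. This is only an illustration: the data values, the column names (`plot`, `height_cm`), and the function names are hypothetical, and Python is used here just as an example language.

```python
import csv
import io
import statistics

# Hypothetical raw data (made up for illustration), standing in for a raw file.
RAW_CSV = """plot,height_cm
A,12.5
A,14.0
B,
B,9.5
B,10.5
"""


def acquire(text):
    """Data acquisition: read the raw rows without altering them."""
    return list(csv.DictReader(io.StringIO(text)))


def process(rows):
    """Data processing: drop rows with a missing height, convert the rest to float."""
    return [
        {"plot": r["plot"], "height_cm": float(r["height_cm"])}
        for r in rows
        if r["height_cm"].strip()
    ]


def analyze(rows):
    """Data analysis: mean height per plot."""
    by_plot = {}
    for r in rows:
        by_plot.setdefault(r["plot"], []).append(r["height_cm"])
    return {plot: statistics.mean(vals) for plot, vals in by_plot.items()}


def present(summary):
    """Data presentation: format the summary as a plain-text table."""
    lines = ["plot  mean_height_cm"]
    for plot in sorted(summary):
        lines.append(f"{plot:<4}  {summary[plot]:.2f}")
    return "\n".join(lines)


def run_workflow(text):
    """Run every stage in order; rerunning repeats the same documented steps."""
    return present(analyze(process(acquire(text))))


if __name__ == "__main__":
    print(run_workflow(RAW_CSV))
```

Because every step lives in code rather than in manual edits, the script itself records HOW the result was produced, and the raw data is never modified in place.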

Reproducible Data Processing: Let’s Practice

picture of a peanut butter and jelly sandwich.

Reproducibility = Recipes

picture of the recipe for the classic Mexican dish 'mole poblano'

picture of a peanut butter and jelly sandwich.