ReproNim module for data processing: Reference

Key Points

Module overview
  • Reproducible research requires understanding all pieces of the (data) workflow.

  • You should be familiar with the necessary elements and tools for reproducible analysis.

Lesson 1: Core concepts using an analysis workflow example
  • Reproducible analysis is technologically possible.

  • Learning these technologies can help produce more reliable research output.

  • Using such frameworks provides a better way to communicate information to colleagues and collaborators.

Lesson 2: Annotate, harmonize, clean, and version data
  • Know what information different file formats store and where to find it.

  • Use standards to simplify harmonization.

Lesson 3: Create and maintain reproducible computational environments
  • Hardware and software comprise an analysis environment.

  • Software and hardware components interact through command options and environment variables.
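
As a rough illustration of the points above, the following Python sketch records a few facts about the hardware, the software, the command line, and selected environment variables. The function name describe_environment and the default variable names are hypothetical choices made for this example, not part of the lesson; a real analysis would list the variables that its own tools actually read.

```python
import json
import os
import platform
import sys


def describe_environment(variable_names=("PATH", "OMP_NUM_THREADS")):
    """Return a dictionary describing the current analysis environment.

    The variable names are illustrative; unset variables are recorded as None.
    """
    return {
        "machine": platform.machine(),              # hardware architecture
        "system": platform.system(),                # operating system name
        "python_version": platform.python_version(),
        "command_line": list(sys.argv),             # command options in use
        "environment": {name: os.environ.get(name) for name in variable_names},
    }


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be stored next to the results.
    print(json.dumps(describe_environment(), indent=2))
```

Saving such a snapshot alongside each result makes it easier to see later which option or variable differed between two runs.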

Lesson 4: Create reusable and composable dataflow tools
  • Dataflow tools abstract the process away from the data.

  • Dataflow tools allow reuse and composition of tools.
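
The idea of reuse and composition can be sketched in a few lines of plain Python. This is not any particular dataflow tool; compose, drop_missing, and demean are hypothetical names invented for the example, and each step is simply a function that takes data and returns new data.

```python
from functools import reduce


def compose(*steps):
    """Chain steps left to right into one reusable pipeline function."""
    def pipeline(data):
        return reduce(lambda value, step: step(value), steps, data)
    return pipeline


def drop_missing(values):
    """Remove entries that are None."""
    return [v for v in values if v is not None]


def demean(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# The same steps can be reused in other pipelines or applied to other data.
clean_and_center = compose(drop_missing, demean)
result = clean_and_center([1.0, None, 3.0])
print(result)  # [-1.0, 1.0]
```

Because the pipeline is defined independently of any dataset, it can be applied unchanged to new data or extended with further steps.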

Lesson 5: Use integration testing to revalidate analyses as data and software change
  • Continuous Integration makes software development more efficient.

  • A Continuous Integration platform can easily be used with a GitHub account.
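
A minimal sketch of the kind of check a Continuous Integration service could run on every change is shown below. It assumes a hypothetical analysis step, mean_signal, and a reference value recorded from a previously validated run; the function names and the tolerance are illustrative only.

```python
import math


def mean_signal(values):
    """Hypothetical analysis step: the average of a list of measurements."""
    return sum(values) / len(values)


def check_against_reference(result, reference, tolerance=1e-6):
    """Return True when a recomputed result matches the stored reference."""
    return math.isclose(result, reference, rel_tol=0.0, abs_tol=tolerance)


def test_mean_signal_is_stable():
    # Small fixed input and the value recorded from an earlier validated run.
    assert check_against_reference(mean_signal([2.0, 4.0, 6.0]), 4.0)
```

A test runner such as pytest would collect test_mean_signal_is_stable automatically, so a change in data or software that alters the result shows up as a failed build.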

Lesson 6: Track provenance from data to results
  • An analysis can be captured in a way that allows it to be repeated.

  • Understanding the points of human interaction and decision making is essential for reproducibility.
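
One simple way to capture an analysis step, sketched here under stated assumptions, is to store checksums of the inputs and outputs together with the command that was run and a note about any human decision. The record layout and the names provenance_record and decided_by are hypothetical and do not follow any specific provenance standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(input_bytes, output_bytes, command, decided_by=None):
    """Describe one analysis step: what went in, what came out, and how."""
    return {
        "command": command,
        "input_sha256": sha256_of(input_bytes),
        "output_sha256": sha256_of(output_bytes),
        "decided_by": decided_by,  # free-text note on any manual decision
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


# Illustrative values only; a real step would read the bytes from files.
record = provenance_record(
    b"raw data",
    b"result",
    ["analyze", "--threshold", "0.5"],
    decided_by="manual quality check",
)
print(json.dumps(record, indent=2))
```

Recording the manual decision next to the checksums keeps the human steps visible, which is where repeated runs most often diverge.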
