Computational basis and ReproIn/DataLad: Reference

Key Points

Computational Basis
Shell: Getting around the “black box”
  • A command-line shell is a powerful tool, and learning additional ‘tricks’ can make its use more efficient, less error-prone, and thus more reproducible

  • Shell scripting is the most accessible tool to automate execution of an arbitrary set of commands. Scripting avoids manually retyping the same commands, which in turn avoids typos and the erroneous analyses they cause
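
The same principle can be sketched outside the shell as well. The example below is a minimal Python analogue of a shell script, not a shell script itself: it runs a fixed list of commands in order and stops at the first failure. The command list and the name run_all are hypothetical placeholders chosen for illustration.

```python
import subprocess
import sys

# Hypothetical command list; replace with the real analysis steps.
COMMANDS = [
    [sys.executable, "-c", "print('step 1: prepare')"],
    [sys.executable, "-c", "print('step 2: analyze')"],
]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    outputs = []
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit status,
        # similar in spirit to `set -e` in a shell script.
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        outputs.append(result.stdout.strip())
    return outputs


if __name__ == "__main__":
    for line in run_all(COMMANDS):
        print(line)
```

Because the commands live in one file, rerunning the analysis means rerunning the script, and the file itself can be placed under version control.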

(Neuro)Debian/Git/GitAnnex/DataLad: Distributions and Version Control
  • Software distributions and version control systems make it efficient to create tightly version-controlled computational environments

  • DataLad assists in creating a complete record of changes

ReproEnv: Virtual machines/Containers, Neurodocker
Lunch
  • Food is necessary for our survival

  • Food cannot be version-controlled or distributed by Git, but recipes can and should be

Neuroimaging Workflows
ReproIn/DataLad: A complete portable and reproducible fMRI study from scratch
  • We can implement a complete imaging study using DataLad datasets to represent units of data processing

  • Each unit comprehensively captures all inputs and data processing leading up to it

  • This comprehensive capture facilitates re-use of units and enables computational reproducibility

  • Carefully validated intermediate results (captured as a DataLad dataset) are candidates for publication with minimal additional effort

  • The outcome of this demo is available as a DataLad dataset on GitHub
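
What it means for a unit to capture its inputs and processing can be illustrated with a small sketch. The code below is an assumption-laden illustration in plain Python, not DataLad's own mechanism or API: it runs one command and writes a JSON record holding the command together with checksums of the input and output files. The names sha256 and run_and_record, the record layout, and the example files are all hypothetical.

```python
import hashlib
import json
import subprocess
import sys
import tempfile
from pathlib import Path


def sha256(path):
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def run_and_record(cmd, inputs, outputs, record_path):
    """Run cmd, then write a JSON record of the command and file checksums."""
    # Inputs are hashed before the run so the record reflects what was consumed.
    input_sums = {str(p): sha256(p) for p in inputs}
    subprocess.run(cmd, check=True)
    record = {
        "cmd": cmd,
        "inputs": input_sums,
        "outputs": {str(p): sha256(p) for p in outputs},
    }
    Path(record_path).write_text(json.dumps(record, indent=2, sort_keys=True))
    return record


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "raw.txt"), Path(tmp, "upper.txt")
        src.write_text("bold signal\n")
        # Hypothetical processing step: upper-case the input file.
        code = ("import sys; "
                "open(sys.argv[2], 'w').write(open(sys.argv[1]).read().upper())")
        rec = run_and_record([sys.executable, "-c", code, str(src), str(dst)],
                             [src], [dst], Path(tmp, "record.json"))
        print(json.dumps(rec, indent=2))
```

With such a record, anyone holding the same inputs can rerun the command and compare output checksums. A DataLad dataset takes care of this kind of bookkeeping for the user; the sketch only shows the underlying idea.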
