Reproducibility is a fundamental principle of science: experimental results from independent laboratories should be commensurate. In scientific computation, simulations, data munging, and analysis pipelines are the analogs of experiments. To ensure that results are repeatable, it must be possible to unwind code and analysis to previous versions and to replicate plots. The most essential requirement is that all previous versions of the code, data, and provenance metadata be robustly and retrievably archived. In scientific computing, the best practice for meeting this requirement is called version control.
Rather than inventing a system of indexed directories holding full versions of your code from each day in the lab, the best practice in software development is to use a version control system that automates archiving and retrieval of text documents such as source code.
This chapter will explain:
What version control is
How to use it for managing files on your computer
And how to use it for managing files in a collaboration
First up, this chapter will discuss what version control is and how it fits into the reproducible workflow of an effective researcher in the physical sciences.
Very briefly, version control is a way to:
Back up changing files
Store and access an annotated history
And manage merging of changes between different change sets
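As a rough illustration of the first two points, the sketch below keeps every committed version of a file's content together with a message describing the change. It is a toy, in-memory model and not how any real version control tool is implemented; the class name ToyHistory and its methods are hypothetical, and merging is left out entirely.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ToyHistory:
    """Hypothetical in-memory stand-in for a version history (illustration only)."""

    blobs: dict = field(default_factory=dict)  # content hash -> file content
    log: list = field(default_factory=list)    # ordered (content hash, message) entries

    def commit(self, content: str, message: str) -> int:
        """Back up the content, annotate it, and return its version number."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        self.blobs.setdefault(digest, content)  # identical content is stored only once
        self.log.append((digest, message))
        return len(self.log) - 1

    def checkout(self, version: int) -> str:
        """Retrieve the content exactly as it was at the given version."""
        digest, _ = self.log[version]
        return self.blobs[digest]

    def messages(self) -> list:
        """Return the annotations in the order the versions were committed."""
        return [message for _, message in self.log]


history = ToyHistory()
v0 = history.commit("g = 9.8\n", "Initial value of g")
v1 = history.commit("g = 9.81\n", "Use a more precise value of g")

# Unwind to the first version, regardless of later changes.
restored = history.checkout(v0)
```

Keying the stored content by its hash means an unchanged file costs nothing extra to commit again, which is one reason this approach scales better than copying whole directories by hand.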
There are many tools to automate ...