
EDUCATION
A Quick Introduction to Version Control with
Git and GitHub
John D. Blischak 1*, Emily R. Davenport 2, Greg Wilson 3
1 Committee on Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, Illinois, United
States of America, 2 Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York,
United States of America, 3 Software Carpentry Foundation, Toronto, Ontario, Canada
* jdblischak@gmail.com
“This is part of the PLOS Computational Biology Education collection.”
Introduction to Version Control
Many scientists write code as part of their research. Just as experiments are logged in laboratory
notebooks, it is important to document the code you use for analysis. However, a few key problems
can arise when iteratively developing code that make it difficult to document and track
which code version was used to create each result. First, you often need to experiment with
new ideas, such as adding new features to a script or increasing the speed of a slow step, but
you do not want to risk breaking the currently working code. A common solution is to
make a copy of the script before making new edits. However, this can quickly become a problem
because it clutters your file system with uninformative filenames, e.g., analysis.sh,
analysis_02.sh, analysis_03.sh, etc. It is difficult to remember the differences
between the versions of the files and, more importantly, which version you used to produce
specific results, especially if you return to the code months later. Second, you will likely share
your code with multiple lab mates or collaborators, and they may have suggestions on how to
improve it. If you email the code to multiple people, you will have to manually incorporate all
the changes each of them sends.
Fortunately, software engineers have already developed software to manage these issues:
version control. A version control system (VCS) allows you to track the iterative changes you
make to your code. Thus, you can experiment with new ideas but always have the option to
revert to a specific past version of the code you used to generate particular results. Furthermore,
you can record a message each time you save a version, so that anyone reviewing the
development history of the code (including you) can understand the rationale for each
edit. A VCS also facilitates collaboration: your collaborators can make and save
changes to the code, and you can automatically incorporate these changes into the main code
base. This collaborative aspect is further enhanced by websites that host
version-controlled code.
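As a rough illustration of the workflow described above, the following sketch uses Git, one widely used VCS, to save two versions of a script with messages, review the history, and restore the earlier version. The temporary directory, the user name and email, the commit messages, and the contents of analysis.sh are all made up for this example; it is a minimal preview under those assumptions, not a recommended project layout.

```shell
# Work in a throwaway directory so nothing on your system is affected.
workdir=$(mktemp -d)
cd "$workdir"
git init -q

# Hypothetical identity, set only for this example repository.
git config user.name "Example User"
git config user.email "user@example.com"

# Save a first version of the script together with a message.
echo 'echo "step 1"' > analysis.sh
git add analysis.sh
git commit -q -m "Add first version of analysis script"

# Experiment with a change and save it as a second version.
echo 'echo "step 2 (experimental)"' >> analysis.sh
git commit -q -a -m "Try an experimental second step"

# Review the recorded history (newest first).
git log --oneline

# Restore the script as it was in the previous version.
git checkout -q HEAD~1 -- analysis.sh
cat analysis.sh
```

Both versions remain in the history after the last step; only the copy of analysis.sh in the working directory is returned to its earlier state, so there is no need for files such as analysis_02.sh.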
In this quick guide, we introduce you to one VCS, Git (https://git-scm.com), and one online
hosting site, GitHub (https://github.com), both of which are currently popular among scientists
and programmers in general. More importantly, we hope to convince you that although mastering
a given VCS takes time, you can already achieve great benefits by getting started using a
few simple commands. Furthermore, not only does using a VCS solve many common problems
when writing code, it can also improve the scientific process. By tracking your code
PLOS Computational Biology | DOI:10.1371/journal.pcbi.1004668 | January 19, 2016
OPEN ACCESS
Citation: Blischak JD, Davenport ER, Wilson G
(2016) A Quick Introduction to Version Control with
Git and GitHub. PLoS Comput Biol 12(1): e1004668.
doi:10.1371/journal.pcbi.1004668
Editor: Francis Ouellette, Ontario Institute for Cancer
Research, CANADA
Published: January 19, 2016
Copyright: © 2016 Blischak et al. This is an open
access article distributed under the terms of the
Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any
medium, provided the original author and source are
credited.
Funding: JDB is supported by National Institutes of
Health grant AI087658 awarded to Yoav Gilad. The
funders had no role in study design, data collection
and analysis, decision to publish, or preparation of
the manuscript.
Competing Interests: The authors have declared
that no competing interests exist.