
Introduction
In the last post I provided an overview of databases and data management systems. This post is the first in a five-part series on version control, and its objective is to provide an overview of version control for managing code. This is in preparation for future posts, where I'll use version control to share computer code with readers of this blog to facilitate learning. Because plenty of online resources cover version control in detail (e.g., Pro Git), my goal here is not to provide a comprehensive lesson. Instead, the goal is to provide a basic understanding and get readers up and running quickly.
What is Version Control?
Put simply, version control is an efficient means of collaborating on the writing and management of computer files. A version control system maintains a version history of files, so changes made to a particular file are tracked and saved through time. While version control can be used to manage almost any type of computer file, it is most commonly used to manage computer code files. The freely available e-book “Pro Git: Everything you need to know about Git”, written by Scott Chacon and Ben Straub and published by Apress, provides a nice overview of version control in section 1.1, Getting Started – About Version Control.
Version control is a system that records changes to a file, or set of files, over time so that you can recall specific versions later.
Chacon and Straub 2020
Key Concepts
There are four basic concepts in version control: Repositories, Committing, Pushing, and Pulling. Repositories are akin to folders on a computer file system, and are used to organize files across projects. There are two types of repositories: local and remote. Local repositories exist on individual computers, while remote repositories exist on remote servers (e.g., the “cloud”) and are accessed via the internet. Committing is how the version history of a file is saved. You can think of a commit as a snapshot of a file at a given point in time. Pushing and pulling are used to keep local and remote repositories in sync. Pushing is the process of uploading files, and their version history, from a local repository to a remote repository. Pulling is the process of downloading files, and their version history, from a remote repository to a local repository. In the next post I’ll walk you through these concepts in more detail, and we’ll set up local and remote repositories, and commit, push, and pull files.
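The four concepts above can be sketched as a small toy model in Python. This is only an illustration and not how Git works internally: the Repository class, the push and pull functions, and the file name analysis.py are all hypothetical, and the sketch assumes a single linear history with no conflicting changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Repository:
    """Toy repository (hypothetical): current files plus a list of snapshots."""
    name: str
    files: Dict[str, str] = field(default_factory=dict)
    history: List[Tuple[str, Dict[str, str]]] = field(default_factory=list)

    def commit(self, message: str) -> None:
        # A commit stores a copy (snapshot) of the files as they are right now.
        self.history.append((message, dict(self.files)))


def _sync(source: Repository, target: Repository) -> None:
    # Copy over the commits the target is missing.
    # Assumption: target's history is a prefix of source's (no conflicts).
    target.history.extend(source.history[len(target.history):])
    if target.history:
        # Bring the target's files up to date with the latest snapshot.
        target.files = dict(target.history[-1][1])


def push(local: Repository, remote: Repository) -> None:
    """Upload files and their version history from local to remote."""
    _sync(local, remote)


def pull(remote: Repository, local: Repository) -> None:
    """Download files and their version history from remote to local."""
    _sync(remote, local)


remote = Repository("remote")
laptop = Repository("laptop")

laptop.files["analysis.py"] = "print('v1')"
laptop.commit("Add analysis script")
laptop.files["analysis.py"] = "print('v2')"
laptop.commit("Update analysis script")

push(laptop, remote)      # the remote now holds both commits

desktop = Repository("desktop")
pull(remote, desktop)     # a second computer receives the files and history

for message, snapshot in desktop.history:
    print(message, snapshot)
```

Each commit keeps its own copy of the files, which is why the first version of analysis.py is still recoverable from the history after the file has been changed and the repositories have been synced.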
Repositories
Local

Local repositories are used by individual users to manage files on a single computer. The most commonly used software for managing local repositories is Git. According to its website, Git is “a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.” You can download the installation files for Git here: https://git-scm.com/downloads.
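Once Git is installed, one quick way to confirm that it is available is to ask it for its version. The sketch below does this from Python; the function name git_version is my own, and it assumes that a working installation is reachable through the system PATH.

```python
import shutil
import subprocess
from typing import Optional


def git_version() -> Optional[str]:
    """Return the output of `git --version`, or None if Git is unavailable."""
    # shutil.which looks for the git executable on the system PATH.
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


version = git_version()
print(version if version else "Git does not appear to be installed.")
```

Running git --version directly in a terminal gives the same information.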
Remote

A number of options exist for remote repositories, of which GitHub, GitLab, and Bitbucket are three of the more common. All three offer free and paid plans for developers and teams and, in addition to basic version control functionality (e.g., committing), include wikis for project metadata and various tools for collaborating with a team (e.g., “Issues”). I’ll be using GitHub in future posts to help readers get started with version control, and to share code with readers.
Recommended Reading
As a supplement to this post, I encourage you to read sections 1.1 and 1.3 of “Pro Git: Everything you need to know about Git” (Chacon and Straub 2020).
Next Time on Elfinwood Data Science Blog
In this post I provided an overview of version control. In the next post I’ll continue the series on version control, and provide a quick guide for getting up and running with Git.
Literature Cited
Chacon S. and B. Straub. 2020. Pro Git: Everything you need to know about Git. Version 2.1.264. Apress. New York, NY. 521 pp. Online here: https://git-scm.com/book/en/v2 (accessed 2020-09-26).
*Note that the OCTOCAT® logo design is an exclusive trademark registered in the United States by GitHub, Inc.
Copyright © 2020, Aaron Wells