Creating a git repository
Warning
Before starting this training session, move into a folder where it makes sense to work during the course:
mkdir ~/training-reproducible-research-area
cd ~/training-reproducible-research-area
Let's start by running this command:
git --version
You should get something like git version 2.37.1 (Apple Git-137.1). That means Git is correctly installed on your computer and you can go ahead. Then try the following command and see what happens:
git status
Tip
If you run git status in a non-Git directory, it will say that it is not a git repository. The way this works is that Git adds a hidden directory .git/ in the root of a Git-tracked directory (run ls -a to see it). This hidden directory contains all the information and settings Git needs in order to run and version track your files. This also means that your Git-tracked directory is self-contained, i.e. deleting the directory (or just its hidden .git/ directory) removes everything that has to do with Git in connection to that directory.
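To see the hidden directory at work, here is a minimal sketch that can be run anywhere. It assumes a throwaway directory created with mktemp (the variable name demo_dir is only illustrative) and that this temporary directory is not itself inside another Git repository, so nothing in your training folder is touched.

```shell
# Work in a throwaway directory (demo_dir is an illustrative name).
demo_dir=$(mktemp -d)
cd "$demo_dir"

git init --quiet     # Git creates the hidden .git/ directory here
ls -a                # the listing now includes .git

# Deleting .git/ removes everything Git knows about this directory.
rm -rf .git

# git status exits with an error outside a repository, so test its exit code.
if git status >/dev/null 2>&1; then
    tracked_after_delete=yes
else
    tracked_after_delete=no
fi
echo "Still tracked after deleting .git/? $tracked_after_delete"
```

Any ordinary files in the directory would survive the rm -rf .git step; only the version history and Git settings are lost.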
Let's start by retrieving the Reproducible Research course git repository from the internet:
git clone https://github.com/SouthGreenPlatform/training_reproducible_research
The git clone command copies a remote git repository locally. No worries, we will review this concept later. You now have a local directory called training_reproducible_research; run ls -l to see it.
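git clone is not limited to web addresses: the source can also be a path on your own machine. The sketch below uses that to show what a clone produces without needing network access. The directory names (clone_demo, source_repo, local_copy) are made up for the illustration, and because the source repository is empty, Git may print a warning about cloning an empty repository.

```shell
clone_demo=$(mktemp -d)

# Stand-in for a remote repository: an empty repository on the local disk.
git init --quiet "$clone_demo/source_repo"

# Clone it; the second path is the directory that will hold the copy.
git clone --quiet "$clone_demo/source_repo" "$clone_demo/local_copy"

# The copy is a Git repository of its own, with its own .git/ directory...
ls -a "$clone_demo/local_copy"

# ...and it remembers where it was cloned from under the name "origin".
clone_source=$(git -C "$clone_demo/local_copy" remote get-url origin)
echo "Cloned from: $clone_source"
```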
Move into it and run the git status command again. What do you see?
cd training_reproducible_research
git status
The directory you are looking at is a version-tracked directory. Indeed, the git status command should return something like this:
On branch main
Your branch is up to date with 'origin/main'.
nothing to commit, working tree clean
Note
git repository == version-tracked directory
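If you ever need to check this from a script rather than by reading the output of git status, one possible sketch is a small shell function built on git rev-parse --is-inside-work-tree, which prints true inside a version-tracked directory. The function name is_git_repo is hypothetical, not something Git provides.

```shell
# is_git_repo DIR: succeed if DIR is inside a Git working tree.
# (Hypothetical helper; only git rev-parse is a real Git command.)
is_git_repo() {
    [ "$(git -C "$1" rev-parse --is-inside-work-tree 2>/dev/null)" = "true" ]
}

if is_git_repo .; then
    echo "This directory is version-tracked"
else
    echo "This directory is not a Git repository"
fi
```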
Let's now create our own git repository. First, leave the training_reproducible_research directory and go back to its parent directory:
cd ..
Warning
From here on, you must be back in the ~/training-reproducible-research-area directory.
In order to create a new Git repository, we first need a directory to track. Let's create one and move inside:
mkdir git_tutorial
cd git_tutorial
This directory is not yet a version-tracked directory; you can check this by running git status again. Now we can initialise Git with the following command:
git init
The directory is now a version-tracked directory. How can you know? Run the command git status, which will probably return something like this:
On branch main
No commits yet
nothing to commit (create/copy files and use "git add" to track)
The text nothing to commit (create/copy files and use "git add" to track) tells us that while we are inside a directory that Git is currently tracking, there are currently no files being tracked; let's add some!
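The branch name in the first line of that output depends on your Git version and configuration: some setups report master instead of main. As a sketch, assuming Git 2.28 or newer, the --initial-branch option of git init sets the name explicitly. The directory name branch_demo is only an example.

```shell
branch_demo=$(mktemp -d)

# Name the first branch explicitly instead of relying on the default.
git init --quiet --initial-branch=main "$branch_demo"

# Ask Git which branch HEAD points to (works even before the first commit).
current_branch=$(git -C "$branch_demo" symbolic-ref --short HEAD)
echo "On branch $current_branch"
```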
Copy the following files from the training_reproducible_research directory we cloned at the beginning of the exercise into your git_tutorial directory:
cp ../training_reproducible_research/tutorials/git/Dockerfile .
cp ../training_reproducible_research/tutorials/git/Snakefile .
cp ../training_reproducible_research/tutorials/git/config.yml .
cp ../training_reproducible_research/tutorials/git/environment.yml .
Once you have done that, run git status again. It will tell you that there are files in the directory that are not version tracked by Git.
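For a more compact view, git status also accepts a --short option, which marks each untracked file with ??. The sketch below reproduces the situation in a temporary directory, using empty placeholder files with the same names as the four files copied above; the directory name status_demo is an assumption made for the example.

```shell
status_demo=$(mktemp -d)
cd "$status_demo"
git init --quiet

# Empty placeholders standing in for the real tutorial files.
touch Dockerfile Snakefile config.yml environment.yml

# One line per file; the "??" prefix means "not tracked by Git".
git status --short

# Count the untracked entries.
untracked_count=$(git status --short | grep -c '^??')
echo "Untracked files: $untracked_count"
```

Running git status --short inside your own git_tutorial directory should list the real files in the same way.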
Note
For the purpose of this tutorial, the exact contents of the files you just copied are not important, but we provide a brief overview here:
- The environment.yml file contains the Conda environment with all the software used for a defined analysis.
- The Snakefile and config.yml are both used to define a Snakemake workflow.
- The Dockerfile contains the recipe for making a Docker container for a defined analysis.
Quick recap
We have used three git commands so far:
- git clone copies a remote git repository locally.
- git init tells Git to track the current directory.
- git status is a command you should use a lot. It will tell you, amongst other things, the status of the current directory (version-tracked or not), the status of the files contained in your repository (tracked or not), the status of a Git clone in relation to the online remote repository, etc.