Git Shallow Clone

Overview

In this article, we are going to learn about Git shallow clones. Let us start with a short overview of the topic.

Git Shallow Clone: When you clone a repository with the git clone command, Git normally copies the entire repository: all of the files plus the complete commit history. For a small repository that is fine, but for a repository with a very long history a full clone can be slow and wasteful, especially if you have limited bandwidth or disk space. A shallow clone helps here, because it fetches only the most recent commits, up to a depth of your choice, instead of the whole history, which reduces the size of the clone.

Pre-requisites

Before getting started with the topic, you should have a clear understanding of basic Git concepts, such as repositories, commits, and the git clone command.

Introduction to Shallow Clones

When you clone a repository with the git clone command, Git normally copies the entire repository, which includes all of the files as well as the complete commit history. If the repository is small this is fine, but if it carries a huge commit history, cloning all of it is often unnecessary, particularly when bandwidth or disk space is limited.

In this scenario, a shallow clone can help: it fetches only the latest commits, up to a depth of your choice, rather than the entire commit history, which reduces the size of the clone. So if a project has hundreds of commits or years of history, you can use a shallow clone to fetch only as much history as you need.

However, a shallow clone comes with a trade-off: it breaks at least one assumption of Git's distributed design, namely that every clone carries the full history, and you might not be prepared to accept that.

Shallow clones are most likely to be beneficial, or even necessary, when you work with Git at scale, for example with a very large monorepo.

Make sure you are familiar with how Git saves your data, including trees, commits, and blob objects, before diving into this topic.

Shallow Git Clone Benefits

Let us now look at some of the benefits of shallow clones. Apart from limiting the size of a large repository on your local system, a shallow clone has other benefits:

  • With only recent history present, it is easier to look through the latest commits when tracking down an issue.
  • You can clone exactly the depth of history you choose.
  • You can skip old commit history that you do not need.
  • A shallow clone uses much less disk space.
  • A shallow clone completes more quickly than a regular clone.

How to Perform a Shallow Git Clone

Now, let us see how to perform a shallow clone of a repository, starting with the syntax.

Syntax:

git clone --depth [depth] [remote URL]

Explanation:

This is how a shallow clone is performed. You pass the depth as a parameter to the clone command. Once the clone has finished, you can check its depth by running the git log --oneline command, which lists only the commits within the depth you chose.
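As a rough illustration of the syntax above, the following sketch builds a throwaway repository with five commits and shallow clones it with a depth of 2. The temporary directory, file name, commit messages, and the "Demo" identity are made up for the example, and the file:// URL stands in for a real remote URL (Git ignores --depth when cloning from a plain local path).

```shell
#!/bin/sh
set -e

# Hypothetical scratch area; nothing here refers to a real project.
work=$(mktemp -d)

# Build a source repository with five commits.
git init -q "$work/source"
cd "$work/source"
git config user.name "Demo"
git config user.email "demo@example.com"
for i in 1 2 3 4 5; do
    echo "change $i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Shallow clone that keeps only the two most recent commits.
git clone -q --depth 2 "file://$work/source" "$work/shallow"
cd "$work/shallow"

# List the history that is available locally.
git log --oneline
commit_count=$(git rev-list --count HEAD)
echo "commits in shallow clone: $commit_count"
```

With a linear history like this one, the number of commits listed matches the depth. In a history that contains merges, a given depth can include more commits than the number suggests.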

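You can also restrict a shallow clone to a single branch by adding the --branch and --single-branch options:

git clone --depth [depth] --branch [branch name] --single-branch [remote URL]

A shallow clone transfers fewer objects, so it completes more quickly. This can also speed up builds that start from a fresh clone.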

Steps to Perform a Shallow Git Clone

To execute a shallow git clone, follow these steps:

  • Get the repository's URL (for example, its HTTPS address) in order to clone it.
  • Write the git clone command with the option --depth 1 (git clone --depth 1 [remote URL]). A depth of 1 means that you are only interested in the latest commit.
  • Execute the git clone command in the terminal.
  • Change into the directory that the shallow clone created.
  • Check the clone depth by running the git log --oneline command.

Shallow Git Clone Surprise

The most surprising thing about shallow clones for many developers is that only one branch is cloned to the local repository. The --depth option implies --single-branch, so unless you also pass --no-single-branch, only the remote's default branch (or the branch named with --branch) is fetched, and no other branches are available locally. If you try to switch to another branch that exists in the remote repository, you will get a pathspec error.

[Figure: shallow git clone surprise and pathspec error]
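The behaviour can be reproduced on a throwaway repository. The sketch below is only an illustration: the branch name feature, the file contents, and the "Demo" identity are hypothetical. It first shows the failed checkout, then one possible way to bring the missing branch into the shallow clone with git remote set-branches and a shallow fetch.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area and source repository with two branches.
work=$(mktemp -d)
git init -q "$work/source"
cd "$work/source"
git config user.name "Demo"
git config user.email "demo@example.com"
echo "hello" > file.txt
git add file.txt
git commit -q -m "initial commit"
git branch feature   # second branch, exists only on the "remote" side

# A shallow clone fetches only the default branch.
git clone -q --depth 1 "file://$work/source" "$work/shallow"
cd "$work/shallow"

# Switching to the other branch fails with a pathspec error.
if git checkout feature 2>/dev/null; then
    checkout_result="ok"
else
    checkout_result="pathspec error"
fi
echo "first attempt: $checkout_result"

# Ask for the missing branch explicitly, still at depth 1.
git remote set-branches --add origin feature
git fetch -q --depth 1 origin feature
git checkout -q feature
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "now on: $current_branch"
```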

Shallow Clone a Specific Branch

By default, a shallow clone checks out the remote repository's default branch (often master or main). If you would rather clone a particular branch, all you have to do is pass its name with the --branch option.

For example, to shallow clone a branch of the repository hello.git, you would run:

git clone --depth [depth] --branch [branch name] [URL of hello.git]
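To try this without a network connection, here is a minimal sketch that creates a local stand-in named hello with a second branch and then shallow clones only that branch. The branch name feature, the file contents, and the "Demo" identity are assumptions made for the example.

```shell
#!/bin/sh
set -e

# Hypothetical local stand-in for hello.git.
work=$(mktemp -d)
git init -q "$work/hello"
cd "$work/hello"
git config user.name "Demo"
git config user.email "demo@example.com"
echo "default branch content" > file.txt
git add file.txt
git commit -q -m "commit on default branch"

# Create a second branch with its own commit, then switch back.
git checkout -q -b feature
echo "feature work" > file.txt
git commit -q -a -m "commit on feature"
git checkout -q -

# Shallow clone only the feature branch, latest commit only.
git clone -q --depth 1 --branch feature "file://$work/hello" "$work/hello-feature"
cd "$work/hello-feature"
checked_out=$(git rev-parse --abbrev-ref HEAD)
content=$(cat file.txt)
echo "branch: $checked_out, file.txt: $content"
```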

Shallow Clone to A Directory

When building continuous integration routines that clone several branches of a single repository, each clone needs its own directory, so the option to give a directory name is useful. The target directory is given as the last argument of the clone command.

For example, to shallow clone the repository myrepo.git into the directory dir, you would run:

git clone --depth [depth] [URL of myrepo.git] dir
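A small sketch of the same idea on a local stand-in repository follows. The repository contents and the "Demo" identity are hypothetical; only the names myrepo and dir come from the example above.

```shell
#!/bin/sh
set -e

# Hypothetical local stand-in for myrepo.git.
work=$(mktemp -d)
git init -q "$work/myrepo"
cd "$work/myrepo"
git config user.name "Demo"
git config user.email "demo@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "initial commit"

# Shallow clone into a directory named dir (the last argument).
cd "$work"
git clone -q --depth 1 "file://$work/myrepo" dir
ls dir
```

Without the final argument, Git would name the directory after the repository (myrepo in this case).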

Shallow Clone on Git in Linux

Let us now look at some examples of shallow clones on Git in Linux for a clearer understanding.

Example 1:

Let us look at an example where we clone the java-design-patterns repository both in the usual way and as a shallow clone, to see the difference in size between the two.

Follow these steps:

Step 1: Clone the java-design-patterns repository in the usual way (a full clone).

Step 2: Check the size of the cloned repository on disk.

Output:

[Figure: cloning java design patterns repository]

Step 3: Now clone the java-design-patterns repository as a shallow clone.

Step 4: Check the size of the shallow clone on disk.

Output:

[Figure: output size of repository]

We can observe the size difference between the two clones. The full clone is larger (57MB) than the shallow clone (35MB). Hence, a shallow clone helps reduce disk usage.
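The same kind of comparison can be tried offline. The following sketch is an assumption-laden stand-in, not the commands used for the figures above: it generates a small repository with ten commits of random text instead of cloning java-design-patterns, and it measures size with du, so the absolute numbers will be far smaller than 57MB and 35MB.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area and generated source repository.
work=$(mktemp -d)
git init -q "$work/source"
cd "$work/source"
git config user.name "Demo"
git config user.email "demo@example.com"

# Ten commits, each replacing the file with fresh, poorly compressible text.
for i in 1 2 3 4 5 6 7 8 9 10; do
    head -c 20000 /dev/urandom | base64 > data.txt
    git add data.txt
    git commit -q -m "commit $i"
done

# One full clone and one shallow clone of the same repository.
git clone -q "file://$work/source" "$work/full"
git clone -q --depth 1 "file://$work/source" "$work/shallow"

# Compare disk usage (human-readable, then in KB for the comparison).
du -sh "$work/full" "$work/shallow"
full_kb=$(du -sk "$work/full" | cut -f1)
shallow_kb=$(du -sk "$work/shallow" | cut -f1)
echo "full: ${full_kb} KB, shallow: ${shallow_kb} KB"
```

The working tree is the same size in both clones. The difference comes from the .git directory, which holds ten versions of the file in the full clone and one in the shallow clone.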

Example 2:

Let us look at an example where we clone the scikit-learn repository both in the usual way and as a shallow clone, to see the difference in size between the two.

Follow these steps:

Step 1: Clone the scikit-learn repository in the usual way (a full clone).

Step 2: Check the size of the cloned repository on disk.

Output:

[Figure: size of scikit learn repository]

Step 3: Now clone the scikit-learn repository as a shallow clone.

Step 4: Check the size of the shallow clone on disk.

Output:

[Figure: cloning scikit learn repository]

We can observe a huge difference in size between the two clones. The full clone is much larger (174MB) than the shallow clone (31MB). Hence, a shallow clone helps reduce disk usage.

Example 3:

Now let us look at an example where we shallow clone multiple branches of the java-design-patterns repository using the --no-single-branch flag. When --no-single-branch is included, the depth is applied to every branch rather than just the single branch that a shallow clone fetches by default. Consequently, we will shallow clone several branches of the java-design-patterns repository.

Step 1: Shallow clone the repository with multiple branches by adding the --no-single-branch flag to the clone command.

Step 2: Check the size of the clone on disk.

Output:

[Figure: shallow cloning with multiple branches]

We can observe that this time the size has increased from 35MB to 45MB. This is because multiple branches were cloned from the repository.

Step 3: List all the branches in the clone to check which ones were fetched.

Output:

[Figure: check all branches]
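A minimal offline sketch of the --no-single-branch behaviour is shown below. It uses a generated repository with three branches rather than java-design-patterns, and the branch names feature-a and feature-b, the file contents, and the "Demo" identity are made up for the example.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area and source repository with three branches.
work=$(mktemp -d)
git init -q "$work/source"
cd "$work/source"
git config user.name "Demo"
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "initial commit"
git branch feature-a
git branch feature-b

# Shallow clone every branch, one commit deep per branch.
git clone -q --depth 1 --no-single-branch "file://$work/source" "$work/multi"
cd "$work/multi"

# List the remote-tracking branches that were fetched.
git branch -r

# Count them, ignoring the symbolic origin/HEAD entry.
remote_branches=$(git for-each-ref --format='%(refname)' refs/remotes/origin | grep -c -v '/HEAD$')
echo "remote branches fetched: $remote_branches"
```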

Conclusion

In this article, we learned about the Git Shallow Clone. Let us recap the points we discussed throughout the article:

  • When you clone a repository with the git clone command, Git normally copies the entire repository, including all of the files and the complete commit history.
  • With a shallow clone, you fetch only the latest commits, up to a depth of your choice, rather than the entire commit history, which reduces the size of the clone.
  • A shallow clone comes with a trade-off: it breaks at least one assumption of Git's distributed design, and you might not be prepared to accept that.
  • Apart from limiting the size of a large repository on your local system, a shallow clone has other benefits: it is easier to look through the latest commits when tracking down an issue, it uses much less disk space, and it completes more quickly than a regular clone.
  • The syntax of a shallow clone is git clone --depth [depth] [remote URL]. You pass the depth as a parameter to the clone command, and once the clone has finished, you can check its depth by running the git log --oneline command, which lists only the commits within the depth you chose.
  • The most surprising thing about shallow clones for many developers is that only one branch is cloned to the local repository, because --depth implies --single-branch.
  • By default, a shallow clone checks out the remote repository's default branch (often master or main). If you would rather clone a particular branch, all you have to do is pass its name with the --branch option.
  • When building continuous integration routines that clone several branches of a single repository, each clone needs its own directory, so the option to give a directory name is useful. The target directory is given as the last argument of the clone command.
  • Then we have seen some examples of shallow git clone in Linux.