What is Git Resetting?
Overview
The git reset command undoes changes made to a repository. It has three modes: hard, mixed, and soft. Mixed is the default. A hard reset changes all three trees of Git, a mixed reset changes only the staging index and the commit history, and a soft reset changes only the commit history.
Pre-requisites
- basic git commands
- git commit
- git status
Introduction to Git Reset
The git reset command helps us undo changes in a Git repository. It is a complex and versatile tool, and it can cause data loss if used without care.
When working in a team, we often make mistakes. If they are not handled, they affect the project sooner or later, and they become harder to track down as new commits, files, and features are added. Sometimes we might commit the wrong file, or commit the right file to the wrong branch. The git reset command is useful when we wish to reverse the progress of the project and take it back to a previous working state.
Before we see the git reset command in detail, let us first understand the three trees of Git.
Three Trees of Git
Git is a free and open-source distributed version control system used to manage projects. Every file or modification made within a Git project goes through three states: modified, staged, and committed. These three states correspond to the three trees of Git. Together they help keep every team member's modifications accounted for and the project in a stable state. Let us now look at these three trees individually in brief.
The Working Directory (Modified)
The very first tree is the working directory or modified state. It holds the files as they currently exist on the local file system, including edits that Git has not yet recorded.
Any new files created or modifications made by the user that are not yet staged or committed are said to be in the working directory. Let us look at a sample project to understand the working directory.
We will first create a repository and make an initial commit in it.
Here, we first created a folder git_reset_sample with the help of mkdir. Then, using cd, we moved into the newly created folder in the terminal. Using git init, we initialized this folder as a Git repository. Finally, we made an initial commit to save this state of the repository.
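The setup described above can be sketched as follows. This is a minimal sketch under a few assumptions that are not in the original: it works in a scratch directory from mktemp, sets a placeholder identity for this repository only, and uses --allow-empty because the source does not say what the initial commit contained.

```shell
# Work in a scratch location so no existing folder is touched (assumption).
cd "$(mktemp -d)"

mkdir git_reset_sample
cd git_reset_sample
git init -q

# Placeholder identity, stored only in this repository's config (assumption).
git config user.name "Demo User"
git config user.email "demo@example.com"

# Record an initial commit; --allow-empty permits a commit with no files.
git commit -q --allow-empty -m "initial commit"

# Show the commit history: one commit so far.
git log --oneline
```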
Now let us create a file and make changes to it to understand the working directory.
We have created a file hellofile.py using the echo command. The git status command now shows that one file exists locally that is neither tracked nor committed. This file is said to be in the modified state, that is, in the working directory.
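A minimal sketch of this step, assuming a fresh scratch repository and a placeholder file body (the source does not show what was written to hellofile.py):

```shell
# Fresh scratch repository (assumption: created under a temp directory).
cd "$(mktemp -d)" && git init -q git_reset_sample && cd git_reset_sample

# Create the file; its contents here are a placeholder.
echo "print('hello')" > hellofile.py

# The short status marks untracked files with "??".
git status --short

# The staging index is still empty, so this prints nothing.
git ls-files -s
```

The file exists only in the working directory at this point, which is why git ls-files -s has nothing to list.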
Staging Index (Staged)
The next stage is the staging area or staging index. The staging area holds changes before they are recorded, i.e. it consists of all the modifications that are ready to be committed to the repository. This is like packing together all the items that are to be transported.
The staging index tree tracks the working directory changes that will be saved to the project in the next commit. The git status command summarizes which files differ between the working directory, the staging index, and the last commit, but it hides the internal details of the index. To inspect the staging index directly, we will use the git ls-files -s command. It lists each file in the index along with its staged mode bits, object name, and stage number.
Let us apply the command to the previously created repository.
The file is added to the staging index using the git add file_name command. The git ls-files -s command now shows one file residing in the staging area. The first field is the mode bits, i.e. the file type and permissions. Next is the object name, which is the hash of the file's contents, followed by the stage number 0 and the file name.
Next, we created a second file, file2.txt, and added it to the staging area. The git ls-files -s command now shows two files in the staging area. When we change the contents of file2.txt and stage it again, its object name changes, while the object name of the unmodified hellofile.py remains the same. The object name is the SHA-1 hash of the staged file contents, so it changes whenever the contents change. Git stores these hashes to identify objects and track changes.
The git status command shows files with unstaged changes in red and files staged for the next commit in green. Its output is useful for tracking changes, but it doesn't display the details stored in the staging index.
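The staging steps above can be sketched like this. The file contents and the before/after variable names are illustrative assumptions; the commands themselves are the ones named in the text.

```shell
# Fresh scratch repository (assumption).
cd "$(mktemp -d)" && git init -q git_reset_sample && cd git_reset_sample

echo "print('hello')" > hellofile.py
git add hellofile.py
# One entry: mode bits, object name, stage number, file name.
git ls-files -s

echo "first line" > file2.txt
git add file2.txt
# Two entries now.
git ls-files -s

# Remember the index entry for file2.txt before editing it.
before=$(git ls-files -s file2.txt)

# Editing the file alone does not change the index; staging it again does.
echo "second line" >> file2.txt
git add file2.txt
after=$(git ls-files -s file2.txt)

echo "before: $before"
echo "after:  $after"
```

Comparing the two printed lines shows a different object name for file2.txt, while the entry for hellofile.py is unchanged.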
Commit History (Committed)
Once files and modifications are added to the staging index, they are ready to be committed, i.e. recorded permanently. The final tree of the three Git trees is the commit history. A commit makes the changes part of the project's history and lets the user track all the changes made to the repository.
The git commit -m "message here" command creates a commit in the local repository and saves the changes that were added to the staging index. The git log command returns the commit history of the project.
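As a small sketch of committing and reading the history, assuming a scratch repository, a placeholder identity, and a placeholder commit message:

```shell
# Fresh scratch repository with a placeholder identity (assumptions).
cd "$(mktemp -d)" && git init -q git_reset_sample && cd git_reset_sample
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "print('hello')" > hellofile.py
git add hellofile.py

# Save the staged snapshot to the commit history.
git commit -q -m "add hellofile.py"

# The history now has one commit.
git log --oneline

# Nothing to report: working directory, index, and last commit all agree.
git status --short
```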
How it Works?
The git reset command is similar to the git checkout command. The git checkout command operates solely on the HEAD pointer, whereas the git reset command moves both HEAD and the branch pointer that HEAD refers to, which is MAIN in the example below.
Let us look at the example below:
Here a, b, c, and d are the commits, and HEAD and MAIN pointers point to the last commit i.e. d.
Git Checkout Command
The git checkout command moves the HEAD pointer to the specified commit, in this case commit b. It doesn't affect the MAIN branch pointer shown above. Any new commits made now would start from b. The repository is now in a detached HEAD state.
Git Reset Command
The git reset command shifts both pointers, HEAD and MAIN, to the specified commit. It moves the state of the repository back to that commit, and the commits made after it are no longer part of the branch. This modification always happens in the third tree, the commit history.
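The difference between the two commands can be sketched with four commits a, b, c, and d on a branch named main. The repository name pointer_demo, the file names, and the variable b are hypothetical; --hard is used for the reset so that the files also return to commit b.

```shell
# Scratch repository with a placeholder identity (assumptions).
cd "$(mktemp -d)" && git init -q pointer_demo && cd pointer_demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Create commits a, b, c, d, one file each.
for name in a b c d; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done
git branch -M main            # make sure the branch is called main

b=$(git rev-parse HEAD~2)     # the hash of commit b

# checkout moves only HEAD; main still points at d.
git checkout -q "$b"
git rev-parse --abbrev-ref HEAD    # prints HEAD, i.e. detached
git log -1 --format=%s main        # prints d

# reset moves HEAD together with the branch it is attached to.
git checkout -q main
git reset -q --hard "$b"
git rev-parse --abbrev-ref HEAD    # prints main
git log -1 --format=%s main        # prints b
```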
Three Core Forms of Invocation
The git reset command has three operating modes: --soft, --mixed, and --hard. By default, git reset uses the --mixed option and targets the commit that HEAD points to. The HEAD pointer can be shifted to any other commit using the git checkout command. The git reset command also accepts a commit reference, such as a commit hash, so that only the unwanted commits are undone.
Hard
This option of the git reset command is the most direct, the most dangerous, and a frequently used one. It affects all three trees of Git, i.e. the working directory, the staging index, and the commit history.
All the commits made after the specified commit are removed from the branch history. The staging area and working directory are then reset to match the specified commit. This implies that all pending work in the working directory and staging area is lost: tracked files added after the commit are deleted and modifications to tracked files are discarded. The project is thus brought back to the previous working state.
To demonstrate this, let us take an example. We will use the repository created above. We first create a file named hardcommit.txt. Next, we append text to file2.txt.
The git status output shows the two modifications that we made above.
Now we will commit the hardcommit.txt file by adding it to the staging area and then using the git commit command.
However, the change to file2.txt is neither added to the staging area nor committed.
Let us now check the three stages i.e. git status to check the working directory, git ls-files -s to check the staging area, and git log to check the commit history.
Next, let us see the changes executed by the git reset --hard command.
The git status command previously showed two pending changes, i.e. one in the working directory and the other in the staging area. Both of them are gone from the status.
The git ls-files -s command now shows only two files, i.e. file2.txt and hellofile.py. The other two files are removed. The file tempfile.txt was also in the staging area before the reset.
The git log shows only one commit because the reset command was called on the first commit, so all later commits are removed from the history.
The git reset --hard command removes the changes from all three trees of Git.
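The hard reset walkthrough can be reproduced roughly as follows. This is a sketch, not the exact session from the article: the repository name hard_demo, the file contents, the commit messages, and the variable base are assumptions, and the starting commit here already contains hellofile.py and file2.txt.

```shell
# Scratch repository with a placeholder identity (assumptions).
cd "$(mktemp -d)" && git init -q hard_demo && cd hard_demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Starting state: two tracked files in one commit.
echo "print('hello')" > hellofile.py
echo "first line" > file2.txt
git add hellofile.py file2.txt
git commit -q -m "base commit"
base=$(git rev-parse HEAD)

# New work: one committed file and one uncommitted edit.
echo "hard" > hardcommit.txt
echo "more text" >> file2.txt
git add hardcommit.txt
git commit -q -m "add hardcommit.txt"
git status --short            # file2.txt is modified but not staged

# Reset all three trees to the base commit.
git reset --hard "$base"

git status --short            # nothing left to report
git ls-files -s               # only file2.txt and hellofile.py
git log --oneline             # only the base commit
cat file2.txt                 # the appended text is gone
```

Note that the edit to file2.txt was never committed, so after this reset it cannot be recovered from the repository.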
Mixed
When no option is specified, the reset command operates in this mode. The staging index is reset to the state of the specified commit. The commits made after the specified commit are also removed from the log, i.e. the commit history.
To demonstrate mixed reset, we will take the same example and perform the mixed reset from the reset state itself.
First, we create two new files, mixedcommit.txt and temp.txt. We also append text to file2.txt. Then we add mixedcommit.txt to the staging area and commit it to the repository. Next, we add temp.txt to the staging area. The modifications made to file2.txt are still only in the working directory.
As the git status output shows, the changes that were in the staging area are pushed back to the working directory. The staging index is reset to the specified commit, as seen using git ls-files -s, and the commits after the specified commit are removed from the history, as shown in the git log.
Unlike the --hard option, the --mixed option keeps the changes in the working directory, where they appear as unstaged. Like the --hard option, it resets the staging area and removes from the commit history the commits made after the specified commit.
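A rough sketch of the mixed reset, under the same kind of assumptions as before: the repository name mixed_demo, the file contents, the commit messages, and the variable base are illustrative and not taken from the article.

```shell
# Scratch repository with a placeholder identity (assumptions).
cd "$(mktemp -d)" && git init -q mixed_demo && cd mixed_demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "first line" > file2.txt
git add file2.txt
git commit -q -m "base commit"
base=$(git rev-parse HEAD)

# One committed file, one staged file, one unstaged edit.
echo "mixed" > mixedcommit.txt
echo "scratch" > temp.txt
echo "more text" >> file2.txt
git add mixedcommit.txt
git commit -q -m "add mixedcommit.txt"
git add temp.txt

# Same as: git reset --mixed "$base"
git reset -q "$base"

git status --short     # file2.txt modified; the two new files are untracked
git ls-files -s        # the index holds only file2.txt again
git log --oneline      # only the base commit
```

All three files are still on disk with their latest contents; only the index and the history went back to the base commit.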
Soft
The --soft option of the git reset command resets only the commit history i.e. removes the commits made after the specified commit. It leaves the working directory and the staging area untouched.
To demonstrate the effect of the --soft option let us first apply git reset --hard to clear the previously made changes.
Now, let us again create two new files, namely temp.txt and softcommit.txt. We will also append text to file2.txt. Next, we will add softcommit.txt to the staging area and then commit it to our Git repository. We will now add temp.txt to the staging area. Once again, file2.txt is only edited in the working directory and is neither staged nor committed. The file softcommit.txt is committed and temp.txt is in the staging area.
Next, we perform the git reset --soft command. This command removes the commits after the specified commit, but it doesn't make any changes to the staging area or the working directory.
Note: if we don't mention a commit in the git reset command, by default it acts on the commit that HEAD points to.
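The soft reset can be sketched in the same way. The repository name soft_demo, the file contents, the commit messages, and the variable base are assumptions made for illustration.

```shell
# Scratch repository with a placeholder identity (assumptions).
cd "$(mktemp -d)" && git init -q soft_demo && cd soft_demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "first line" > file2.txt
git add file2.txt
git commit -q -m "base commit"
base=$(git rev-parse HEAD)

# One committed file, one staged file, one unstaged edit.
echo "soft" > softcommit.txt
echo "scratch" > temp.txt
echo "more text" >> file2.txt
git add softcommit.txt
git commit -q -m "add softcommit.txt"
git add temp.txt

# Move only the commit history back to the base commit.
git reset --soft "$base"

git status --short     # softcommit.txt and temp.txt staged; file2.txt unstaged
git ls-files -s        # the index still lists all three files
git log --oneline      # only the base commit
```

Because the index is left untouched, softcommit.txt now shows up as a staged new file: its contents are still in the index, but the commit that recorded it is no longer in the history.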
The image below explains the three options available for the git reset command.
Real-world Scenarios
Let us take the example of a live project in which you are a member of the team. By mistake, you commit a change to the master branch, although you were asked to make this change on another branch named existing. Given below is the visualization of the above scenario:
To solve the issue, we first need the most recent commit to be on the existing branch. Since master currently points to that commit, our task is to copy the commit from the master branch to the existing branch and then reset master so that it points to its previous commit.
This can be done using the steps below:
- Step 1: git checkout existing We will first switch the HEAD pointer to the existing branch using the checkout command.
- Step 2: git cherry-pick master Next, we will copy the most recent commit on the master branch to the existing branch using the git cherry-pick command.
The current repository looks something like this:
- Step 3: git checkout master Now we will move the HEAD pointer back to the master branch.
- Step 4: git reset --hard HEAD~1 Lastly, we will apply the git reset --hard command to move the master branch back to the commit before its tip, which is its previous state. This removes the mistaken commit from master.
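The four steps can be tried end to end in a scratch repository. This is a sketch under stated assumptions: the repository name team_demo, the file names, and the commit messages are hypothetical. HEAD~1 names the commit just before the current tip, which is the state master should return to.

```shell
# Scratch repository with a placeholder identity (assumptions).
cd "$(mktemp -d)" && git init -q team_demo && cd team_demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Shared starting point for both branches.
echo "base" > app.txt
git add app.txt
git commit -q -m "base"
git branch -M master
git branch existing

# The mistake: a commit on master that was meant for existing.
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "change meant for existing"

# Step 1: switch to the branch that should hold the change.
git checkout -q existing
# Step 2: copy the commit at the tip of master onto it.
git cherry-pick master
# Step 3: go back to master.
git checkout -q master
# Step 4: drop the mistaken commit from master.
git reset --hard HEAD~1

git log --oneline master      # only the base commit
git log --oneline existing    # base commit plus the copied change
```

The order matters: the commit is copied to existing before master is reset, so the work is never left without a branch pointing at it.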
The above example demonstrates how we can use git commands wisely to solve complex problems without losing important data.
Conclusion
- The git reset command is a complex command used to undo changes in a Git repository. It is risky and may lead to data loss if not used carefully.
- Every change in Git passes through three stages. The working directory is where the user first makes changes.
- Next, these changes are added to the staging index, which holds the changes that will go into the next commit. The commit history records the changes permanently and stores the history of the repository.
- Git resetting has three core forms of invocation. These are hard, mixed, and soft.
- The --hard option resets the working directory as well as the staging area to match the specified commit. It also removes from the commit history all the commits made after the specified commit or HEAD.
- The --mixed option, which is the default, resets the staging area, leaves the affected changes in the working directory, and also changes the commit history.
- The --soft option only changes the commit history. It doesn't touch the staging area or the working directory.