Exploring Git: How It Works Under the Hood

Aviral Asthana

Concept of Git

Git is built on the concept of snapshots: rather than storing the differences between versions, it stores a snapshot of each file as it is at that moment. Each time you commit a change, Git captures and stores a snapshot of your files at that point.

If a file has not changed, Git stores a link to the previous, identical file instead of a new snapshot.
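The reason an unchanged file can be reused is that Git identifies a file's content by a hash of that content. The sketch below reproduces how a blob ID is computed, assuming a repository that uses the default SHA-1 object format; the function name blob_id is only illustrative.

```python
import hashlib

def blob_id(content: bytes) -> str:
    """Return the SHA-1 ID Git would give a blob with this content."""
    # A blob is hashed as: "blob <size in bytes>\0" followed by the content.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()

first_commit = blob_id(b"hello\n")
second_commit = blob_id(b"hello\n")      # file left untouched
after_edit = blob_id(b"hello world\n")   # file edited

print(first_commit == second_commit)  # True
print(first_commit == after_edit)     # False
```

Because identical content always yields the same ID, a later snapshot can simply refer to the object that is already stored.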

Three main areas

  1. Working Directory: This is where your files live and where you make changes to them.

  2. Staging Area: Before changes are committed, the files are staged here. Staging marks which changes will go into the next commit.

  3. Repository (.git directory): Located in the root directory of your project, this is where Git stores your project's commit history, branches and tags.

Git Objects

  • Blob (Binary Large Object): A blob stores a file's data, that is, the content of the file, but not its metadata such as the file name.

  • Trees: A tree is a Git object that stores information about a directory. It points to blobs and to other trees, and stores each entry's metadata (its name and file mode).

  • Commits: A commit object points to a tree object, which captures the state of the directory at a certain point in time, and to its parent commit or commits (the previous commit). It also holds metadata such as the author, the committer and the commit message.

  • Tags: Tags are pointers to a specific commit, often used to mark release points.
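To make the relationships between these objects concrete, here is a minimal sketch of how a tree and a commit can be serialized and hashed. It assumes SHA-1 object IDs and a directory containing only regular files (real Git has extra sorting rules for subdirectories); the helper names and the author line are hypothetical.

```python
import hashlib

def object_id(kind: str, body: bytes) -> str:
    """Hash an object the way Git does: '<kind> <size>\\0' + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    """entries: list of (mode, name, hex object id) tuples, files only."""
    out = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        # Each entry holds the mode, the name and the raw 20-byte ID.
        out += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(oid)
    return out

def commit_body(tree, parents, author, message):
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {author}", "", message]
    return ("\n".join(lines) + "\n").encode()

# Hypothetical identity: name, email, timestamp, timezone.
AUTHOR = "A U Thor <author@example.com> 0 +0000"

readme = object_id("blob", b"hello\n")
tree = object_id("tree", tree_body([("100644", "README", readme)]))
first = object_id("commit", commit_body(tree, [], AUTHOR, "initial commit"))
second = object_id("commit", commit_body(tree, [first], AUTHOR, "second commit"))

print("blob  ", readme)
print("tree  ", tree)
print("commit", first)
print("commit", second)
```

Note that the file name README lives in the tree entry, not in the blob, and that the second commit reuses the same tree while pointing back to the first commit as its parent.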

Commit Graph

Git stores the history of commits as a Directed Acyclic Graph (DAG). Each commit points back to its parent commit or commits: the first commit has none, an ordinary commit has one and a merge commit has two or more. This structure allows Git to manage branches and merges efficiently.
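A small sketch of such a graph, using a made-up history with single-letter commit names, shows how following the parent links from any commit recovers its full history:

```python
from collections import deque

# Hypothetical history: commit name -> list of parent names.
parents = {
    "A": [],          # root commit, no parent
    "B": ["A"],
    "C": ["B"],       # continues the main line
    "D": ["B"],       # made on a feature branch
    "M": ["C", "D"],  # merge commit with two parents
}

def ancestors(commit, parents):
    """Return the commit itself plus everything reachable via parent links."""
    seen = set()
    queue = deque([commit])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents[current])
    return seen

print(sorted(ancestors("M", parents)))  # ['A', 'B', 'C', 'D', 'M']
print(sorted(ancestors("D", parents)))  # ['A', 'B', 'D']
```

Links only ever point backwards to older commits, which is why the graph can never contain a cycle.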

How Git stores data?

When you change a file, stage it in the staging area and then commit the change, the following happens:

  • Creates a blob object: Holds the content of the file. Git writes the blob when the file is staged.

  • Creates a tree object: Stores the directory structure and metadata, and points to the blob objects representing the files.

  • Creates a commit object: Points to the tree object, which represents the state of the directory at that point in time.
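Each of these objects is initially written as a compressed file under .git/objects, named after its hash (Git may later combine them into pack files). The following is a rough sketch of that "loose object" layout, assuming SHA-1 IDs and using a temporary directory in place of a real .git/objects; write_object and read_object are illustrative names, not Git commands.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path

def write_object(objects_dir: Path, kind: str, body: bytes) -> str:
    """Store an object under <first 2 hex chars>/<remaining 38> and return its ID."""
    data = f"{kind} {len(body)}".encode() + b"\0" + body
    oid = hashlib.sha1(data).hexdigest()
    path = objects_dir / oid[:2] / oid[2:]
    if not path.exists():  # identical content is only stored once
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(data))
    return oid

def read_object(objects_dir: Path, oid: str):
    """Load an object by ID and return (kind, body)."""
    data = zlib.decompress((objects_dir / oid[:2] / oid[2:]).read_bytes())
    header, _, body = data.partition(b"\0")
    kind, size = header.decode().split(" ")
    if int(size) != len(body):
        raise ValueError("object size does not match its header")
    return kind, body

with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp)  # stands in for .git/objects
    oid = write_object(objects, "blob", b"hello\n")
    again = write_object(objects, "blob", b"hello\n")
    print(oid == again)               # True
    print(read_object(objects, oid))  # ('blob', b'hello\n')
```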

Branching and Merging

Branches in Git are simply pointers to specific commits. When you create a new branch, Git creates a new pointer to the commit you are currently on; no files are copied.

Merging involves combining changes from two branches.

Git merges two branches in one of two ways:

  1. Fast-Forward Merge: If the current branch is an ancestor of the branch being merged in, Git simply moves the branch pointer forward. For example, if master has not diverged from the feature branch, then instead of making a new commit, master will just point to the latest commit of the feature branch.

  2. Three-Way Merge: If the branches have diverged, Git performs a three-way merge, creating a new commit with two parents.
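The choice between the two kinds of merge can be sketched as a check on the commit graph. This toy model treats branches as a dictionary of pointers and only decides which kind of merge applies; it does not combine any file contents, and the commit names and the merge function are invented for illustration.

```python
def ancestors(commit, parents):
    """Return the commit plus everything reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def merge(branches, parents, target, source):
    """Merge `source` into `target` and report which kind of merge happened."""
    ours, theirs = branches[target], branches[source]
    if theirs in ancestors(ours, parents):
        return "already up to date"
    if ours in ancestors(theirs, parents):
        branches[target] = theirs            # just move the pointer forward
        return "fast-forward"
    merge_commit = f"merge({ours},{theirs})"
    parents[merge_commit] = [ours, theirs]   # new commit with two parents
    branches[target] = merge_commit
    return "three-way merge"

# Case 1: master has not diverged, feature is simply ahead of it.
parents = {"A": [], "B": ["A"], "C": ["B"]}
branches = {"master": "B", "feature": "C"}
print(merge(branches, parents, "master", "feature"))  # fast-forward
print(branches["master"])                             # C

# Case 2: both branches gained a commit after B.
parents = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
branches = {"master": "D", "feature": "C"}
print(merge(branches, parents, "master", "feature"))  # three-way merge
print(parents[branches["master"]])                    # ['D', 'C']
```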

Garbage Collection

Git periodically runs a garbage collection process to clean up unreferenced objects, which can be left behind by operations like rebasing, resetting or deleting branches. These objects remain in the repository until garbage collection removes them.
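Finding unreferenced objects amounts to a reachability check: start from every branch and tag, follow the links between objects, and whatever was never visited is a candidate for removal. The sketch below uses made-up object names and ignores the additional rules real Git applies (for example the reflog and a grace period before deletion).

```python
def reachable(refs, links):
    """Return every object reachable from any branch or tag pointer."""
    seen = set()
    stack = list(refs.values())
    while stack:
        oid = stack.pop()
        if oid not in seen:
            seen.add(oid)
            stack.extend(links[oid])
    return seen

# Hypothetical object store: object -> objects it points to.
links = {
    "commit1": ["tree1"],
    "commit2": ["tree2", "commit1"],
    "commit3": ["tree3", "commit1"],  # left behind after a reset
    "tree1": ["blob1"],
    "tree2": ["blob1", "blob2"],
    "tree3": ["blob3"],
    "blob1": [],
    "blob2": [],
    "blob3": [],
}
refs = {"master": "commit2"}

unreferenced = set(links) - reachable(refs, links)
print(sorted(unreferenced))  # ['blob3', 'commit3', 'tree3']
```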

Thank you for reading. Keep reading. Keep learning.
