Gotcha, Git!

Rohith K
7 min read

We've all pulled our hair out at some point over Git's confusing intricacies. One such scenario is the dreaded message "Your branch and X branch have diverged ....". You can neither pull nor push cleanly, and ARGH, it's frustrating!

Why does it happen? Git treats commits and their histories as immutable: a commit is never edited in place, it can only be added or replaced by a new one. When multiple collaborators push updates to a repo, and some of them alter branches and their histories, it is natural for a shared branch to diverge in each collaborator's environment.

Diverging branches are most often brought back together with a 3-way merge. Let's take a look at what it is. Say there is a repository with "main" as its default branch. You check out a new "feature" branch from main and start pushing updates to your feature branch.

In the illustration above, each letter represents a commit. The feature branch was checked out from commit D, to which HEAD currently points. Say updates are published to the main branch while you're still developing your feature branch. Now the commit tree would look something like this:

The branches have diverged. Your next step would be to pull in the updates published to the main branch into your feature branch. How can that be done?
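If you'd like to poke at this layout yourself, here is a minimal sketch that rebuilds it in a throwaway repo. The temp directory, the placeholder identity, and the `commit` helper (one file per commit letter) are my assumptions for illustration, not part of any real project.

```shell
#!/bin/sh
# Sketch: reproduce the diverged main/feature layout in a throwaway repo.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version

# Hypothetical helper: one commit per letter, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

for c in A B C D; do commit "$c"; done   # main:    A-B-C-D
git checkout -q -b feature
for c in E F G; do commit "$c"; done     # feature: D-E-F-G
git checkout -q main
for c in H I; do commit "$c"; done       # main moves on: D-H-I

git --no-pager log --oneline --graph --all   # draws the fork at D

fork_point="$(git merge-base main feature)"              # the common ancestor
only_on_main="$(git rev-list --count feature..main)"     # H and I
only_on_feature="$(git rev-list --count main..feature)"  # E, F and G
echo "fork=$(git log -1 --format=%s "$fork_point") main-only=$only_on_main feature-only=$only_on_feature"
# prints: fork=D main-only=2 feature-only=3
```

`git merge-base` is a handy way to ask Git where two branches last agreed.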

Git Merge to the rescue! Say you're looking to publish your changes to the main branch now. Since "main" and "feature" have diverged, you have to merge the two branches to bring their histories back together. Note that we started this discussion by trying to understand a 3-way merge. In the example above, D is the point of divergence, or the common ancestor of both branches, if you will. HEAD points to I. As far as Git is concerned, our objective is to merge I and G using D as the base. Three commits are involved, hence a 3-way merge.

J is the HEAD now, after the merge. The illustration above depicts the scenario where you merge the feature branch into the main branch. Notice that the commit histories are untouched and the merge operation is creating a new commit here (so you would be prompted to write a commit message).
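Here is a rough sketch of that merge in a scratch repo, assuming the same letters as the illustration (feature holds E, F, G; main holds H, I). The `commit` helper and the file names are made up for the example.

```shell
#!/bin/sh
# Sketch: a 3-way merge of feature into main produces a commit with two parents.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: one commit per letter, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

for c in A B C D; do commit "$c"; done
git checkout -q -b feature
for c in E F G; do commit "$c"; done
git checkout -q main
for c in H I; do commit "$c"; done

main_before="$(git rev-parse main)"        # I
feature_before="$(git rev-parse feature)"  # G

# Inputs to the merge: D (base), I (ours), G (theirs). -m avoids the editor prompt.
git merge -q -m "J: merge feature into main" feature

git --no-pager log --oneline --graph
parents="$(git log -1 --format=%P HEAD)"   # two hashes: I and G
echo "parents of J: $parents"
```

The existing commits keep their hashes; only the new merge commit J is added on top.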

You know, you can also merge the main branch into your feature branch and perhaps continue development there. This operation, however, would not affect the main branch: main would still point to the same commit after the merge. If you do this and finally merge your feature branch into the main branch, your git tree will look something like this:

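I^ is the HEAD after the first merge into the feature branch. J* is the first merge commit. You push a couple of updates, commits K and L, then check out the main branch and merge the feature branch, creating the merge commit M**, to which HEAD finally points. (If main has not moved since the first merge, Git fast-forwards by default and there is no M**. You get the merge commit with $ git merge --no-ff, or when main has picked up new commits of its own in the meantime.)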
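A small sketch of this round trip, under the assumption that main receives no new commits after the first merge. In that case main is an ancestor of feature, so the sketch passes `--no-ff` to get the final merge commit. The `commit` helper is hypothetical.

```shell
#!/bin/sh
# Sketch: merge main into feature, keep working, then merge feature back into main.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: one commit per letter, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

for c in A B C D; do commit "$c"; done
git checkout -q -b feature
for c in E F G; do commit "$c"; done
git checkout -q main
for c in H I; do commit "$c"; done
main_before="$(git rev-parse main)"

git checkout -q feature
git merge -q -m "J*: merge main into feature" main
main_after_first_merge="$(git rev-parse main)"   # main did not move
for c in K L; do commit "$c"; done

# main is now an ancestor of feature, so a plain merge would only fast-forward.
if git merge-base --is-ancestor main feature; then can_ff=yes; else can_ff=no; fi

git checkout -q main
git merge -q --no-ff -m "M**: merge feature into main" feature   # force a merge commit
git --no-pager log --oneline --graph
```

Dropping `--no-ff` here would simply move main up to L without creating M**.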

Conflicts

It is important to note that a merge often does not go through cleanly once you execute $ git merge [branch]. When working on a shared repo/branch, conflicts arise. A conflict occurs when git by itself is unable to decide how to merge a particular part of a file that was modified by more than one collaborator. (You can enforce an algorithm/strategy for git to follow. More about this here: merge strategies)

How can these pesky conflicts be resolved? Use the editor of your choice! But where are these conflicts? Run the $ git status command to see the files listed as unmerged; like untracked files, they are shown in red. Ladies and gentlemen, we got him! Open these files in an editor (I use vim :smug:) and you will see the conflict markers <<<<<<<, =======, and >>>>>>>. The part between <<<<<<< and ======= is the version from the branch you are on, and the part between ======= and >>>>>>> comes from the branch being merged in. Make the changes you desire, remove the markers, stage the file with $ git add, and commit. That commit concludes the merge, and you can carry on with a happy face :)

If you don't want to go ahead with a merge when you are faced with conflicts, you can simply abort it using $ git merge --abort.
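To see both exits from a conflict, here is a minimal sketch in a scratch repo. The file name greeting.txt and its contents are invented for the example; the first attempt is aborted, the second one is resolved by hand.

```shell
#!/bin/sh
# Sketch: provoke a conflict, abort once, then resolve it on the second try.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "base"

git checkout -q -b feature
echo "hello from feature" > greeting.txt
git commit -q -am "feature edit"

git checkout -q main
echo "hello from main" > greeting.txt
git commit -q -am "main edit"

# Attempt 1: both branches changed the same line, so the merge stops.
git merge feature || echo "merge stopped on a conflict"
git status --short                                     # the file is flagged as unmerged
conflicted="$(git diff --name-only --diff-filter=U)"   # list of unmerged files
markers="$(grep -c '^<<<<<<< ' greeting.txt)"          # count of opening conflict markers
git merge --abort                                      # back to the pre-merge state
after_abort="$(cat greeting.txt)"

# Attempt 2: resolve by hand, stage, and commit to conclude the merge.
git merge feature || true
echo "hello from main and feature" > greeting.txt      # the resolved content
git add greeting.txt
git commit -q --no-edit                                # keeps the prepared merge message
git --no-pager log --oneline --graph
```

Until you run `git add` on a resolved file, Git keeps treating it as unmerged and will refuse to commit.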

Enter Rebase

We've seen what the merge operation is about. We've also noted how the histories aren't tampered with during merges. Now, let's take a look at the Rebase operation. Rebasing is essentially moving the commits off of a development branch and arranging them sequentially over the main branch's HEAD.

Ok, let's circle back. What does rebase mean? Let's consider the same example we did at the very first - where our feature branch diverges from the main branch.

Now what? Let's rebase it. Make sure you're on the feature branch: $ git checkout feature. Lo and behold: $ git rebase main. What happens now? The commits E, F, and G are moved to a temporary holding area (called the "staging area" here; it is unrelated to the index that $ git add writes to), from which they are replayed one by one on top of main's tip. Let me visualize the steps for you:

Voila! A lot happened here. (For brevity, I've collapsed several steps into a single one in step2.) But what happened? Notice that main points to J both before and after the rebase. In step1, the feature branch's commits since the common ancestor were taken to the staging area. Git is essentially creating copies of the original commits, so their commit hashes are going to be different. In step2, the copies from the staging area are replayed in the same order on top of main's tip. But main still points to J, because a rebase re-anchors the feature branch's commits onto main without merging them into main. It is you who now has to go ahead and merge the copies into the main branch. This is not a 3-way merge: after the replay there is no divergence between the branches, so it is a fast-forward merge, and there is no merge commit! (You won't be prompted to enter a commit message in vim, yay!)
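The two steps can be sketched in a scratch repo like this. I'm assuming the layout from the very first example (feature holds E, F, G and main's tip is I); the `commit` helper is hypothetical.

```shell
#!/bin/sh
# Sketch: rebase feature onto main, then fast-forward main to the rebased tip.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main

# Hypothetical helper: one commit per letter, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

for c in A B C D; do commit "$c"; done
git checkout -q -b feature
for c in E F G; do commit "$c"; done
git checkout -q main
for c in H I; do commit "$c"; done

old_tip="$(git rev-parse feature)"   # original G
main_tip="$(git rev-parse main)"     # I

git checkout -q feature
git rebase -q main                   # replays E, F, G as new commits on top of I

new_tip="$(git rev-parse feature)"          # the copy of G, with a new hash
main_after_rebase="$(git rev-parse main)"   # main itself has not moved

git checkout -q main
git merge -q --ff-only feature       # just moves main forward, no merge commit
git --no-pager log --oneline --graph # one straight line of commits
```

`--ff-only` is a safety net here: if the branches had still diverged, the merge would refuse to run instead of creating a merge commit.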

A very important thing to note here is that the commit history has changed, because of the copies and their brand new hashes. A simple scenario leading to pitfalls is when you are working with a collaborator on a feature branch. In the meantime, the main branch has been updated. Noticing this, you've rebased your feature branch, but the other collaborator working on the same feature branch hasn't. Your collaborator and you end up with different commit histories in the feature branch despite having the same changes. Git sees these as different commits, and you would need to reconcile them later on. This is a simple example, but rebasing could lead to numerous such pitfalls.

BE VERY CAREFUL WHEN REBASING BRANCHES IN A SHARED ENVIRONMENT.
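Here is a minimal sketch of that pitfall, with a local bare repo standing in for the shared remote. The directory names (origin.git, you) and the `commit` helper are assumptions for the example. The already-pushed feature branch plays the role of your collaborator's copy.

```shell
#!/bin/sh
# Sketch: rebasing a branch that was already pushed makes it diverge from the remote copy.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
git init -q --bare "$work/origin.git"   # stands in for the shared remote
git init -q "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$work/origin.git"

# Hypothetical helper: one commit per letter, each touching its own file.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

for c in A B C D; do commit "$c"; done
git push -q -u origin main
git checkout -q -b feature
for c in E F; do commit "$c"; done
git push -q -u origin feature           # your collaborator can now fetch E and F
git checkout -q main
commit H
git push -q origin main

git checkout -q feature
git rebase -q main                      # E and F are copied on top of H

git status -sb | head -n 1              # something like: ## feature...origin/feature [ahead 3, behind 2]
ahead="$(git rev-list --count origin/feature..feature)"    # H plus the two copies
behind="$(git rev-list --count feature..origin/feature)"   # the original E and F
# git cherry marks with "-" the local commits whose patch already exists upstream.
same_patch="$(git cherry origin/feature feature | grep -c '^-')"

# A plain push is rejected because the remote branch is not an ancestor anymore.
if git push -q origin feature 2>/dev/null; then pushed=yes; else pushed=no; fi
echo "ahead=$ahead behind=$behind same_patch=$same_patch pushed=$pushed"
```

Same changes, different hashes: that is exactly the "diverged" state from the start of the article.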

Merge and rebase aren't two different teams solving different problems; they are one team.

Pulling off a pull

A pull operation is fetch + merge by default.

Say you have new changes in your local branch, and there are changes in the remote branch that you would like to have locally as well. You can pull them. This is a no-fluff, quick, and compact operation. Let's take a look at 3 popular types of pulls.

  1. $ git pull --ff

    This operation fetches the remote changes and merges them into your local branch. If your local branch has no new commits of its own, there will be no merge commit and it will be a fast-forward merge. Otherwise the branches have diverged and the merge creates a merge commit. (Ready for the commit message prompt?)

  2. $ git pull --rebase

    This operation fetches the remote changes and replays your local commits on top of them. So your git history would look like you've worked on the fetched changes first and then began making your local changes.

  3. $ git pull --ff-only

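    This fetches the changes and fast-forwards your branch only when that is possible, that is, when you have no new local commits. If the branches have diverged, it refuses to merge, leaves your branch untouched, and leaves it to you to decide what to do next.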
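The three pulls can be compared side by side in a scratch setup. This is only a sketch: the bare repo origin.git, the two working copies (you, mate), and the `commit_in` helper are all made up for the example. I pass `--no-rebase` and `--no-edit` alongside `--ff` so the outcome does not depend on local config or open an editor.

```shell
#!/bin/sh
# Sketch: --ff-only, --rebase and --ff on a branch that diverged from its remote.
set -eu
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"
git init -q --bare "$work/origin.git"
git init -q "$work/you"
git -C "$work/you" symbolic-ref HEAD refs/heads/main
git -C "$work/you" remote add origin "$work/origin.git"

# Hypothetical helper: commit_in <repo> <letter> creates one commit with its own file.
commit_in() { echo "$2" > "$1/$2.txt"; git -C "$1" add "$2.txt"; git -C "$1" commit -q -m "$2"; }

commit_in "$work/you" A
git -C "$work/you" push -q -u origin main
git clone -q -b main "$work/origin.git" "$work/mate"   # a teammate's clone

# Round 1: the teammate pushes B while you commit C locally, so the branches diverge.
commit_in "$work/mate" B
git -C "$work/mate" push -q origin main
commit_in "$work/you" C
cd "$work/you"
before="$(git rev-parse HEAD)"

if git pull -q --ff-only 2>/dev/null; then ff_only=merged; else ff_only=refused; fi
after_ff_only="$(git rev-parse HEAD)"                  # your branch is untouched

git pull -q --rebase                                   # C is replayed on top of B
merges_after_rebase="$(git rev-list --count --merges HEAD)"
order_after_rebase="$(git log --format=%s --reverse | tr -d '\n')"

# Round 2: diverge again (teammate pushes D, you commit E), then pull with a merge.
commit_in "$work/mate" D
git -C "$work/mate" push -q origin main
commit_in "$work/you" E
git pull -q --no-rebase --ff --no-edit                 # cannot fast-forward, so it merges
merges_after_ff="$(git rev-list --count --merges HEAD)"

echo "ff-only=$ff_only rebase-order=$order_after_rebase merge-commits=$merges_after_ff"
git --no-pager log --oneline --graph
```

Which one you pick is mostly a matter of how you want the history to read: `--rebase` keeps it linear, `--ff` records the merge, and `--ff-only` makes you decide explicitly.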

Congrats on making it through to the end of the article! I hope you'll feel more confident now when dealing with pesky git conflicts :)

Written by

Rohith K

Currently pursuing masters in comp sci at the UIUC. Previously worked as a software engineer in Bengaluru. Experienced with building event driven systems at scale.