Mastering Your Git Workflow: Syncing Your Fork with upstream

shubhangi singh

Git is a key tool for collaborative software development. While most developers are comfortable with pushing and pulling from a single origin remote, understanding how to work with two remote repositories, particularly by leveraging the upstream remote, unlocks a more robust and flexible workflow. This is especially true when dealing with project forks.

This article explains the upstream concept and walks through a common scenario: keeping a disconnected fork in sync with a new primary project repository.

The Dual-Remote Paradigm: Why upstream Matters

When you contribute to an open-source project or work with a team that uses a fork-based workflow, you typically start by creating your own fork of the main project repository. This fork resides in your personal account (e.g., on GitHub), providing you with a sandbox to make changes without directly affecting the original codebase. When you clone your fork to your local machine, your local repository's default remote, origin, points to your personal fork.

The challenge arises when the original project repository is somehow discontinued or changed. Your personal fork then becomes a "disconnected fork," as it no longer has a direct lineage to the active main project. If you have made changes in this disconnected fork, you'll want to integrate them into a new project repository. This is where the upstream remote becomes invaluable.

Step-by-Step: Syncing Your Disconnected Fork with a New origin

Let's walk through a practical example of how to set up and manage your local repository with both origin (your new fork) and upstream (the disconnected fork containing your past work).

Scenario: You had forked an original project, which has since been discontinued or deleted. You have a local copy of this disconnected fork containing your changes. Now, you need to establish a connection to a new project repository (which you will fork and designate as your new origin) and then bring your changes from the disconnected fork into this new origin via upstream.

1. Forking the New Repo (Conceptual Step)

Before proceeding, you would have created a new fork of the new primary project repository (e.g., dummy-original-user/dummy-neworiginal-project.git) to your own account (e.g., dummy-your-username/dummy-your-new-fork.git). This step is typically performed directly on the hosting platform (GitHub, GitLab, etc.).

2. Disconnected Fork: Setting Your origin

You have a local copy of your disconnected fork. Its origin is likely pointing to the old, now-irrelevant remote. You need to update this origin to point to your new fork of the active project. This is where you will push your integrated changes.

Bash

git remote set-url origin git@dummy-github.com:dummy-your-username/dummy-your-new-fork.git

Explanation: This command updates the URL for your origin remote. It's crucial that origin now points to your personal copy of the newly forked repository.
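To see the effect without touching a real project, here is a minimal sketch that runs in a throwaway repository. The old URL (dummy-old-fork.git) is a hypothetical stand-in for the discontinued remote; the new URL is the placeholder used above. Nothing is contacted over the network, since set-url only edits the repository's configuration.

```shell
set -eu
# Throwaway repository so nothing real is modified.
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"

# Hypothetical old URL standing in for the discontinued remote.
git remote add origin git@dummy-github.com:dummy-your-username/dummy-old-fork.git
old_url=$(git remote get-url origin)

# Repoint origin at the new fork (placeholder URL from the step above).
git remote set-url origin git@dummy-github.com:dummy-your-username/dummy-your-new-fork.git
new_url=$(git remote get-url origin)

echo "before: $old_url"
echo "after:  $new_url"
```

Checking the URL with git remote get-url before and after is a cheap way to confirm that later pushes will go where you expect.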

3. Adding the upstream Remote

Now, let's add your disconnected fork repository (the one with your existing changes) as your upstream remote. This is the source from which you'll pull your past work.

Bash

git remote add upstream git@dummy-github.com:dummy-original-user/dummy-disconnected-fork-project.git

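Explanation: This command creates a new remote named upstream and associates it with the URL of your disconnected fork repository. By convention, upstream usually names the original project a fork was created from; in this scenario it names the disconnected fork instead, because that is the source your past changes are pulled from.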

To verify that both origin and upstream are correctly configured, run:

Bash

git remote -v

You should see output similar to this:

origin  git@dummy-github.com:dummy-your-username/dummy-your-new-fork.git (fetch)
origin  git@dummy-github.com:dummy-your-username/dummy-your-new-fork.git (push)
upstream        git@dummy-github.com:dummy-original-user/dummy-disconnected-fork-project.git (fetch)
upstream        git@dummy-github.com:dummy-original-user/dummy-disconnected-fork-project.git (push)
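Running git remote add a second time for the same name fails because the remote already exists. If you expect to repeat the setup (for example in a script), a small wrapper can make it repeatable. The sketch below assumes a helper named ensure_remote, which is hypothetical and not a Git command, and reuses the placeholder URLs from this article.

```shell
set -eu
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"

# ensure_remote NAME URL: hypothetical helper, not part of Git.
# Adds the remote if it is missing, otherwise updates its URL.
ensure_remote() {
  if git remote get-url "$1" >/dev/null 2>&1; then
    git remote set-url "$1" "$2"
  else
    git remote add "$1" "$2"
  fi
}

ensure_remote origin   git@dummy-github.com:dummy-your-username/dummy-your-new-fork.git
ensure_remote upstream git@dummy-github.com:dummy-original-user/dummy-disconnected-fork-project.git
# Calling it again is harmless, unlike a second `git remote add`.
ensure_remote upstream git@dummy-github.com:dummy-original-user/dummy-disconnected-fork-project.git

remotes=$(git remote | sort | tr '\n' ' ')
upstream_url=$(git remote get-url upstream)
echo "configured remotes: $remotes"
```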

4. Pulling Changes from upstream

With upstream configured, you can now fetch and integrate the latest changes from your disconnected fork (your past work) into your current local repository.

Bash

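git fetch upstream
git pull upstream main

Explanation: git fetch upstream downloads every branch from the upstream remote (your disconnected fork) and records them as remote-tracking branches such as upstream/main. git pull upstream main then merges upstream's main branch into your current local branch, which brings your existing changes into your newly linked local repository. Name the branch explicitly: a bare git pull upstream fetches but then stops without merging, because upstream is not the remote your current branch is configured to pull from. Replace main if the branch holding your past work has a different name.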
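The following sandbox sketch shows a pull from a remote named upstream with the branch named explicitly (git pull upstream main). Everything in it is an assumption made for illustration: the directory names old and local, the demo identity, and the commit messages. A local directory plays the role of the disconnected fork, so no network access is needed.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# "old" plays the disconnected fork.
git init -q "$sandbox/old"
cd "$sandbox/old"
git symbolic-ref HEAD refs/heads/main
echo "first" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# "local" starts as a copy of it, with the remote named upstream.
git clone -q -o upstream "$sandbox/old" "$sandbox/local"

# More past work lands in the old fork after the copy was made.
echo "second" >> notes.txt
git commit -q -am "Past work added later"

cd "$sandbox/local"
git fetch -q upstream        # refreshes the upstream/* remote-tracking branches
git pull -q upstream main    # merges upstream's main into the current branch

merged=$(git log -1 --format=%s)
echo "HEAD is now: $merged"
```

In this sandbox the pull is a simple fast-forward. When both sides have new commits, recent Git versions ask you to choose how to reconcile them, for example git pull --no-rebase upstream main to merge or git pull --rebase upstream main to rebase.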

5. Viewing upstream Branches

After pulling, you might want to inspect the branches available in your upstream (disconnected) repository. This helps you identify specific branches where you previously worked.

Bash

git branch -a

Explanation: This command lists all local branches and all remote-tracking branches. Remote-tracking branches for upstream will be prefixed with remotes/upstream/, showing you what branches existed in your disconnected fork.

You might see entries like:

  dummy-my-feature-branch
* main
  remotes/origin/main
  remotes/upstream/bug-414331107
  remotes/upstream/bug-414534957
  remotes/upstream/bug-abort-call-composer-listing
  remotes/upstream/bug-abort-import-api-composer-listing
  remotes/upstream/bug-accelerator-type-vertex-create-form
  remotes/upstream/main
  remotes/upstream/feature-new-dashboard
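When a repository has many remotes, the full git branch -a listing gets noisy. One option is to ask Git for only the remote-tracking branches under upstream. The sketch below builds a sandbox (the directory names, demo identity, and the two sample branch names taken from the listing above are illustrative assumptions) and uses git for-each-ref, which is intended for scripted use.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# "old" plays the disconnected fork, with a few branches of past work.
git init -q "$sandbox/old"
cd "$sandbox/old"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "Initial commit"
git branch feature-new-dashboard
git branch bug-414331107

# "local" is the working repository with the old fork added as upstream.
git init -q "$sandbox/local"
cd "$sandbox/local"
git remote add upstream "$sandbox/old"
git fetch -q upstream

# Print only the remote-tracking branches that came from upstream.
upstream_branches=$(git for-each-ref --format='%(refname:short)' refs/remotes/upstream)
echo "$upstream_branches"
```

For interactive use, git branch -r --list 'upstream/*' gives a similar view.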

6. Integrating Specific upstream Branches into Your Workflow

Let's say you want to bring a specific branch from your disconnected fork (now upstream) into your workflow, perhaps to prepare it for the new project.

Step 1: Checkout to Your Main Branch (or a relevant base branch)

Switch to your main branch first so that you start from a known state before creating new branches from upstream changes. Note that git checkout only switches branches; it does not update main.

Bash

git checkout main

Step 2: Create a New Local Branch Tracking an upstream Branch

Now, create a new local branch that is based on and tracks a specific upstream branch from your disconnected fork. This example uses a hypothetical release branch.

Bash

git checkout -b dummy-project-release-v2.0 upstream/dummy-project-release-v2.0

Explanation: This command creates a new local branch named dummy-project-release-v2.0 and configures it to track the dummy-project-release-v2.0 branch from the upstream remote. This means your new local branch will immediately contain all the changes from that upstream branch.
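To confirm that the new branch really tracks the upstream branch, you can ask Git what @{upstream} resolves to. The sandbox below is a sketch under assumptions: the directories old and local and the demo identity are made up, and tracking is set up automatically only with Git's default branch.autoSetupMerge behavior.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# "old" plays the disconnected fork and holds the release branch.
git init -q "$sandbox/old"
cd "$sandbox/old"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "Initial commit"
git branch dummy-project-release-v2.0

git init -q "$sandbox/local"
cd "$sandbox/local"
git remote add upstream "$sandbox/old"
git fetch -q upstream

# Same command as in the step above.
git checkout -q -b dummy-project-release-v2.0 upstream/dummy-project-release-v2.0

# @{upstream} resolves to the branch that the current branch tracks.
tracked=$(git rev-parse --abbrev-ref '@{upstream}')
echo "tracking: $tracked"
```

git branch -vv shows the same information for every local branch at once.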

When you've made local commits on your branch and then pull changes from upstream, using rebase instead of a simple merge can lead to a cleaner, linear commit history. Rebasing reapplies your local commits on top of the latest upstream changes.

Bash

git rebase upstream/dummy-project-release-v2.0

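Explanation: This command takes your current branch's commits (if any are ahead of upstream/dummy-project-release-v2.0) and reapplies them on top of the latest commit from upstream/dummy-project-release-v2.0. Because upstream/dummy-project-release-v2.0 only reflects the remote as of your last fetch, run git fetch upstream first. The result is a linear history that is easier to follow. If a commit conflicts during the rebase, Git pauses so you can resolve it: fix the files, stage them with git add, and run git rebase --continue (or git rebase --abort to return to where you started).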
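The difference between a rebase and a merge is easiest to see in a small sandbox. In the sketch below, the local branch and upstream each gain one commit, and the rebase replays the local commit on top of upstream's. The directory names, demo identity, file names, and commit messages are illustrative assumptions, and main is used in place of the release branch to keep the setup short.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

git init -q "$sandbox/old"
cd "$sandbox/old"
git symbolic-ref HEAD refs/heads/main
git commit -q --allow-empty -m "Initial commit"
git clone -q -o upstream "$sandbox/old" "$sandbox/local"

# Upstream gains a commit after the copy was made...
echo "upstream" > upstream.txt
git add upstream.txt
git commit -q -m "Upstream change"

# ...while the local branch gains a different one.
cd "$sandbox/local"
echo "local" > local.txt
git add local.txt
git commit -q -m "Local change"

git fetch -q upstream
git rebase -q upstream/main   # replay "Local change" on top of upstream/main

history=$(git log --format=%s | tr '\n' '|')
merges=$(git rev-list --merges --count HEAD)
echo "$history (merge commits: $merges)"
```

The log lists the local commit on top of the upstream one with no merge commit in between, which is the linear history described above.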

Rebase Configuration (Optional but Recommended for Regular Pulls):

For a more automated rebase experience when you git pull, you can configure Git to always rebase. Use this setting with caution and only if you understand its implications, as rebase rewrites history:

Bash

git config pull.rebase true

Explanation: With this setting, every git pull in this repository rebases your local commits onto the fetched branch instead of creating a merge commit. The command as written affects only the current repository; add --global to apply it to all of your repositories.
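If you are unsure where the setting ended up, you can read it back. This short sketch uses a throwaway repository and only standard git config options.

```shell
set -eu
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"

# Without --global, the value is written to this repository's .git/config.
git config pull.rebase true

local_value=$(git config --local --get pull.rebase)
echo "pull.rebase (this repository): $local_value"

# A single pull can still opt out:  git pull --no-rebase upstream main
# To undo the setting:              git config --unset pull.rebase
```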

7. Final Push to Your New Fork (origin)

After successfully integrating the upstream changes (your past work) into your local branch (and resolving any conflicts), the final step is to push these updates to your new fork on GitHub.

Bash

git push origin dummy-project-release-v2.0

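Explanation: This command pushes the dummy-project-release-v2.0 branch from your local repository to your origin remote. Your new fork on GitHub (or another hosting service) now contains the upstream changes you integrated. If you rebased a branch that you had already pushed to origin, a plain push is rejected because the history was rewritten; in that case git push --force-with-lease origin dummy-project-release-v2.0 overwrites the remote branch, which is only safe on branches nobody else is building on.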
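After pushing, it can be reassuring to check that the remote branch points at the same commit as your local one. The sketch below is a sandbox under assumptions: a local bare repository named new-fork.git stands in for the fork on the hosting service, and the demo identity and commit message are made up. It also adds -u, which is optional and makes origin the default remote for this branch from then on.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# A bare repository stands in for the new fork on the hosting service.
git init -q --bare "$sandbox/new-fork.git"

git init -q "$sandbox/local"
cd "$sandbox/local"
git checkout -q -b dummy-project-release-v2.0
git commit -q --allow-empty -m "Integrated past work"
git remote add origin "$sandbox/new-fork.git"

# -u records origin/dummy-project-release-v2.0 as the branch to track.
git push -q -u origin dummy-project-release-v2.0

remote_tip=$(git ls-remote --heads origin dummy-project-release-v2.0 | cut -f1)
local_tip=$(git rev-parse HEAD)
tracked=$(git rev-parse --abbrev-ref '@{upstream}')
echo "local:  $local_tip"
echo "remote: $remote_tip"
echo "tracking: $tracked"
```

Whether to use -u is a judgment call: the branch was created tracking the upstream remote, so switching it to origin means a later plain git pull or git push talks to your new fork rather than the disconnected one.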

Conclusion: Empowering Your Git Workflow

By understanding and effectively utilizing the upstream remote, you transform your Git workflow from a simple push-and-pull to a sophisticated synchronization mechanism. This approach is particularly powerful when migrating changes from a discontinued project or managing complex codebases across multiple remote sources. Embracing the upstream remote empowers you to stay synchronized, contribute seamlessly, and maintain a clean and coherent commit history.
