A Simple Introduction to Git and GitHub for Beginners

Vaibhav Gagneja

Introduction

Git is a distributed version control system (VCS) that enables developers to track changes in their code and collaborate on projects efficiently. Created by Linus Torvalds in 2005 to manage the development of the Linux kernel, Git has since become the most widely used VCS in software development.

Unlike older version control systems that rely on a central server, Git allows each developer to have a complete copy of the entire codebase (including its history) on their local machine. This decentralized approach enhances collaboration and enables developers to work offline seamlessly.

Key Features of Git

  • Efficient Branching

  • Merging

  • Commit History with Cryptographic Integrity

These features make Git the go-to tool for version control in modern software development.

What is GitHub?

GitHub is a web-based hosting platform for Git repositories. It allows developers to store, share, and collaborate on code with others. GitHub is not the only platform for hosting Git repositories (GitLab and Bitbucket are alternatives), but it is one of the most widely used.

Checking if Git is Installed

To check if Git is installed on your system, use the following command:

git --version

If Git is not installed, follow this guide to install it on your machine.

Initializing a Git Repository

To initialize a Git repository in your project, use the following command:

git init

Create or Clone a Project

  • To create a new project, simply initialize the Git repository in the project directory using git init.

  • To clone an existing project, use the command:

    • git clone <url>

Git Workflow: File Stages

A file typically moves through three states on its way into the version history:

  1. Untracked: The file exists in the working directory, but Git is not yet tracking it.

  2. Staged: The file’s current content has been added to the staging area (the index) and will be included in the next commit.

  3. Committed: The staged content has been saved as a snapshot in the repository’s history.

To check the current status of your files, use:

git status
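To see these states in practice, here is a minimal sketch that walks one file through them in a throwaway repository. The file name notes.txt and the identity values are placeholders chosen for illustration, and git status --short is used only to keep the output compact.

```shell
# Work in a temporary directory so no real project is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity so the commit works on a fresh machine.
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > notes.txt
untracked=$(git status --short)   # "?? notes.txt" means untracked

git add notes.txt
staged=$(git status --short)      # "A  notes.txt" means staged (added)

git commit -q -m "Add notes"
committed=$(git status --short)   # empty: nothing left to commit

printf '%s\n' "$untracked" "$staged" "after commit: [$committed]"
```

In the short format, the two-letter code in front of the file name is the quickest way to tell which state a file is in.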

Git Fundamentals

Adding and Committing Changes

  • Staging a file: To add an untracked or modified file to the staging area, use:

    • git add <file>

  • Committing changes: After staging the changes, commit them with a message:

    • git commit -m "<commit message>"

  • Viewing commit history: To view a concise history of commits, use:

    • git log --oneline

Internal Working of Git

Git stores its data in a hidden directory called .git. Within this directory, Git uses objects stored in the .git/objects folder.

  • Commit Object: Records a snapshot by pointing to a root tree object, along with the author, committer, commit message, and parent commits.

  • Tree Object: Represents a directory in Git and contains references to other trees (subdirectories) and blob objects (files).

  • Blob Object: Stores the content of a file.

To view the content of Git objects (commits, trees, blobs) without directly interacting with object files, use:

git cat-file -p <object_hash>
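As a rough illustration, the sketch below creates a one-file repository and follows the chain from commit to tree to blob. The file name greeting.txt and the identity values are made up for the example; git rev-parse is used here only as a convenient way to look up each object's hash.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

commit=$(git rev-parse HEAD)
git cat-file -t "$commit"             # prints the object type: commit
git cat-file -p "$commit"             # tree hash, author, committer, message

tree=$(git rev-parse "HEAD^{tree}")
git cat-file -p "$tree"               # one line per entry: mode, type, hash, name

blob=$(git rev-parse "HEAD:greeting.txt")
content=$(git cat-file -p "$blob")    # the stored file content
echo "$content"
```

Each step prints a hash that the next step can inspect, which is a handy way to explore how a commit refers to its files.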

Git Configuration

Git allows you to set configuration options at different levels:

  • Global configuration: Applies to all repositories for a user.

  • Local configuration: Specific to a single repository.

To set a configuration option, use the following command, where <level> is --global or --local:

git config <level> <key> <value>

To delete configuration values or sections:

  • Delete a single value: git config --unset <key>

  • Delete multiple values of the same key: git config --unset-all <key>

  • Remove an entire section: git config --remove-section <section>
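Here is a small sketch of setting, reading, and removing values at the local level. The example.color and example.size keys are hypothetical and exist only for this demonstration; Git stores arbitrary section.key pairs, so they are safe to experiment with.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Set values at the local (single repository) level.
git config --local user.name "Example User"
git config --local example.color "blue"   # hypothetical key
git config --local example.size "large"   # hypothetical key

name=$(git config --local --get user.name)
color=$(git config --local --get example.color)
echo "$name / $color"

# Delete one value, then remove the rest of the section in one go.
git config --local --unset example.color
git config --local --remove-section example

# --get exits with a non-zero status when the key no longer exists.
leftover=$(git config --local --get example.size || true)
echo "leftover: [$leftover]"
```

Replacing --local with --global would write the same settings to your user-level configuration instead of the repository's own config file.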

Git Commits

When a commit is made, Git creates a commit object that includes several pieces of information:

  • A reference to the snapshot of the staged content.

  • The author and the committer (name, email, and timestamp for each).

  • The commit message.

  • Links to parent commits:

    • The first commit has no parent.

    • A regular commit has one parent.

    • A merge commit has two or more parents.

Staging Files and Commits

Staging files involves computing a checksum (SHA-1 hash) for each file, storing that version as a blob in the Git repository, and adding the checksum to the staging area.

When committing, Git checksums each subdirectory and stores them as tree objects. The commit object then points to the root project tree, allowing the recreation of the snapshot.
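The staging step can be observed directly. The sketch below is only an illustration (greeting.txt is a placeholder file name): it asks Git which hash it would give the file, stages the file, and then checks that the staging area recorded the same hash and that a blob with that hash now exists.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "hello" > greeting.txt

# The hash Git computes for this content as a blob (nothing is stored yet).
expected=$(git hash-object greeting.txt)

git add greeting.txt

# The staging area lists: mode, blob hash, stage number, path.
git ls-files --stage
staged=$(git ls-files --stage | awk '{print $2}')

# The blob is now in the object database under that hash.
type=$(git cat-file -t "$staged")
echo "$expected"
echo "$staged ($type)"
```

Because the hash is derived from the content, staging an identical file again would produce the same blob hash rather than a second copy.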

Branching in Git

The git branch command manages branches:

  • git branch lists all branches, with an asterisk indicating the currently checked-out branch.

  • git branch <branch_name> creates a new branch pointer at the current commit.

Working on different branches allows for isolated development of features or fixes.

Merging in Git

Merging integrates changes from one branch into another. The command git merge <branch_name> merges the specified branch into the currently checked-out branch.

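  • If the current branch has no new commits since the other branch split off from it, Git performs a "fast-forward" merge: it simply moves the pointer of the current branch forward to the tip of the merged branch, without creating a new commit.

  • If the branches have diverged (each has commits the other lacks), Git performs a "three-way merge" and creates a new merge commit with two parents (the tips of the merged branches). If both branches changed the same lines, Git reports a conflict that you must resolve before the merge can be completed.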
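Both merge outcomes can be reproduced in a throwaway repository. This is a sketch under a few assumptions: the branch names feature-a and feature-b and the file names are invented for the example, git switch needs Git 2.23 or later, and the number of parent lines in the commit object is used as a simple way to tell the two cases apart.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "base" > base.txt && git add base.txt && git commit -q -m "Base"
base=$(git symbolic-ref --short HEAD)     # initial branch: main or master

# Case 1: only feature-a has new commits, so the merge fast-forwards.
git switch -q -c feature-a
echo "a" > a.txt && git add a.txt && git commit -q -m "Add a"
git switch -q "$base"
git merge -q feature-a
ff_parents=$(git cat-file -p HEAD | grep -c '^parent ')

# Case 2: both branches gain a commit, so Git creates a merge commit.
git switch -q -c feature-b
echo "b" > b.txt && git add b.txt && git commit -q -m "Add b"
git switch -q "$base"
echo "c" > c.txt && git add c.txt && git commit -q -m "Add c"
git merge -q --no-edit feature-b
merge_parents=$(git cat-file -p HEAD | grep -c '^parent ')

echo "fast-forward tip has $ff_parents parent, merge commit has $merge_parents"
```

After the first merge the branch tip is just the existing "Add a" commit, while the second merge adds a new commit that ties the two lines of history together.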

Visualizing Commit History

The command git log --oneline --decorate --graph --all visualizes the commit history, showing branch pointers and history divergence.

Default Branch Naming

Historically, the initial branch created by git init was named master, but many projects and tools now use main instead (some use trunk).

Since October 2020, GitHub has used main as the default branch for new repositories, and using main is recommended. In Git 2.28 and later, the default initial branch name can be configured globally with:

git config --global init.defaultBranch main

Switching Branches

  • git checkout <branch_name> is an older command that serves multiple purposes, including switching branches.

  • For simply switching branches, the newer git switch <branch_name> (available since Git 2.23) is recommended.

  • To create and switch to a new branch, use: git switch -c <new_branch_name>

  • The older equivalent is: git checkout -b <new_branch_name>
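Putting the branch commands together, a short session might look like the sketch below. The branch names feature-login and bugfix-typo are invented for the example, and git branch --show-current assumes Git 2.22 or later.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "v1" > app.txt && git add app.txt && git commit -q -m "Initial commit"

git branch feature-login          # create a branch pointer; stay where you are
git branch                        # list branches; * marks the current one

git switch -q feature-login       # switch to the existing branch
git switch -q -c bugfix-typo      # create a new branch and switch to it

current=$(git branch --show-current)
count=$(git branch --format='%(refname:short)' | wc -l | tr -d ' ')
echo "on $current, $count branches in total"
```

Creating a branch only adds a new pointer to the current commit, which is why these commands finish instantly even in large repositories.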

Git and GitHub

Rebasing

Rebasing is another way to integrate changes from one branch into another. It works by taking the commits of the feature branch and replaying them on top of the target branch, effectively changing the base of the feature branch.

  • The command git rebase <target_branch> (when on the feature branch) replays its commits onto the <target_branch>.

  • Rebasing results in a cleaner, linear history that can be easier to read and understand.

Warning: Never rebase a public branch like main that others are collaborating on. Rebasing replaces existing commits with new ones that have different hashes, so your collaborators’ copies of the branch no longer match the rewritten history.
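To make the effect of a rebase concrete, here is a sketch in a throwaway repository. The branch name feature and the file names are placeholders, and git switch assumes Git 2.23 or later. It records the feature commit's hash before and after the rebase to show that the commit is re-created rather than moved.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo "base" > base.txt && git add base.txt && git commit -q -m "Base"
target=$(git symbolic-ref --short HEAD)   # initial branch: main or master

git switch -q -c feature
echo "f" > feature.txt && git add feature.txt && git commit -q -m "Feature work"

git switch -q "$target"
echo "t" > target.txt && git add target.txt && git commit -q -m "Target work"

git switch -q feature
before=$(git rev-parse HEAD)
git rebase -q "$target"           # replay the feature commit on top of the target
after=$(git rev-parse HEAD)

git log --oneline --graph         # a single straight line of three commits
parent=$(git rev-parse "HEAD^")
```

The changed hash is exactly why rebasing is safe on a private branch but disruptive on a shared one.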

Fetching and Pushing

  • Fetching retrieves commits and objects from a remote repository but does not integrate these changes into your local branches: git fetch <remote_name>

  • Adding a remote repository: git remote add <name> <url>

    • To link a local repository to a remote GitHub repository: git remote add origin <repository_url>

  • To verify the remote connection: git ls-remote

  • Pushing sends your local commits and branches to the specified remote repository: git push <remote_name> <branch_name>

    • For example: git push origin main

Pulling from Remote Repositories

Pulling combines fetching and merging: git pull <remote_name> <branch_name>

This fetches changes from the remote and, by default, merges them into your current local branch.
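You can try the whole remote workflow without a GitHub account. In the sketch below, a local bare repository stands in for the remote, and two working copies named alice and bob (hypothetical names, as are the identity values) play the role of two developers. It assumes that no global pull settings change the default behavior.

```shell
work=$(mktemp -d)
git init -q --bare "$work/remote.git"     # stands in for a GitHub repository

# First developer: create a commit and push it.
git init -q "$work/alice"
cd "$work/alice"
git config user.name "Alice Example"      # placeholder identity
git config user.email "alice@example.com"
echo "one" > file.txt && git add file.txt && git commit -q -m "First"
branch=$(git symbolic-ref --short HEAD)   # main or master
git remote add origin "$work/remote.git"
git push -q origin "$branch"

# Second developer: clone, which sets up the origin remote automatically.
git clone -q "$work/remote.git" "$work/bob"

# First developer pushes another commit.
echo "two" >> file.txt && git commit -q -am "Second" && git push -q origin "$branch"

# Second developer: fetch downloads the commit, pull also merges it.
cd "$work/bob"
git fetch -q origin
behind=$(git rev-list --count "HEAD..origin/$branch")   # fetched, not yet merged
git pull -q origin "$branch"
synced=$(git rev-list --count "HEAD..origin/$branch")   # nothing left to merge
echo "behind before pull: $behind, after pull: $synced"
```

With a real GitHub repository, the only difference is that the remote URL points to GitHub instead of a local directory.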

GitHub and Pull Requests

The typical workflow for contributing to a team project involves:

  1. Creating a branch.

  2. Making changes.

  3. Pushing the branch to a remote repository (often a fork).

  4. Creating a pull request through the GitHub interface.

A pull request (PR) is a feature of platforms like GitHub that allows you to propose changes from a branch in your repository to another branch in the same or another repository.

Gitignore File

The .gitignore file specifies intentionally untracked files that Git should ignore. This is useful for excluding temporary files, build artifacts, the node_modules directory, and similar generated content. Two points to keep in mind:

  • .gitignore files can exist in subdirectories, and their rules apply to the files within those directories.

  • Common wildcard patterns are supported, such as * (matches zero or more characters, except /).
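As an illustration, the sketch below writes a small .gitignore and uses git check-ignore to ask Git which paths it would skip. The patterns and file names are examples only, not a recommended set for any particular project.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Ignore all .log files, plus the build/ and node_modules/ directories.
printf '%s\n' '*.log' 'build/' 'node_modules/' > .gitignore

mkdir -p build src
touch debug.log build/app.bin src/main.c

# -v shows which .gitignore line matched each ignored path.
git check-ignore -v debug.log build/app.bin

# Without -v, only the ignored paths are printed; src/main.c is not among them.
ignored=$(git check-ignore debug.log build/app.bin src/main.c)
visible=$(git status --short --untracked-files=all)
echo "$ignored"
echo "$visible"
```

Note that .gitignore only affects untracked files; a file that is already committed stays tracked even if a pattern matches it later.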

Forking and Contributing to a Team Project

To contribute to a team project (e.g., the "Mega Corp" repo), you typically need your own fork of the repository:

  • A fork is a copy of the original repository in your own account that you can modify without affecting the original.

  • Forking is not a Git operation itself but a feature offered by Git hosting services like GitHub, GitLab, and Bitbucket.


This concludes the guide to understanding Git and GitHub. Happy coding!
