Getting Started with Git: Essential Commands for Beginners
Introduction
This guide will walk through Git basics, covering essential commands for setting up, tracking files, committing changes, and working with version history.
Setting Up Git
Configure User Information
git config --global user.email "me@example.com"
git config --global user.name "My name"
Explanation: This sets your email and name globally, which Git will use to identify your commits.
Why It Works
These commands configure Git to store your identity, which will appear with each commit. This information is stored in a configuration file at ~/.gitconfig for global settings or in .git/config for a specific repository.
How It Works
When you run git config, Git writes these settings into the relevant configuration file as plain text, using the OS’s ordinary file read and write operations.
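To make the plain-text storage concrete, here is a minimal Python sketch that writes a [user] section to an INI-style file and reads it back. It uses Python's configparser as a stand-in (Git has its own parser, and its config syntax is only roughly INI-like). The function names and the scratch file path are assumptions made for illustration, so your real ~/.gitconfig is left untouched.

```python
import configparser
import os
import tempfile


def write_identity(path, name, email):
    """Store a name and email under a [user] section, as plain text."""
    parser = configparser.ConfigParser()
    parser.read(path)  # keep any settings already in the file
    if not parser.has_section("user"):
        parser.add_section("user")
    parser.set("user", "name", name)
    parser.set("user", "email", email)
    with open(path, "w") as handle:
        parser.write(handle)


def read_identity(path):
    """Read the name and email back from the file."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return parser.get("user", "name"), parser.get("user", "email")


# Hypothetical scratch file standing in for ~/.gitconfig.
config_path = os.path.join(tempfile.mkdtemp(), "gitconfig")
write_identity(config_path, "My name", "me@example.com")
identity = read_identity(config_path)
print(identity)
```

Opening the scratch file in a text editor shows the same kind of section-and-key layout you would find in a real Git configuration file.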
Initializing a Repository
Create a Directory and Initialize Git
mkdir checks
cd checks
git init
Output:
Initialized empty Git repository in /home/user/checks/.git/
Explanation: git init creates an empty repository in the current directory, which Git will use to track file changes.
Why It Works
git init creates a hidden .git directory in the current folder, marking it as a Git repository. This directory contains all necessary files and subdirectories, such as HEAD, config, objects, and refs, to track versions and commits.
How It Works
File System Interaction: Git creates the .git directory and multiple subdirectories, using OS functions to create, write, and organize files.
Programming Logic: Git uses hashes (SHA-1 checksums) to identify file contents and commits, storing the corresponding objects in the objects directory. Each time a new change is committed, Git generates a new hash that uniquely identifies that snapshot of the project’s files.
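The file system side of git init can be sketched in a few lines of Python. This is a simplified illustration, not what Git itself runs: the real command creates more entries (hooks and a description file, for example), and the function name init_repository and the temporary directory are assumptions made for the example.

```python
import os
import tempfile


def init_repository(root):
    """Create a bare-bones .git layout under root and return its path."""
    git_dir = os.path.join(root, ".git")
    for sub in ("objects", "refs/heads", "refs/tags"):
        os.makedirs(os.path.join(git_dir, *sub.split("/")))
    # HEAD names the branch that the next commit will update.
    with open(os.path.join(git_dir, "HEAD"), "w") as handle:
        handle.write("ref: refs/heads/master\n")
    # An empty config file; the real one holds a few default settings.
    open(os.path.join(git_dir, "config"), "w").close()
    return git_dir


# Hypothetical scratch directory standing in for the "checks" folder.
repo_root = tempfile.mkdtemp()
git_dir = init_repository(repo_root)
print(sorted(os.listdir(git_dir)))
```

The listing contains the four names mentioned above: HEAD, config, objects, and refs.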
Basic Commands for Navigation and File Tracking
Listing Files in a Directory
ls -la
Output:
total 12
drwxrwxr-x 3 user user 4096 Nov 8 18:16 .
drwxr-xr-x 18 user user 4096 Nov 8 18:16 ..
drwxrwxr-x 7 user user 4096 Nov 8 18:16 .git
Explanation: This command lists all files, including hidden ones like the .git directory, whose presence shows that this directory is now a Git repository.
Adding Files to Git
Copying a File and Adding to Staging Area
cp ../disk_usage.py .
ls -l
Output:
total 4
-rw-rw-r-- 1 user user 657 Nov 8 18:26 disk_usage.py
Adding a File to Git
git add disk_usage.py
git status
Output:
On branch master
No commits yet
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: disk_usage.py
Explanation: git add stages disk_usage.py, preparing it for the next commit.
Why It Works
git add stages files by recording their current content in the “staging area” (also called the index). The staging area is essentially a snapshot of the files marked to be included in the next commit.
How It Works
Index: Git records each staged file’s path and blob hash in the staging area, which is a binary file stored on disk at .git/index rather than something held only in memory.
File System: Git creates a blob (binary large object) in the .git/objects directory for each file in its current state. Each blob is identified by a hash of the file’s content, so identical content is stored only once no matter how many versions refer to it.
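The blob hash can be reproduced in a few lines of Python. Git hashes a short header ("blob", the content length, and a null byte) followed by the file content. The function name blob_hash is just a label chosen for this sketch.

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """Return the SHA-1 ID Git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Identical content always produces the same ID, whatever the file is called.
first = blob_hash(b"print('disk usage')\n")
second = blob_hash(b"print('disk usage')\n")
changed = blob_hash(b"print('disk usage!')\n")

print(first == second)   # True
print(first == changed)  # False
print(blob_hash(b""))    # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

If Git is installed, you can compare the result for any file with the output of git hash-object <file>.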
Why It Works
git status shows:
Tracked Files: Files that Git is actively monitoring in the repository. It will show whether these files have been modified, staged, or committed.
Untracked Files: New files in the working directory that haven’t been added to Git yet.
Staging Area Status: Lists changes that are staged and ready for the next commit.
How It Works
When you run git status, Git goes through the following backend processes:
Comparing Working Directory and Staging Area:
- Git examines each tracked file to see if it matches the version stored in the staging area. If it detects any differences, it flags these files as "modified" in the output.
Comparing Staging Area and Latest Commit:
- Git compares the files in the staging area against the last committed snapshot (stored in .git/objects). If there are differences, it marks them as "staged changes" ready for commit.
Checking for Untracked Files:
- Git scans for files in the working directory that have no entry in the staging area. These files are shown as "untracked" and are not yet under version control.
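The three comparisons above can be sketched with plain dictionaries. This is a toy model, not Git's implementation: each dictionary maps a file path to a content hash, and the names status, head, index, and working are assumptions made for the example.

```python
def status(head, index, working):
    """Classify paths roughly the way git status reports them.

    Each argument maps a file path to a content hash.
    """
    # Staging area vs. last commit: anything that differs is staged.
    staged = sorted(
        path for path in set(index) | set(head)
        if index.get(path) != head.get(path)
    )
    # Working directory vs. staging area: tracked files that changed.
    modified = sorted(
        path for path in index
        if path in working and working[path] != index[path]
    )
    # Files on disk with no entry in the staging area.
    untracked = sorted(path for path in working if path not in index)
    return {"staged": staged, "modified": modified, "untracked": untracked}


# The situation right after "git add disk_usage.py" in a fresh repository,
# plus a hypothetical extra file that was never added.
head = {}
index = {"disk_usage.py": "a1"}
working = {"disk_usage.py": "a1", "notes.txt": "b2"}
result = status(head, index, working)
print(result)
```

Here disk_usage.py is reported as staged because it is in the staging area but not in any commit yet, while notes.txt is untracked.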
Committing Changes
Creating a Commit
git commit
When you run this, Git opens a text editor for you to enter a commit message, such as "Initial commit."
Output:
[master (root-commit) 49d610b] Initial commit
1 file changed, 1 insertion(+)
create mode 100644 disk_usage.py
Explanation: git commit saves the changes to the repository. This example used a text editor for the message.
Why It Works
Committing creates a permanent snapshot of the current project state, storing the information as a new commit object in the .git/objects directory. Each commit is identified by the hash of its contents, which include a pointer to its parent commit (the very first commit has no parent).
How It Works
Snapshot Creation: Git doesn’t store differences between commits; instead, each commit records a full snapshot of the project in the .git/objects directory, and files that have not changed reuse the blobs that are already stored. Commits are linked to parent commits, forming a linked list-like structure called the commit history.
File System: Git generates new directories and files within .git/objects to store these snapshots, using the OS’s file I/O operations.
Hashing: Git uses SHA-1 hashing to create a unique ID for each commit, referencing the exact state of the project.
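To see why a commit ID depends on both its content and its parent, here is a small Python sketch that builds the text of a commit object and hashes it the way Git hashes objects. The helper name commit_hash is an assumption, the tree ID used is the well-known ID of an empty tree, and the author and timestamp values are illustrative.

```python
import hashlib


def commit_hash(tree, parent, author, timestamp, message):
    """Hash a commit object built from its tree, parent, and metadata."""
    lines = ["tree " + tree]
    if parent is not None:
        lines.append("parent " + parent)  # the root commit has no parent
    lines.append(f"author {author} {timestamp}")
    lines.append(f"committer {author} {timestamp}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Illustrative values only; a real tree ID comes from the staged files.
empty_tree = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
author = "My name <me@example.com>"
when = "1562858372 +0200"

root = commit_hash(empty_tree, None, author, when, "Initial commit")
child = commit_hash(empty_tree, root, author, when, "Initial commit")
print(root)
print(child)
```

The two IDs differ even though the tree, author, and message are the same, because the second commit includes the first one's ID as its parent.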
Viewing History and Changes
Checking Commit History
git log
Output:
commit d8e139cc4f7dcd13b75cff67cfb68527e24c59c5 (HEAD -> master)
Author: My name <me@example.com>
Date: Thu Jul 11 17:19:32 2019 +0200
Initial commit
Explanation: git log displays a list of commits, showing author, date, and commit message.
Viewing Detailed Differences
git log -p
Output: Shows patches, or code changes, made in each commit.
Explanation: git log -p helps track exact changes across commits, which is especially useful for code review.
Why It Works
Git can display history and differences because each commit is a fully independent snapshot with metadata that links it to previous commits, forming a chain of changes.
How It Works
Linked List Structure: Each commit points to its parent, creating a linked list structure. The git log command traverses this chain, printing each commit.
Diff Calculation: git log -p calculates diffs between snapshots by comparing the content stored in the blobs within the .git/objects directory. Because identical content has an identical hash, Git can skip unchanged files without reading them, and it runs its diff algorithm only on the files whose hashes differ.
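The traversal and the patch output can both be imitated in Python. In this sketch the commit IDs, messages, and file text are made up, and difflib from the standard library stands in for Git's own diff code, so the output format is similar but the algorithm is not necessarily the same.

```python
import difflib

# Hypothetical history: each commit stores its parent and one file's text.
commits = {
    "c1": {"parent": None, "message": "Initial commit",
           "text": "print('old')\n"},
    "c2": {"parent": "c1", "message": "Update message",
           "text": "print('new')\n"},
}


def log(commits, head):
    """Yield commit IDs from head back to the root by following parents."""
    current = head
    while current is not None:
        yield current
        current = commits[current]["parent"]


def patch(old_text, new_text, path):
    """Return unified diff lines between two versions of one file."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="a/" + path, tofile="b/" + path, lineterm=""))


history = list(log(commits, "c2"))
changes = patch(commits["c1"]["text"], commits["c2"]["text"], "disk_usage.py")
print(history)
print("\n".join(changes))
```

The first print shows the newest commit first, matching the order of git log, and the second shows removed lines prefixed with a minus sign and added lines prefixed with a plus sign.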
Working with .gitignore
Creating and Using a .gitignore File
echo ".DS_STORE" > .gitignore
git add .gitignore
git commit -m "Add .gitignore file, ignoring .DS_STORE files"
Output:
[master abb0632] Add .gitignore file, ignoring .DS_STORE files
1 file changed, 1 insertion(+)
create mode 100644 .gitignore
Explanation: .gitignore excludes specific files from tracking, keeping the repository clean.
Why It Works
The .gitignore file specifies patterns for files and directories that should not be tracked. Git reads this file and leaves matching untracked files out of commands such as git add and git status; files that are already tracked are not affected.
How It Works
Pattern Matching: Git uses simple pattern-matching logic (e.g., wildcards like * for matching) to identify files that fit the .gitignore criteria.
OS Interaction: .gitignore is a regular text file on the OS level, but Git interprets its contents when executing commands like git add and git status.
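A rough idea of the pattern matching can be had from Python's fnmatch module, which supports the same * wildcard. This is only an approximation under simplifying assumptions: real .gitignore rules also cover negation with !, directory-only patterns, and ** across directories, none of which this sketch handles. The function name is_ignored is hypothetical.

```python
import posixpath
from fnmatch import fnmatchcase


def is_ignored(path, patterns):
    """Return True if the file name at the end of path matches any pattern."""
    name = posixpath.basename(path)
    return any(fnmatchcase(name, pattern) for pattern in patterns)


patterns = [".DS_STORE", "*.log"]

print(is_ignored(".DS_STORE", patterns))       # True
print(is_ignored("docs/.DS_STORE", patterns))  # True
print(is_ignored("build.log", patterns))       # True
print(is_ignored("disk_usage.py", patterns))   # False
```

A pattern without a slash is compared against the file name alone, which is why the copy inside docs/ matches too. Note that the comparison in this sketch is case-sensitive, so a pattern has to be spelled exactly as the file is named.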
Renaming and Deleting Files
Renaming Files
git mv disk_usage.py check_free_space.py
git status
Output:
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
renamed: disk_usage.py -> check_free_space.py
Deleting Files
git rm check_free_space.py
git commit -m "Remove unused file"
Explanation: git mv renames a file and stages the rename, while git rm deletes a file from the working directory and stages its removal.
Why It Works
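Git manages renaming and deletion through commands that record the change in the staging area. Git does not store a rename as such; it recognizes one by noticing that the content at the new path matches (or closely resembles) the content at the old path, which is how history can still be followed across a rename. A deleted file is dropped from tracking in the next commit but remains available in earlier commits.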
How It Works
File System Operations: Git renames or removes the file at the OS level, in the working directory.
Staging Area: Git records the move or deletion in the staging area (the index), so that the next commit reflects it in the repository history.
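The effect of these two commands on the staging area can be modeled with a dictionary that maps paths to content hashes. The sketch below is a toy model with hypothetical function names, and its rename detection only pairs files whose content is exactly identical, whereas Git can also pair files that are merely similar.

```python
def stage_move(index, old_path, new_path):
    """Model git mv: the same content hash now lives under a new path."""
    index[new_path] = index.pop(old_path)


def stage_remove(index, path):
    """Model git rm: the path disappears from the staging area."""
    del index[path]


def detect_renames(head, index):
    """Pair deleted and added paths that share the same content hash."""
    deleted = {p: h for p, h in head.items() if p not in index}
    added = {p: h for p, h in index.items() if p not in head}
    return sorted(
        (old, new)
        for old, old_hash in deleted.items()
        for new, new_hash in added.items()
        if old_hash == new_hash
    )


head = {"disk_usage.py": "a1"}   # last committed snapshot
index = dict(head)               # staging area starts out identical

stage_move(index, "disk_usage.py", "check_free_space.py")
renames = detect_renames(head, index)
print(renames)

stage_remove(index, "check_free_space.py")
print(index)
```

The first print pairs the old and new names, which corresponds to the "renamed:" line in the git status output, and the second shows an empty staging area after the removal.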
Skipping the Staging Area
Committing All Changes Directly
git commit -a -m "Quick commit with staging"
Output:
[master 033f27a] Quick commit with staging
1 file changed, 4 insertions(+), 1 deletion(-)
Explanation: git commit -a automatically stages every tracked file that was modified or deleted before committing. New, untracked files are not included.
Why It Works
The -a flag tells git commit to stage the changes to tracked files first, so the explicit git add step can be skipped for files Git already knows about.
How It Works
Automation of Staging: Git finds the tracked files that were modified or deleted by comparing the working directory with the staging area, then stages them automatically.
Same Staging Area: The changes still pass through the staging area (the index file on disk); the -a flag only automates the git add step.
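The staging step that -a performs can be sketched as a loop over the files Git already tracks. As before, this is a toy model in which dictionaries map paths to content hashes, and the function name stage_tracked_changes is an assumption.

```python
def stage_tracked_changes(index, working):
    """Update the staging area for tracked files only, then return it."""
    for path in list(index):
        if path not in working:
            del index[path]               # tracked file was deleted
        else:
            index[path] = working[path]   # pick up the current content
    return index


index = {"all_checks.py": "1", "old_script.py": "2"}
working = {"all_checks.py": "9", "new_notes.txt": "3"}

staged = stage_tracked_changes(index, working)
print(staged)
```

The modified file is updated and the deleted one is dropped, but new_notes.txt never enters the staging area because it was not tracked to begin with. It would still need an explicit git add.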
Undoing Changes Before Committing
Reverting Changes to a File
git checkout -- all_checks.py
git status
Output:
On branch master
nothing to commit, working tree clean
Explanation: git checkout -- <file> discards unstaged changes in a specific file, returning it to the version in the staging area, which is the last committed state if nothing has been staged.
Why It Works
git checkout -- <file> discards uncommitted changes to a file by overwriting it with the staged version, which matches the last commit when nothing has been staged for that file.
How It Works
File System Rollback: git checkout -- <file> reads the staged version of the file from .git/objects and overwrites the working directory file with it.
Staging Area Modification: git reset HEAD <file> only affects the staging area, unstaging changes without altering files in the working directory.
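The difference between the two commands is easiest to see side by side. In this toy model each dictionary maps a path to file content, and the function names are hypothetical. One function copies from the staging area into the working directory, the other copies from the last commit into the staging area.

```python
def checkout_file(index, working, path):
    """Model git checkout -- <file>: overwrite the working copy."""
    working[path] = index[path]


def reset_file(head, index, path):
    """Model git reset HEAD <file>: restore the staged entry from HEAD."""
    if path in head:
        index[path] = head[path]
    else:
        index.pop(path, None)  # file is new, so unstaging removes the entry


head = {"all_checks.py": "v1"}
index = {"all_checks.py": "v1"}
working = {"all_checks.py": "v1 plus a broken edit"}

checkout_file(index, working, "all_checks.py")
print(working)  # {'all_checks.py': 'v1'}

# Stage a change, then unstage it; the working copy keeps the edit.
working["all_checks.py"] = "v2"
index["all_checks.py"] = "v2"
reset_file(head, index, "all_checks.py")
print(index)    # {'all_checks.py': 'v1'}
print(working)  # {'all_checks.py': 'v2'}
```

After the reset, the edit is still present in the working directory and simply shows up as an unstaged modification again.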
Amending Commits
Modifying the Last Commit
git commit --amend
Git opens a text editor to edit the commit message. This can be used to correct small mistakes in the last commit.
Why It Works
Amending allows you to modify the latest commit, including its message or the files included.
How It Works
Rewrites Last Commit: Git writes a new commit object to .git/objects that keeps the old commit’s parent, then moves the branch pointer to it. The old commit is no longer part of the branch, and the amended commit has a different hash.
Staged Content: Anything currently in the staging area is included in the new commit, so a forgotten file can be added with git add before running git commit --amend.
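A small model shows what happens to the objects during an amend. Here a dictionary plays the role of the object store, the commit ID is a simplified hash of the parent and the message only, and the function names are assumptions made for the example.

```python
import hashlib


def make_commit(store, parent, message):
    """Add a commit to the store and return its (simplified) ID."""
    commit_id = hashlib.sha1(f"{parent}\n{message}".encode()).hexdigest()
    store[commit_id] = {"parent": parent, "message": message}
    return commit_id


def amend(store, branch_tip, new_message):
    """Create a replacement commit that reuses the old commit's parent."""
    parent = store[branch_tip]["parent"]
    return make_commit(store, parent, new_message)


store = {}
first = make_commit(store, None, "Initial commit")
second = make_commit(store, first, "Add chek script")  # typo in message

# The branch pointer moves to the new commit returned by amend.
tip = amend(store, second, "Add check script")

print(tip != second)                  # True: a different commit ID
print(store[tip]["parent"] == first)  # True: same parent as before
print(second in store)                # True: old object still exists
```

The old commit is not edited in place. The branch simply stops pointing at it, which is why an amended commit always gets a new ID.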
Git’s OS and Programming Functionality
SHA-1 Hashing: Git relies heavily on SHA-1, a cryptographic hash function, to create unique IDs for files and commits. This hashing system is critical for data integrity.
Object Database Structure: Git’s .git/objects directory stores each object under a path derived from its hash: the first two hexadecimal characters name a subdirectory and the remaining 38 name the file, so Git can locate any object directly from its ID.
Compression and Delta Storage: To optimize storage, Git compresses object contents with zlib and, when objects are packed into packfiles, stores similar objects as deltas (changes), which matters most for large repositories.
File System Interface: Git works directly with the OS file system to manage the .git folder and the files and directories inside it. Because a repository is just files on disk, no separate database server is needed.
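The storage layout described in this list can be tried out with Python's standard library. The sketch below writes a blob the way Git stores a loose (unpacked) object: header plus content, compressed with zlib, under a path built from the hash. The function names and the temporary directory are assumptions for the example, and packfiles with deltas are not covered.

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_object(objects_dir, content):
    """Store content as a compressed blob and return (hash, file path)."""
    data = b"blob " + str(len(content)).encode() + b"\0" + content
    digest = hashlib.sha1(data).hexdigest()
    folder = os.path.join(objects_dir, digest[:2])  # first two hex characters
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, digest[2:])         # remaining 38 characters
    with open(path, "wb") as handle:
        handle.write(zlib.compress(data))
    return digest, path


def read_loose_object(path):
    """Decompress an object file and return the content after the header."""
    with open(path, "rb") as handle:
        data = zlib.decompress(handle.read())
    _header, _, content = data.partition(b"\0")
    return content


# Hypothetical scratch directory standing in for .git/objects.
objects_dir = tempfile.mkdtemp()
digest, object_path = write_loose_object(objects_dir, b"print('hello')\n")
restored = read_loose_object(object_path)

print(digest)
print(os.path.relpath(object_path, objects_dir))
print(restored)
```

Looking up an object is then just a matter of splitting its ID into the two parts of the path, with no index or search needed.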
In the next section, we will study all these features by coding our own basic version. However, first, we will go through the prerequisites: file processing systems and database management system concepts.
Written by
Abhishek Dubey