A Beginner’s Guide to Git and Version Control in DevOps

Table of contents
- 📘 Introduction to Git
- 📜 Background
- 🧭 Centralized Version Control System (CVCS)
- 🌐 Distributed Version Control System (DVCS)
- ⚖️ Difference Between CVCS and DVCS
- 🛠️ Stages of Git and Its Terminology
- 🗺️ Git Workflow Diagram
- 🎯 Benefits of a Version Control System
- 🏗️ Understanding Git Architecture and Workflow Elements
- 🚀 Advantages of Git
- 📂 Types of Git Repositories

📘 Introduction to Git
Git is a distributed version control system used for source code management. It lets developers track changes and manage different versions of their code. Multiple developers can work on the same project at the same time: each contributor's changes are recorded separately, and when two changes conflict, Git asks for the conflict to be resolved instead of silently overwriting someone's work.
📜 Background
Before source code management tools, companies working on a product would divide the development into small teams or pods. Each team developed its portion of the code and manually sent it for integration. There was a risk that one developer's code might not be compatible with another's. This manual process was slow and error-prone, and often a single person or manager had to collect and merge the code from each team.
Question-1: What is Source Code Management?
Answer: Source code management preserves code by recording a new version at each check-in (for example myfile V1, myfile V2, and so on), so any earlier version can be recovered.
🧭 Centralized Version Control System (CVCS)
Centralized systems were the norm before Git.
A repository is the storage space for everyone's code. In a CVCS it lives on a single central server.
A commit means saving your code in the repository; in a CVCS, every commit goes straight to that central server.
Everyone can see each other's code and who worked on what, no matter where they are located.
⚠️Drawbacks of CVCS
If the central server fails or its repository is corrupted, the project history can be lost unless backups exist, because the full history lives only on that server.
A network connection is needed for committing and updating code, and these operations can be slow.
Only a working copy is available locally, so you need to be connected to the server for almost anything beyond editing files.
Example: Apache Subversion (SVN).
🌐 Distributed Version Control System (DVCS)
In a Distributed Version Control System (DVCS), each contributor has a local copy or clone of the main repository. This means everyone maintains a local repository that includes all the files, history, and metadata from the main repository. A network connection is only necessary when exchanging commits with a remote repository, that is, when cloning, pulling, or pushing.
Git is a DVCS tool, first released in 2005. Before Git, the Linux kernel project used BitKeeper; Mercurial, another DVCS, appeared at around the same time as Git.
It's important to note that Git and GitHub are different; Git is the version control software, while GitHub is a service provider that offers storage for your repositories.
⚖️ Difference Between CVCS and DVCS
Feature | CVCS | DVCS |
Local Copy | Each client checks out a working copy from the server; the history stays on the server. | Each client has a full local repository with the complete history. |
Commit Changes | Changes are committed to the central source on the server. | Changes are committed locally and then pushed to the server. |
Learning Curve | Easier to learn and set up. | Can be harder for beginners, as there are more concepts and commands to learn. |
Branching | Branches live on the server, and merging is often more laborious. | Branches are local and cheap to create, and merging is a routine operation. |
Offline Access | Very limited; most operations need the server. | Works offline; a network connection is needed only for cloning, pulling, and pushing. |
Speed | Slower, as most commands communicate with the server. | Faster, as most operations are local. |
Server Dependency | If the server is down, developers cannot commit or view history. | Developers can continue committing to their local repositories. |
🛠️ Stages of Git and Its Terminology
📈Evolution of VCS
Local Version Control System → Centralized Version Control System → Distributed Version Control System
Git (Free Software) → Service providers (GitHub, GitLab, etc.) offer repositories
📊Stages of Git (Workflow)
When you run git init in your project folder, that folder becomes a local repository. From then on, your work moves through three stages:
Workspace / Working Directory
Staging Area / Index
Local Repository
🧠Analogy: Moving your code from the working directory to the staging area using git add . is like going to a mall and putting 3 shirts in your basket. You can still put a shirt back (unstage a file). Committing is like paying at the checkout: everything that is in the basket at that moment is what you take home.
Commit: Like taking a snapshot of your code. It saves the staged code to the local repository under a unique commit ID.
The commit ID is a 40-character hexadecimal SHA-1 hash generated when you commit.
Tag: A readable name, such as a release version, attached to a specific commit.
Commit IDs and tags let you find and return to any recorded version.
For example, if you have login.html and registration.html in the staging area and you commit them, both files are recorded in the local repository, just like paying for everything in your basket.
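The three stages can be tried out with a handful of commands. The following is a minimal sketch, assuming Git is installed; the temporary folder, the file contents, the identity values, and the tag name v1.0 are hypothetical and chosen only for illustration.

```shell
# Create a throwaway project folder and turn it into a local repository.
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Identity used for commits in this repository only (placeholder values).
git config user.name "Example Dev"
git config user.email "dev@example.com"

# 1. Working directory: create two files. Git reports them as untracked ("??").
echo "<h1>Login</h1>" > login.html
echo "<h1>Register</h1>" > registration.html
git status --short

# 2. Staging area: choose what goes into the next commit (now marked "A").
git add login.html registration.html
git status --short

# 3. Local repository: record the staged snapshot under a commit ID.
git commit -q -m "Add login and registration pages"
commit_id="$(git rev-parse HEAD)"
echo "Commit ID: $commit_id"

# Tag: give this commit a readable name.
git tag v1.0
git tag --list
```

Running git status --short between the steps is a convenient way to see which stage each file is currently in.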
🗺️ Git Workflow Diagram
When you get code from GitHub for the first time, you 'clone' the repository, which downloads it to your machine (for example, a VM).
After committing their changes, developers 'push' them to the remote repository.
Developers should 'pull' the latest code before pushing their changes. This keeps everyone on the most recent code and surfaces conflicts early.
Each commit is assigned a unique commit ID and records the changes, the author's name, and a timestamp, ensuring transparency. Push and pull transfer these commits between repositories.
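The cycle described above can be sketched without a GitHub account. In this example a local bare repository stands in for the remote; the folder names remote.git and dev, the file app.txt, and the identity values are hypothetical.

```shell
# A local bare repository plays the role of the remote (e.g. GitHub).
work="$(mktemp -d)"
cd "$work"
git init -q --bare remote.git

# First contact with the remote: clone it to get a local repository.
# (Git warns that the repository is empty; that is expected here.)
git clone -q remote.git dev
cd dev
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Make a change and commit it locally.
echo "hello" > app.txt
git add app.txt
git commit -q -m "Add app.txt"
branch="$(git symbolic-ref --short HEAD)"   # 'master' or 'main', depending on configuration

# Publish the commit. -u links the local branch to its remote counterpart.
git push -q -u origin "$branch"

# Later on: pull the latest commits before pushing new work.
git pull -q

# Each commit records its ID, its author, and its message.
git log -1 --format='%H %an <%ae> %s'
```

With a real hosting service, the only difference is that the clone command receives the repository's URL instead of a local path.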
🎯Benefits of a Version Control System
In the past, versioning meant saving the same folder as Version1, Version2, Version3, and so on, which resulted in multiple full copies. For example, 20 MB (Version1) + 30 MB (Version2) = 50 MB of disk space used.
🔍Example:
Version1 has 1000 lines of code, using 20 MB of storage.
In Version2, only 4 lines are changed, taking up 1 MB of space.
The VCS stores just the 1 MB of changes and reuses everything else.
It therefore needs about 20 MB + 1 MB = 21 MB instead of the 50 MB taken by two full copies, saving about 29 MB of storage.
Version Control Systems (VCS) achieve this by recording a "snapshot" of the staging area at each commit without storing unchanged content again.
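One way to observe this saving in Git, as a sketch: a file whose content did not change between two commits keeps the same object ID in both, which means it is stored only once. The file names and contents below are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Version 1: two files.
echo "large, stable file" > stable.txt
echo "draft" > changing.txt
git add stable.txt changing.txt
git commit -q -m "Version 1"

# Version 2: only one of the files changes.
echo "final" > changing.txt
git add changing.txt
git commit -q -m "Version 2"

# Look up the object ID of each file in both versions.
stable_v1="$(git rev-parse HEAD~1:stable.txt)"
stable_v2="$(git rev-parse HEAD:stable.txt)"
changing_v1="$(git rev-parse HEAD~1:changing.txt)"
changing_v2="$(git rev-parse HEAD:changing.txt)"

echo "stable.txt:   $stable_v1 -> $stable_v2"       # same ID: stored once
echo "changing.txt: $changing_v1 -> $changing_v2"   # new ID: new content stored
```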
- 📝Note: In a CVCS, developers usually make changes and commit them directly to the repository. Git uses a different approach: a commit does not automatically include every modified file. Instead, Git looks at the staging area, and only the changes staged there are included in the commit.
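The note above can be checked with a small experiment. This is a sketch with hypothetical file names (a.txt, b.txt): both files are modified, but only one is staged before the commit.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "one" > a.txt
echo "one" > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

# Modify both files, but stage only a.txt.
echo "two" > a.txt
echo "two" > b.txt
git add a.txt
git commit -q -m "Update a.txt only"

# Files included in the last commit: only a.txt.
committed="$(git diff-tree --no-commit-id --name-only -r HEAD)"
echo "Committed: $committed"

# b.txt is still modified in the working directory, waiting to be staged.
pending="$(git status --short)"
echo "Pending:$pending"
```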
🏗️Understanding Git Architecture and Workflow Elements
🗄️Repository
A place where all your code is stored, similar to a folder on a server.
It is related to one product or project.
Changes are specific to that particular repository.
🖥️Server
Hosts the remote repositories, for example a VM you manage or a service like GitHub.
Holds metadata.
🗂️Metadata
Includes commit ID, messages, and more.
Similar to the information stored with photos on your phone.
💻Working Directory / Workspace
This is where you view and edit files.
You can work on one branch at a time.
🏷️Tag
Assigns a meaningful name to a specific version.
Once a tag is created, it remains unchanged even if new commits are made.
📸Snapshots
Show data at a specific moment in time.
Stored efficiently: content that has not changed is reused from earlier snapshots, not copied again.
Similar to taking a photo with your phone.
📝Commit
Saves the staged changes in the local repository.
Creates a 40-character hexadecimal commit ID.
The ID is a SHA-1 checksum (also called a SHA-1 hash) of the commit's content.
Even a small change (like a dot) alters the commit ID.
Helps track changes.
🔐SHA-1 Hash: Suppose VM1 has 1000 lines of code, and Git generates a checksum for it, shortened here to 2141876 for illustration (a real SHA-1 hash has 40 hexadecimal characters). VM2 has the exact same code, so its checksum is also 2141876. If VM2 adds just 2 lines, Git recalculates the checksum, which might become 2141896. This is how Git detects even the smallest modifications.
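The same behaviour can be observed with git hash-object, which prints the ID Git would assign to a piece of content. This is a small sketch; the sample text is made up, and the variable names only mirror the VM1/VM2 example.

```shell
# Identical content always produces the identical ID.
id_vm1="$(printf 'line 1\nline 2\n' | git hash-object --stdin)"
id_vm2="$(printf 'line 1\nline 2\n' | git hash-object --stdin)"

# Changing a single character (here, adding a dot) gives a different ID.
id_changed="$(printf 'line 1\nline 2.\n' | git hash-object --stdin)"

echo "VM1:     $id_vm1"
echo "VM2:     $id_vm2"
echo "Changed: $id_changed"
```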
🆔Commit ID / Version ID / Version
Used to identify each change.
Helps track who modified the file.
📤Push
The push operation copies changes from a local repository to a remote or central repository.
It publishes your local commits so that other contributors can get them.
📥Pull
The pull operation fetches changes from a remote repository and merges them into your current local branch.
It is used to keep your local repository in sync with the remote.
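To see push and pull working together, here is a sketch with two developers sharing one repository. The local bare repository central.git stands in for the remote server; the folder names dev_a and dev_b, the file notes.txt, and the identity values are hypothetical.

```shell
work="$(mktemp -d)"
cd "$work"
git init -q --bare central.git

# Developer A clones, commits, and pushes.
git clone -q central.git dev_a
cd dev_a
git config user.name "Dev A"
git config user.email "a@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q -u origin "$(git symbolic-ref --short HEAD)"
cd ..

# Developer B clones the same repository and receives A's commit.
git clone -q central.git dev_b

# A pushes one more commit.
cd dev_a
echo "v2" >> notes.txt
git commit -q -am "Update notes"
git push -q
cd ..

# B pulls: fetch the new commit and merge it into the current branch.
cd dev_b
git pull -q
cat notes.txt
```

After the pull, both working copies contain the same two lines and point at the same commit.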
🌿Branch
A repository can have multiple branches for different tasks.
Each task has its own branch.
Branches are usually merged back once their task is complete.
Useful for parallel development.
You can create a branch from another branch.
Changes are specific to that branch.
The default branch is traditionally 'master'; GitHub and some other services now use 'main' by default.
Untracked files created in the workspace stay visible when you switch branches. Once committed, a file belongs to the branch it was committed on.
The master branch contains all versions (e.g., V1, V2, V3…).
Best practice: Develop on feature branches and merge all finished changes into the master branch.
A new branch starts from the branch it was created from: for example, 'Feature_branch1_01' starts with the content of 'Feature_branch1', not 'master'.
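The branch behaviour described above can be sketched as follows. The branch names come from the example in the text; the files app.txt and feature.txt and the identity values are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
default_branch="$(git symbolic-ref --short HEAD)"   # 'master' or 'main', depending on configuration

# Create a feature branch and commit on it.
git checkout -q -b Feature_branch1
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# A branch created now starts from Feature_branch1, so it already has feature.txt.
git checkout -q -b Feature_branch1_01
on_sub_branch="$(ls)"

# The default branch does not have feature.txt until the feature is merged.
git checkout -q "$default_branch"
before_merge="$(ls)"
git merge -q Feature_branch1
after_merge="$(ls)"

echo "Feature_branch1_01: $on_sub_branch"
echo "Before merge:       $before_merge"
echo "After merge:        $after_merge"
```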
🚀Advantages of Git
Free and open-source.
Fast and lightweight — most operations are local.
Example: Copying 5 GB of data from your PC to a USB drive takes time because it uses external hardware. But copying the same data from the C drive to the D drive is much faster since it's within the same system. Git works the same way — local commits are fast because they don’t rely on external servers.
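Integrity: Uses SHA-1 cryptographic hashes to name and identify objects, so corrupted or altered content is detectable.
No need for powerful hardware.
Easier branching: Creating a new branch is nearly instant, because Git adds a pointer to an existing commit instead of copying the code.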
📂Types of Git Repositories
Bare Repository (Central Repo):
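Has no working directory; contains only Git's data.
Used only for storing and sharing.
Central repositories are typically bare.
Non-Bare Repository (Local Repo):
Has a working directory, so it is used for modifying files.
Local repositories used for development are non-bare.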
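The difference between the two types is easy to inspect. This is a minimal sketch; the folder names central.git and local are hypothetical.

```shell
work="$(mktemp -d)"
cd "$work"

# Bare repository: only Git's data, no working directory.
git init -q --bare central.git
bare_flag="$(git -C central.git rev-parse --is-bare-repository)"

# Non-bare repository: a working directory plus a hidden .git folder.
git init -q local
nonbare_flag="$(git -C local rev-parse --is-bare-repository)"

echo "central.git is bare: $bare_flag"      # true
echo "local is bare:       $nonbare_flag"   # false

ls central.git   # Git's internal files (HEAD, config, objects, refs, ...)
ls -a local      # only .git until you add files of your own
```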
🧠 Final Thoughts
This article introduces Git, a powerful tool for software configuration and source code management, which enables efficient version control and collaboration among developers. It highlights the evolution from Local to Centralized (CVCS) and finally to Distributed Version Control Systems (DVCS) like Git. Key differences between CVCS and DVCS are outlined, emphasizing Git's advantages such as offline access, faster operations, easier branching, and enhanced storage efficiency. The workflow through Git's stages—Working Directory, Staging Area, and Local Repository—is explained along with essential Git concepts like commits, branches, and repositories. The article also underscores the benefits of Git's decentralized nature, security features, and its ease of use for parallel development.
---
- Written by Pankaj Roy | DevOps & Cloud Enthusiast
