Chapter 3: Mastering Git: Installation, Architecture, and Collaborative Workflow for Developers
In this chapter, we will delve into the fundamentals of Git, one of the most popular Version Control Systems (VCS). You'll learn what a VCS is, why it's essential for developers, and how Git is used in real-world projects. We will guide you through Git installation, its architecture, and various essential Git commands and operations. Furthermore, we'll explore Git’s branching strategies, project review processes, and troubleshooting common issues. By the end of this chapter, you'll have a strong foundation for managing code changes efficiently in collaborative environments.
Topics Covered in Chapter 3:
Introduction
What is Version Control Software (VCS) and Types of VCS
Git Software Installation
Git Project Architecture
Git Commands and Operations
Git Command Execution
a) Using the Command Line
b) Using IDEs like Eclipse, Spring Tool Suite (STS), IntelliJ IDEA
Git Account Creation
a) Public Repository Creation
b) Private Repository Creation
Git Folder Structure
Git Branching Strategy
Developer Strategy
Master Branch
Release
Hotfix
Fourth
Project Review Process (Code Merge and Pull Request Review)
Real-Time Git Problems and How to Fix Them
Introduction to Git
Git is a distributed Version Control System (VCS) created by Linus Torvalds in 2005, originally for managing the development of the Linux kernel. It is currently maintained by Junio Hamano. Git has become an indispensable tool for developers due to its ability to efficiently track code changes, manage collaboration across teams, and maintain a detailed history of every code modification.
Git is widely used for:
Tracking code changes: It helps to monitor the history of changes made to the project over time.
Tracking contributions: It records who made each change, making it easy to follow the contributions of different developers.
Facilitating collaboration: Multiple developers can work together, even on different parts of the project, with Git merging their work and flagging any conflicting edits for manual resolution.
The Problem Developers Face Without VCS
Consider a developer working on their local machine, writing code based on specific requirements. These requirements could come from a client, the person or entity who provides the specifications for the software. As the developer works, they might get feedback from the client asking for modifications or improvements to the code.
Without a version control system, tracking every single change made to the code can become incredibly difficult. Imagine a scenario where the developer makes multiple changes to the same file—each time trying to meet the new client requirements. Over time, it becomes harder to remember what was changed, when, and why. Additionally, different developers may work on various modules within the same project (e.g., dev1 working on module 1, dev2 on module 2), which further complicates keeping track of changes.
Many file managers and browsers handle a similar problem for everyday files: when you copy or download a file whose name already exists in the same directory, they rename the new file instead of overwriting the old one. People often rely on this, or on manual renaming, to keep multiple versions of the same image, document, or other file.
Example of OS Naming Convention:
Let’s say you have a file named Sample.png on your system.
First File: You save the file as Sample.png.
Second File: Now, if you try to save another file with the exact same name (Sample.png), the operating system automatically renames the new file to something like Sample (1).png.
Third File: If you save yet another file with the same name, the OS will name it Sample (2).png, and so on.
This method ensures that the original file is preserved, and newer versions of the file are also saved without the need for you to manually rename them.
Real-World Example:
Old file: Sample.png (created on Monday).
New file: Sample (1).png (created on Tuesday after edits).
In short, this naming convention avoids accidentally overwriting older versions, but it records nothing about what changed or why, and the copies pile up quickly. A Version Control System like Git handles the same problem more systematically: each saved state of the project is recorded as a commit with a unique identifier, an author, a date, and a message, so a file keeps a single name while its earlier versions remain recoverable.
To address this issue, we need a Version Control System (VCS), which helps developers track and manage changes across different versions of the code.
What is Version Control Software (VCS)?
Version Control Software (VCS) is a system that records changes made to files or sets of files over time so that you can recall specific versions later. This is crucial for managing changes in software projects, especially when multiple people work on the same codebase.
For example, a piece of software typically goes through successive released versions, as the early JDK releases did:
JDK 1.0
JDK 1.1
JDK 1.2
Every time a change is made to the source code, a new version is created. VCS allows developers to go back to previous versions of the code if needed, compare changes, and collaborate seamlessly with others.
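To make the idea of recalling earlier versions concrete, here is a minimal sketch using Git itself. It records two versions of one file as commits, lists them, compares them, and reads the older one back. The file name notes.txt, the identity values, and the scratch directory created by mktemp are assumptions made only for this illustration.

```shell
set -e

# Hypothetical scratch repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity, needed so that Git can record commits.
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Version 1 of the file.
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes, first version"

# Version 2 overwrites the same file; no renamed copy is needed.
echo "version 2" > notes.txt
git add notes.txt
git commit -q -m "Update notes after client feedback"

# List the recorded versions, newest first.
git --no-pager log --oneline

# Compare the two versions.
git --no-pager diff HEAD~1 HEAD -- notes.txt

# Read the older version without changing the working file.
old=$(git show HEAD~1:notes.txt)
commits=$(git rev-list --count HEAD)
echo "previous content: $old (total commits: $commits)"
```

The working file still holds the newest content; the older content is read straight from the history.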
Types of Version Control Systems (VCS)
There are three main types of Version Control Systems (VCS):
Local Version Control System (Local VCS)
Centralized Version Control System (Centralized VCS)
Distributed Version Control System (Distributed VCS)
1. Local Version Control System (Local VCS)
A Local Version Control System is used to manage and maintain file versions on a developer's local machine. This type of VCS keeps track of changes made to files without needing a central repository. Here’s a detailed explanation:
Key Features:
- File Versioning: The developer saves different versions of a file locally. For example, if a developer saves a file as File v1.0, the next version might be File v1.1. This is done manually by the developer, who can choose where to save each version.
Example Scenario:
Imagine a developer working on a project on a machine with multiple drives (C, D, E, F).
Initial Save: The developer saves the first version of their file (let's say an image or a document) as Sample.png in the C drive.
Next Change: After making changes, the developer intends to save the new version as Sample v1.1.png. However, they might accidentally save this version in another location, such as the D drive or the E drive.
Drawbacks:
Confusion in File Locations:
- Developers often forget where specific versions of their files are stored across different drives. This can lead to overwriting files unintentionally or copying from the wrong location, causing confusion and potential data loss.
Risk of Data Loss:
- If the developer's hard disk becomes corrupted, or if there are issues like a virus or malware, there is a high risk of losing all the saved versions of files. For example, if the hard drive fails, the developer could lose all their work permanently.
Accidental Deletion:
- Developers may inadvertently delete important files or previous versions, making it difficult to recover lost data. Since there is no centralized backup, retrieving accidentally deleted files can be a significant challenge.
Summary:
While Local VCS provides basic version tracking for individual developers, it is not suitable for collaborative projects or large codebases due to its limitations in tracking changes, preventing data loss, and managing files across multiple drives.
2. Centralized Version Control System (Centralized VCS)
To address the drawbacks of Local Version Control Systems (Local VCS), we have Centralized Version Control Systems (Centralized VCS). This model is designed to allow multiple developers to collaborate more effectively by maintaining a single repository that contains all versions of the project files.
Key Features:
Centralized Repository: In a Centralized VCS, all developers store their code in a central server (repository). This server is typically a high-speed and powerful computer dedicated to maintaining the codebase.
Workstation Setup:
Workstation/PC 1: Developer’s laptop 1.
Workstation/PC 2: Developer’s laptop 2.
Workstation/PC 3: Developer’s laptop 3.
Once developers finish writing their code, they push it to the central repository. This means that all code is maintained in one location, facilitating collaboration among developers.
Collaboration:
Developers can easily check out files from the central repository to their local machines to make changes.
After making changes, they can push those changes back to the central repository.
Common Centralized VCS software includes SVN (Subversion), Perforce, and others.
How It Works:
Check Out: This action involves taking code from the central repository to a developer’s local machine. It allows developers to work with the latest version of the project.
Push: This action involves sending code changes from a developer’s local machine back to the central repository. (SVN calls this operation a commit.)
Advantages:
Visibility: All developers have a clear understanding of what others are doing, which enhances collaboration.
Control: Administrators have full control over who can make changes, simplifying project management.
Disadvantages:
Single Point of Failure: If the central server experiences any issues, such as a hardware failure or network problems, it can halt collaboration entirely.
Network Dependency: Since every workstation reaches the central server over a network, developers cannot check out or push changes when that network slows down or fails.
Risk of Data Loss: If the hard disk of the centralized server becomes corrupted and proper backups are not maintained, there is a significant risk of losing all stored data.
Conclusion:
Centralized Version Control Systems have been the standard for many years and provide an effective way for teams to collaborate on code. However, their reliance on a single central server introduces risks that teams must manage through robust backup strategies and network reliability.
In summary, understanding the different types of Version Control Systems (Local VCS, Centralized VCS, and Distributed VCS) is crucial for effective software development. Each system has its strengths and weaknesses, and the choice of which to use often depends on the project's requirements and the team's collaboration needs.
Installation of Git Software
Installing Git software involves a few steps and understanding the different components that make up Git: the Git server and the Git client. Let’s break this down in detail.
1. Download Git Software
Before installation, you need to download the Git software. Git comes in two main components:
Git Server
Git Client
Git Server
The Git Server hosts the shared (remote) repository where a team's source code is stored and managed, so that multiple developers can collaborate on the same project. Git itself is distributed, so this server is a convention teams adopt rather than a requirement of the tool. Some popular Git hosting platforms include:
GitHub: A widely used platform that hosts a large volume of source code. It offers various features for version control and collaboration.
Bitbucket: Another popular Git hosting service, run by Atlassian. It used to support Mercurial repositories as well, but that support was removed in 2020.
GitLab: An open-source platform that provides a full DevOps lifecycle and is designed to help teams collaborate on code.
Example Scenario
Assuming you have a project for CityBank with various services such as cards, offers, loans, and payments, these services might be managed by different vendors like Infosys and TCS. The source code for each service would be stored in a Git server (e.g., GitHub or GitLab) as follows:
Cards: Source code
Payments: Source code
Offers: Source code
Loans: Source code
Developers access this central repository to maintain and update the source code for their respective services.
Connecting to the Git Server
To connect to the Git server, developers need specific credentials:
URL: A unique address pointing to the Git server, e.g., http://repo.citybank:9999
Username: Each developer has a unique username.
Password: Each developer has a unique password.
For example:
Developer 1:
Username: Rohit Gawande
Password: xxxxxxxxxx
Developer 2:
Username: Ronit Gawande
Password: xxxxxxxxxx
All developers work on the same codebase and collaborate by pulling the latest code from the Git server and pushing their changes back to the repository.
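As a rough sketch of what "connecting" looks like in practice, each developer clones the server repository once and Git remembers its address under the name origin. The CityBank URL in the text is only illustrative, so this example assumes a local bare repository (server.git) as a stand-in for the server; the folder names dev1 and dev2 are hypothetical too.

```shell
set -e

# Scratch area for the illustration.
work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the Git server here (assumption:
# in a real project this would be the URL shared by the team lead).
git init -q --bare server.git

# Each developer clones the same server repository into their own folder.
# Git may warn that the repository is empty; that is expected here.
git clone -q server.git dev1
git clone -q server.git dev2

# "origin" is the name Git gives to the server a clone came from.
cd dev1
git remote -v
origin_url=$(git config --get remote.origin.url)
echo "dev1 is connected to: $origin_url"
```

With a hosted server, the only change would be the address passed to git clone, plus whatever credentials the server asks for.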
Physical Location of Git Server
The Git server is not typically visible to users. It can be installed on cloud platforms like AWS, Azure, or any other data center. The actual infrastructure might be managed by the company or a third-party service provider.
When a developer joins a company, the team lead or manager shares the Git server URL, username, and password. Each developer uses this information to connect to the Git server and access the source code.
Git Client
The Git Client is a tool used to connect to the Git server. Installing the Git client allows developers to interact with the repository hosted on the Git server. The Git client comes with several tools:
Git Bash: A Bash shell for Windows that lets developers run Git together with common Unix-style commands.
Git GUI: A graphical user interface that enables developers to perform Git operations visually, making it easier to manage changes.
Git CMD: A Windows Command Prompt session with Git on the PATH, where developers run Git commands and supply the server URL and credentials when a command needs them.
Installation Process
On Windows, the Git client is typically provided as an installer (.exe) that can be run with just a few clicks; on macOS and Linux it is usually installed through a package manager. Either way, the process is user-friendly and gets developers up and running quickly.
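After installing, a quick check from any terminal (Git Bash, Git CMD, or a Linux/macOS shell) confirms that the client is available. This is only a small sanity check, not part of the installer itself.

```shell
# Show where the git executable was found on the PATH.
command -v git

# Print the installed version; the line starts with "git version".
version=$(git --version)
echo "$version"
```

If the first command prints nothing, the client is not on the PATH and the installation should be revisited.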
Summary of Components
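Git: The version control tool installed on the developer's machine (the client) that creates and interacts with repositories.
GitHub: A hosting service (the server side) where remote repositories are maintained; Bitbucket and GitLab fill the same role.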
Git Architecture
The Git architecture is designed to manage code changes in a structured way, facilitating collaboration among developers. It primarily revolves around three main areas: the Working Area, the Stage Area, and the Local Repository. Here's a detailed breakdown:
1. Working Area
The Working Area (also known as the Working Directory) is where developers write and edit their code on their local machines. It can be any project folder, for example on the C:\, D:\, or E:\ drive on Windows, or under /home/user/ on Linux. This area allows developers to maintain their source code locally before it is versioned or shared with others.
Key Functions:
Developers create, edit, and delete files in this area as part of their daily tasks.
It is the initial place where code changes occur before they are prepared for version control.
2. Stage Area
The Stage Area (also called the Staging Area or Index) acts as a buffer between the Working Area and the Local Repository. It holds changes that are ready to be committed.
How It Works:
When developers are satisfied with their code changes, they can stage them using the command:
git add <filename>
This command records the current content of the modified files in the Stage Area, marking them for inclusion in the next commit. The files themselves remain in the Working Area; staging only signals that the developer intends to include these changes.
Key Functions:
It allows developers to select which changes to include in the next commit.
Multiple files can be staged at once, enabling selective commits based on the developer's needs.
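The selective staging described above can be sketched as follows. Two files are created but only one is staged, and git status shows the difference. The file names cards.txt and loans.txt and the identity values are assumptions for this example.

```shell
set -e

# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Two new files in the Working Area.
echo "card logic" > cards.txt
echo "loan logic" > loans.txt

# Stage only one of them.
git add cards.txt

# Short status: "A" marks a staged new file, "??" an untracked one.
git status --short

# Names of the files currently in the Stage Area.
staged=$(git diff --cached --name-only)
echo "staged: $staged"
```

A commit made at this point would include cards.txt only; loans.txt stays in the Working Area until it is staged as well.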
3. Local Repository
The Local Repository is where all committed changes are stored. It resides within the .git directory in the Working Area and contains the complete history of commits for the project.
How It Works:
After staging the code, developers can commit the changes to the Local Repository with a command like:
git commit -m "Commit message describing changes"
This creates a new commit in the Local Repository, effectively capturing the state of the code at that point in time.
Key Functions:
It maintains a complete history of all changes, allowing developers to revert to previous states or understand the evolution of the codebase.
Developers can manage their changes locally without affecting the remote repository until they push updates.
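Here is a small sketch of both points: the history is kept locally even though no remote is configured, and an earlier state of a file can be brought back from it. The file name app.txt and the identity values are assumptions made for the illustration.

```shell
set -e

# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "first draft" > app.txt
git add app.txt
git commit -q -m "First draft"

echo "second draft" > app.txt
git add app.txt
git commit -q -m "Second draft"

# The history lives in the .git directory next to the working files.
ls -d .git

# No remote is configured, yet both commits are recorded locally.
remotes=$(git remote)
count=$(git rev-list --count HEAD)
echo "remotes: '$remotes', commits: $count"

# Bring back the file as it was one commit ago.
git checkout -q HEAD~1 -- app.txt
restored=$(cat app.txt)
echo "restored content: $restored"
```

The checkout here only changes app.txt in the Working Area and Stage Area; both commits remain in the history.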
Important Note on Pushing Changes
From the Working Area, developers cannot directly push changes to the remote repository (like GitHub). The flow requires that changes first be staged in the Stage Area, then committed to the Local Repository, and finally pushed to the remote repository.
The correct sequence of commands is as follows:
Stage the changes:
git add <filename>
Commit the changes:
git commit -m "Your message"
Push the committed changes to the remote repository:
git push origin <branch-name>
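Put together, the three steps look roughly like this. Since no real server is available in a standalone example, a local bare repository (remote.git) is assumed as a stand-in for GitHub or Bitbucket; the file name payments.txt and the identity values are hypothetical as well.

```shell
set -e

work=$(mktemp -d)
cd "$work"

# Stand-in for the remote repository (assumption for this sketch).
git init -q --bare remote.git
git clone -q remote.git project
cd project
git config user.name "Example Dev"
git config user.email "dev@example.com"

# A change made in the Working Area.
echo "payment service" > payments.txt

git add payments.txt                     # 1. stage
git commit -q -m "Add payments module"   # 2. commit
branch=$(git symbolic-ref --short HEAD)  # name of the current branch
git push -q origin "$branch"             # 3. push

# The remote now holds the same commit as the local branch.
local_id=$(git rev-parse HEAD)
remote_id=$(git --git-dir="$work/remote.git" rev-parse "$branch")
echo "local:  $local_id"
echo "remote: $remote_id"
```

The branch name is read from the repository rather than hard-coded, because the default branch name depends on how Git is configured.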
Pushing to Remote Repository
Once changes are committed to the Local Repository, developers can push their changes to a remote repository (like GitHub or Bitbucket) using the command:
git push origin <branch-name>
This command sends all the committed changes from the Local Repository to the remote repository. However, it requires the following details:
URL of the remote repository
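Credentials: a username with a password or, more commonly today, an access token or SSH key. GitHub, for example, no longer accepts account passwords for Git operations over HTTPS.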
Pulling from Remote Repository
Conversely, if developers want to incorporate changes made by others in the remote repository into their Local Repository, they use the command:
git pull origin <branch-name>
This fetches the latest changes from the remote repository and merges them into the current branch of the Local Repository, updating the files in the Working Area and keeping the developer’s code up-to-date.
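A minimal sketch of a pull between two developers follows. It assumes a local bare repository (remote.git) in place of a hosted server, and the developer folders, the file name offers.txt, and the identity values are all made up for the example.

```shell
set -e

work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git

# Developer 1 publishes a first commit.
git clone -q remote.git dev1
cd dev1
git config user.name "Dev One"
git config user.email "dev1@example.com"
echo "offers v1" > offers.txt
git add offers.txt
git commit -q -m "Add offers module"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Developer 2 clones the project at this point.
cd "$work"
git clone -q remote.git dev2

# Developer 1 publishes an update.
cd "$work/dev1"
echo "offers v2" > offers.txt
git add offers.txt
git commit -q -m "Update offers"
git push -q origin "$branch"

# Developer 2 pulls to receive the update.
cd "$work/dev2"
git pull -q origin "$branch"
pulled=$(cat offers.txt)
echo "dev2 now sees: $pulled"
```

Because Developer 2 made no commits of their own, the pull simply moves their branch forward; if both had changed offers.txt, Git would ask for the conflict to be resolved.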
Summary of Workflow
Working Area: Write and edit code.
Stage Area: Use git add to stage changes.
Local Repository: Use git commit to commit staged changes.
Push to Remote: Use git push to send changes to the remote repository.
Pull from Remote: Use git pull to fetch and merge changes from the remote repository.
Conclusion
Git's architecture promotes a structured workflow for version control, allowing developers to manage their code efficiently. By understanding the roles of the Working Area, Stage Area, and Local Repository, developers can effectively navigate the process of writing, staging, committing, pushing, and pulling code changes, leading to smoother collaboration and project management.
Additionally, check out my other related posts:
Chapter 1: Understanding the Fundamentals of Programming: A comprehensive introduction to programming principles that lays the groundwork for future development.
Chapter 2: Fundamentals of Java: An in-depth look at Java, one of the most widely-used programming languages, including key concepts and practical applications.
Don’t miss out on my ongoing series:
DSA (Data Structures and Algorithms): Dive into essential algorithms and data structures that are fundamental to programming.
Full Stack JavaScript Development: Explore the journey of building dynamic web applications using JavaScript and its frameworks.
Rohit Gawande
Full Stack Java Developer | Blogger | Coding Enthusiast