Written by: Nurul Hasan
For: Fellow C++ learners and curious minds
Special thanks to: ChatGPT.

Let’s understand the full behind-the-scenes system behavior — from writing a C++ program, compiling it, linking it, and finally executing it. This will include how the operating system handles processes, file descriptors (FDs), fork(), execve(), and how everything fits together during execution.

🚀 Starting Point: You Open a Terminal

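You open a terminal emulator (e.g., gnome-terminal), which starts a shell (e.g., bash or zsh):

The OS creates a new process for the terminal application, and the terminal starts the shell as a child process.
Each of these processes is assigned: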
- A PID (Process ID)
- Virtual memory
- File Descriptor table
- Other metadata: parent PID, UID, GID, etc.
FDs 0, 1, 2 are initialized:
- FD 0 → stdin (keyboard)
- FD 1 → stdout (terminal output)
- FD 2 → stderr (terminal error output)

✅ At this point, you have an interactive shell running.

✍️ You Write Your Code

Let’s say you have the following C++ files:

main.cpp          → contains `main()`
logic.cpp         → contains `doSomething()`
utils.cpp         → contains `helperFunction()`

Each .cpp file, together with the headers it includes, forms a translation unit: a piece of the program that is compiled on its own and linked with the others later.

🏗️ Step 1: Compilation — Translating Code to Object Files

You run:

g++ -c main.cpp    → main.o
g++ -c logic.cpp   → logic.o
g++ -c utils.cpp   → utils.o

For each g++ -c command:

The shell forks a child process.
That child calls execve() to launch the compiler driver (the g++ binary).
g++ (through the helper programs it runs, such as the compiler proper and the assembler):
- Opens the source file using open("file.cpp", O_RDONLY) → maybe FD 3
- Creates the .o file using open("file.o", O_WRONLY | O_CREAT | O_TRUNC, 0644) → maybe FD 4
- Compiles the source to machine code and writes it to the .o file.
The process then exits, and the shell, which was waiting for it, shows the prompt again.

✅ Each compilation step is independent and sequential unless run using make -j (we’ll cover parallelism later).

🔗 Step 2: Linking — Creating the Final Executable

Now you run:

g++ main.o logic.o utils.o -o myApp

Here’s what happens:

Shell forks a new process.
Child execs g++, now running the compiler driver.
Internally, g++ calls the linker (usually ld) as a subprocess.
ld:
- Opens all .o files
- Resolves all symbols (e.g., where is doSomething() defined?)
- Writes the final single binary executable file myApp

To create myApp, the linker does:

open("myApp", O_WRONLY | O_CREAT | O_TRUNC, 0755) → maybe FD 4

FD 4 now points to the output binary.
It writes the ELF headers, code, data segments into myApp.

✅ This does not run your app, just builds it.

🏃 Step 3: You Run the Program

You type:

./myApp

Here’s what happens under the hood:

Shell calls fork() → creates a child process.
The child process calls:
```
 execve("./myApp", ...)
```

This tells the OS: “Replace my current program (shell) with this new one (myApp).”

What the OS does next:

Opens the file myApp
Parses the ELF (Executable and Linkable Format) binary
Maps sections into memory:
- .text → executable code
- .data, .bss → global/static data
- Stack and heap
Releases its handle on the binary file (the mappings keep the file contents reachable)
Initializes:
- Registers (including program counter)
- argc, argv, and envp
Jumps to the program's entry point; the runtime startup code (and the dynamic loader, for dynamically linked programs) runs first and then calls main() in your code

✅ Your compiled C++ program is now running as a new process, with:

A new PID (assigned at fork(); execve() keeps it)
Its own virtual memory
Its own file descriptor table, which starts as a copy of the shell's
FDs 0, 1, 2 still point to the terminal (same as parent)

💡 Summary So Far

✅ From the moment you opened the terminal to the point your C++ program runs:

Terminal runs as a process with stdin/stdout/stderr.
You compile your .cpp files into .o files → each g++ -c is a new process.
You link .o files using g++ → internally calls ld linker to produce the final binary.
Running ./myApp:
- Shell forks → child
- Child execs → replaces memory with new program
- OS maps code/data/stack/heap
- Process runs main() with inherited FDs (0, 1, 2)

🔑 What is an FD (File Descriptor)?

A File Descriptor (FD) is a simple integer number that the Linux kernel uses to represent an open file, socket, or I/O resource within a process.

Think of it like a small ticket stub that points to an open file or resource — the process uses this stub to read/write to that file, terminal, socket, etc.

Each process has its own FD table, and each entry in that table points to an open file object in the kernel.

📂 Understanding FD Behavior During Compilation vs Execution

Let’s now make this mental model more precise using a drawer analogy.

🧱 During Compilation (`g++ main.cpp -o myApp`)

When we pass main.cpp to the compiler:

The compiler does not grab the file's content directly. It asks the kernel to open the file and gets back a token (a file descriptor). It is like putting the file into a drawer, and the drawer gives us a number (FD) that links to the file in the background.

So:

main.cpp is just plain text, not executable.
It is data for the compiler to read.
The compiler asks the OS to open it:
```
  open("main.cpp", O_RDONLY) → FD 3
```
The compiler reads from FD 3.

Since it’s non-executable data, the OS treats it just like a text file:

It allocates an FD.
That FD links to the file on disk.
The compiler reads through that FD — like reading a piece of paper through a slot in a locked drawer.

✅ Because it's not executable, the OS doesn't do any memory mapping or ELF parsing. It’s just:
"Here's the drawer handle (FD), read your text."

⚙️ During Execution of an Executable (e.g., `./myApp`)

Now the behavior changes, because we're executing the file, not just reading it as data.

During execve(), the kernel:

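Opens myApp internally (conceptually like getting an FD, though this handle lives inside the kernel and does not take a slot in the process's FD table)
Parses the ELF header to understand how to map it into memory
Maps the binary's sections from the file into virtual memory, much as mmap() would:
- .text → executable code
- .data, .bss → global/static variables
Stack, heap are initialized
Drops its open handle once the mappings are set up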

✅ Once the mappings exist, the open handle is no longer needed. The file itself still is: pages are loaded from it on demand as the program touches them.

So the model holds:

The file is opened, mapped into memory, and then the handle is closed (the file is not deleted).
Execution happens from memory, not through the FD.

The FD was only the handle to read and load the code, not to run it directly.

🎬 Now: What About Video Files and VLC?

Let’s apply the same principle to video files like .mp4.

When you double-click a file like movie.mp4, the following happens:

The desktop environment resolves the default program for .mp4 → maybe VLC.

It forks a child process, and the child runs:

 execve("/usr/bin/vlc", ["vlc", "movie.mp4"], ...)

That child is now the VLC process.

VLC itself does:

 int fd = open("movie.mp4", O_RDONLY); // gets FD 3 maybe

FD 3 is now pointing to the raw binary data of the video file.

VLC:

Reads that data using the FD
Decodes the video and audio in memory
Sends output to:
- The screen (via GPU)
- The speakers (via ALSA or PulseAudio)

✅ The video file is never directly executed — it is read via FD as data, just like the .cpp source file during compilation.

🧠 Final Perspective: FD as a Drawer Handle

File Type	Treated As	Opened via FD?	Executed via FD?	FD Closed After Use?
`.cpp`	Plain data	✅ Yes	❌ No	✅ After read
Executable	Binary code	✅ Yes	❌ Not directly	✅ After `mmap()`
`.mp4`	Media data	✅ Yes	❌ No	✅ After playback

✅ In all these cases, the FD is just the drawer handle — a temporary link to a file, used for reading or loading, but not kept open unnecessarily.

🧠 What Is a Process?

A process is an instance of a running program — it is how the operating system runs, isolates, and manages code execution.

🔧 Components of a Process

When a process is created, the OS assigns and manages several key components:

Component	Description
PID (Process ID)	Unique identifier for the process.
Parent PID (PPID)	The PID of the process that created (forked) it.
Virtual Memory Space	Includes sections for code (`.text`), data (`.data`, `.bss`), heap, stack.
File Descriptor Table	A table of open file descriptors (FDs like stdin, stdout, stderr).
Execution Context	CPU state: program counter, registers, etc.
Environment Variables	User and system-defined vars (e.g., `$PATH`, `$HOME`).
Process Metadata	Priority, state, user ID, resource usage, etc.

🏁 How Is a Process Created?

Let’s use a real example involving your terminal and g++.

📟 Step 1: You Open a Terminal

Your desktop environment (like GNOME, KDE) is already running.
- It has a process running the GUI and background services.
- You click the terminal icon (say, GNOME Terminal).
GNOME launches the terminal using, roughly:
```
 pid_t pid = fork();      // Spawns a new child process
 if (pid == 0)            // Only the child runs the next line
     execve("/usr/bin/gnome-terminal", ...);  // Loads terminal program into memory
```
- This new process now becomes the terminal.
- The parent process is the desktop environment (e.g., gnome-shell).
- It inherits:
  - Environment variables
  - File descriptors (like for the display, sound)
  - Working directory
The child’s memory is replaced with the terminal binary using execve().

🧠 execve() = "execute a program, replacing this process’s memory"

The terminal process is now running with its own:
- PID (e.g., 2205)
- FD table
- Allocated memory for the terminal program

🛠️ Step 2: You Run `g++ main.cpp -o app`

Now you're inside the terminal and you type:

g++ main.cpp -o app

Here’s what happens:

The terminal is running a shell (bash, zsh, etc.).
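The shell receives the command and calls, roughly:
```
 pid_t pid = fork();      // Creates a new child process
 if (pid == 0)            // In the child only
     execve("/usr/bin/g++", ["g++", "main.cpp", "-o", "app"], ...);
 // The parent (shell) waits for the child with waitpid()
```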
- A new child process is created to run g++.
- It gets its own:
  - PID (e.g., 2210)
  - FD table (inherits 0, 1, 2 — stdin, stdout, stderr)
  - Virtual memory space
  - Environment variables
Once g++ finishes compiling:
- The child process calls exit().
- The shell, which was waiting in waitpid(), collects the exit status, and the child is removed from the process table.
- Control returns to the shell (the parent process).

✅ So now you're back at the shell prompt.

🚀 Step 3: You Run the Compiled Program `./app`

Now you execute:

./app

Again:

The shell (bash) calls, roughly:

 pid_t pid = fork();    // Create child
 if (pid == 0)          // In the child only
     execve("./app", ["./app"], ...);   // Replace child memory with app binary

The ./app process is now a new process:
- Own PID (e.g., 2215)
- Memory layout initialized:
  - .text, .data, .bss, heap, stack
- FDs 0, 1, 2 connected to your terminal
- ELF binary loaded via FD and mmap() into memory
- FD used for the binary is closed after loading
Once your program finishes:
- It calls exit()
- Process dies
- Control returns to the shell

🧾 Recap of Flow:

Step	Action	Result
Open Terminal	Parent (GUI) forks & execs	New process: Terminal GUI
Type `g++ ...`	Shell forks & execs	New process: `g++` compiler runs
Type `./app`	Shell forks & execs	New process: Your app runs

In all cases:

fork() creates a copy of the parent process

execve() replaces the child’s memory with the new program

The child gets its own PID and a copy of the parent's FD table at fork(); execve() then gives it a fresh memory image

When child exits, control goes back to the parent (usually your shell)

So far we have discussed sequential processing. Let's now discuss parallel processing.

⚙️ What Is Parallel Processing?

Parallel processing means:

Running multiple processes at the same time, so work gets done concurrently — ideally using multiple CPU cores.

This is useful when:

Tasks are independent
You have multi-core hardware
You want to speed things up (compilation, server requests, file processing)

🧪 Example 1: Compiling Multiple `.cpp` Files in Parallel

Earlier, you compiled files like this:

g++ -c main.cpp
g++ -c logic.cpp
g++ -c utils.cpp

This ran one after another, wasting time if you had multiple cores.

✅ Using a Makefile with Parallelism

🎯 Makefile

# Recipe lines must start with a tab character, not spaces.
all: app

app: main.o logic.o utils.o
	g++ main.o logic.o utils.o -o app

%.o: %.cpp
	g++ -c $< -o $@

🧨 Run It With Parallelism:

make -j3

This means:

Make can spawn up to 3 separate g++ processes at once
These processes run in parallel, each on its own core if available

🔄 How it works internally:

Make forks a process for each file:

 if (fork() == 0) execve("/usr/bin/g++", ["g++", "-c", "main.cpp", ...]);
 if (fork() == 0) execve("/usr/bin/g++", ["g++", "-c", "logic.cpp", ...]);
 if (fork() == 0) execve("/usr/bin/g++", ["g++", "-c", "utils.cpp", ...]);

Each process has:
- Its own memory
- FD table (stdin/out/err, plus source/output files)
- PID
The OS schedules these across CPU cores.

✅ Result: Faster builds using real parallelism.

🌐 Example 2: Web Server Handling Multiple Clients in Parallel

Let’s say you build a simple HTTP server.

🛑 Sequential Model (Not Parallel):

while (true) {
    int client_fd = accept(server_fd, ...);
    handle_request(client_fd); // Blocks: only one client at a time
    close(client_fd);
}

Only one request at a time
Others must wait
Bad for performance

✅ Fork-Based Parallel Model

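while (true) {
    int client_fd = accept(server_fd, ...);
    if (fork() == 0) {
        // In child process
        close(server_fd);          // Child does not accept new clients
        handle_request(client_fd);
        close(client_fd);
        _exit(0); // Exit child
    }
    close(client_fd); // Parent closes its copy
    // A real server must also reap finished children (waitpid() or SIGCHLD)
}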

Each request handled by a new process
OS runs child processes in parallel
Each has its own memory, FD, stack

✅ This process-per-connection model is what PostgreSQL uses for its backends, and Apache's prefork mode is a pre-forked variant of the same idea.

🔗 Example 3: Pipe with Two Processes Running in Parallel

🧵 What Is a Pipe in Unix?

A pipe is a one-way communication channel used to pass data from one process to another, without using intermediate files.

Think of it like this:

🧪 Imagine process A is pouring data into a hose, and process B is drinking from the other end.

📦 A Pipe Connects:

STDOUT (output) of one process
STDIN (input) of another process

This allows:

One process to write data into the pipe
Another process to read that data in real-time

🔧 How Is a Pipe Created?

The shell or program uses the pipe() system call:

int pipefd[2];
pipe(pipefd); // pipefd[0] = read end, pipefd[1] = write end

Then it forks two processes and assigns:

Process 1: writes to pipefd[1]
Process 2: reads from pipefd[0]

🧠 Pipes Work Like:

A shared buffer in the kernel
Data written to the pipe by one process gets buffered
Data is then read by the other process

🔗 Example: `yes hello | head -n 5`

Let’s use this example now to see a pipe in action:

yes hello | head -n 5

🛠 Step-by-Step Breakdown

✅ Step 1: The Shell Sets Up a Pipe

int pipefd[2];
pipe(pipefd); // pipefd[1] is write end, pipefd[0] is read end

✅ Step 2: The Shell Forks Two Child Processes

🔹 Child Process 1 — `yes hello`

Writes infinite "hello\n" to stdout
Shell redirects its stdout → pipefd[1]

So now, yes is writing into the pipe.

🔹 Child Process 2 — `head -n 5`

Reads from stdin
Shell redirects its stdin ← pipefd[0]

So now, head is reading from the pipe.

🧪 What Happens During Execution?

yes hello starts spamming:
```
 hello\nhello\nhello\n...
```
into the pipe.
head -n 5 reads from the pipe, line by line.

It reads:
```
 hello
 hello
 hello
 hello
 hello
```
Once head reads 5 lines:
- It closes its end of the pipe.
- It exits.
yes is still trying to write, but suddenly:
- The read end of the pipe is gone.
- The kernel sends a SIGPIPE signal to yes.
- yes is terminated.

✅ What You See on Your Terminal

Only what head prints:

hello
hello
hello
hello
hello

Even though yes was generating infinite lines, only 5 made it through — because head controlled the read and exited after 5.

🧵 Process vs Thread

Both processes and threads represent independent flows of execution, but they differ in how they manage resources, isolation, and performance characteristics.

🔍 1. What Is a Process?

A process is an independent instance of a running program, managed by the OS.

🧱 Key Characteristics:

- Has its own memory space, completely isolated from other processes
- Has its own PID, FD table, and stack
- Created with `fork()` (heavyweight)
- Expensive to create and switch between
- Safer: crashing one process doesn’t affect others

🔍 2. What Is a Thread?

A thread is a lightweight unit of execution within a process. Multiple threads share the same memory.

🧱 Key Characteristics:

- Shares memory with other threads in the same process
- Each thread has its own stack, program counter, and registers
- Created with `pthread_create()` in C/C++, or `std::thread` in C++
- Fast to create and context switch
- Riskier: a crash in one thread can corrupt shared memory

🧪 Real-World Example: Web Server

Let’s say you build a server to handle client requests.

🔁 Using Processes:

int client = accept(...);
if (fork() == 0) {
    handle_request(client);
    exit(0);
}

Each client request spawns a new process
These processes run in parallel
Completely isolated
Expensive in terms of system resources

🧵 Using Threads:

int client = accept(...);
std::thread t(handle_request, client);
t.detach();

Each client handled by a new thread
Shares memory with other threads
Fast and efficient
More complex to handle safely (due to shared data)

📊 Comparison Table

Feature	Process	Thread
Memory	Separate address space	Shared address space
Creation cost	High (`fork`)	Low (`pthread_create` / `std::thread`)
Communication	Via IPC (pipes, sockets, shared memory)	Direct via shared variables
Crash impact	Isolated – doesn’t affect others	Shared fate – may crash the whole process
Use case examples	Chrome tabs, microservices, CLI commands	Server threads, GUI responsiveness, AI tasks
Scheduling	Managed by OS (context switch = expensive)	Managed by OS or language runtime

🧠 Analogy

🍱 Processes = Bento Boxes

Each box is self-contained.
Opening one doesn’t mess with the others.
More overhead, but safer.

🍜 Threads = Compartments in a Bowl

All in the same bowl (shared memory).
Fast to create new compartments (threads).
But if one spills, the whole bowl is messy.

✅ Summary

When to Use Processes	When to Use Threads
Need isolation and safety (e.g., running untrusted code)	Need speed and shared memory
Crashes should be contained	Threads must coordinate carefully (mutexes, locks)
Example: Browser tabs (separate PIDs)	Example: Game engine threads (render, audio, input)

Thank you for reading through this article. I'm currently learning and exploring some of these concepts myself, and while going through them, I thought it might be helpful to write things down in a way that’s easy to revisit and understand. If it helped you too, I’m really glad.

That’s all — just wanted to share what I’m learning. Thanks again for taking the time to read ❤️.

Inside the Operating System: A Journey Through Compilation, Execution, and Concurrency
