Inside the Operating System: A Journey Through Compilation, Execution, and Concurrency

Written by: Nurul Hasan
For: Fellow C++ learners and curious minds
Special thanks to: ChatGPT.
Let’s understand the full behind-the-scenes system behavior: writing a C++ program, compiling it, linking it, and finally executing it. This will include how the operating system handles processes, file descriptors (FDs), fork(), execve(), and how everything fits together during execution.
🚀 Starting Point: You Open a Terminal
You open a terminal emulator (e.g., gnome-terminal), which starts a shell (e.g., bash or zsh):
The OS creates a new process for the terminal application.
This process is assigned:
A PID (Process ID)
Virtual memory
File Descriptor table
Other metadata: parent PID, UID, GID, etc.
FDs 0, 1, 2 are initialized:
FD 0 → stdin (keyboard)
FD 1 → stdout (terminal output)
FD 2 → stderr (terminal error output)
✅ At this point, you have an interactive shell running.
✍️ You Write Your Code
Let’s say you have the following C++ files:
main.cpp → contains `main()`
logic.cpp → contains `doSomething()`
utils.cpp → contains `helperFunction()`
Each .cpp file is a translation unit: a part of the program that is compiled on its own and later linked with the others.
🏗️ Step 1: Compilation — Translating Code to Object Files
You run:
g++ -c main.cpp → main.o
g++ -c logic.cpp → logic.o
g++ -c utils.cpp → utils.o
For each g++ -c command:
The shell forks a child process.
That child calls execve() to launch the compiler driver (the g++ binary).
g++, or the helper programs it launches (the compiler proper and the assembler):
Opens the source file using open("file.cpp", O_RDONLY) → maybe FD 3
Creates the .o file using open("file.o", O_WRONLY | O_CREAT | O_TRUNC, 0644) → maybe FD 4
Compiles the source to machine code and writes it to the .o file.
Then it exits, and the shell, which was waiting for the child, shows the prompt again.
✅ Each compilation step is independent. Typed by hand, the steps run one after another; make -j can run them in parallel (we’ll cover parallelism later).
🔗 Step 2: Linking — Creating the Final Executable
Now you run:
g++ main.o logic.o utils.o -o myApp
Here’s what happens:
The shell forks a new process.
The child execs g++, which now runs as the compiler driver.
Internally, g++ calls the linker (usually ld) as a subprocess.
ld:
Opens all .o files
Resolves all symbols (e.g., where is doSomething() defined?)
Writes the final single binary executable file myApp
To create myApp, the linker does:
open("myApp", O_WRONLY | O_CREAT | O_TRUNC, 0755) → maybe FD 4
FD 4 now points to the output binary.
It writes the ELF headers, code, and data segments into myApp.
✅ This does not run your app, just builds it.
🏃 Step 3: You Run the Program
You type:
./myApp
Here’s what happens under the hood:
The shell calls fork() → creates a child process.
The child process calls:
execve("./myApp", ...)
This tells the OS: “Replace my current program (the child's copy of the shell) with this new one (myApp).”
What the OS does next:
Opens the file myApp
Parses the ELF (Executable and Linkable Format) binary
Maps sections into memory:
.text → executable code
.data, .bss → global/static data
Stack and heap
Releases its handle on the binary file (after mapping it)
Initializes:
Registers (including the program counter)
argc, argv, and envp
Starts executing at the program's entry point; the C++ runtime startup code then calls main() in your code
✅ Your compiled C++ program is now running as a new process, with:
A new PID
Its own virtual memory
Its own file descriptor table (a copy of the shell's, made at fork())
FDs 0, 1, 2 still point to the terminal (same as parent)
💡 Summary So Far
✅ From the moment you opened the terminal to the point your C++ program runs:
The terminal runs as a process with stdin/stdout/stderr.
You compile your .cpp files into .o files → each g++ -c is a new process.
You link the .o files using g++ → it internally calls the ld linker to produce the final binary.
Running ./myApp:
Shell forks → child
Child execs → replaces memory with new program
OS maps code/data/stack/heap
Process runs main() with inherited FDs (0, 1, 2)
🔑 What is an FD (File Descriptor)?
A File Descriptor (FD) is a simple integer number that the Linux kernel uses to represent an open file, socket, or I/O resource within a process.
Think of it like a small ticket stub that points to an open file or resource — the process uses this stub to read/write to that file, terminal, socket, etc.
Each process has its own FD table, and each entry in that table points to an open file object in the kernel.
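To make the "simple integer" idea concrete, here is a minimal sketch that opens a file and looks at the numbers the kernel hands back. The helper names `open_readonly` and `fd_number_is_reused` are made up for this illustration; only `open()` and `close()` are real POSIX calls. It assumes a single-threaded program, so nothing else opens or closes files in between.

```cpp
#include <fcntl.h>   // open, O_RDONLY
#include <unistd.h>  // close

// Opens `path` read-only and returns the FD number the kernel assigns,
// or -1 on failure. The kernel picks the lowest number not currently in use.
int open_readonly(const char* path) {
    return open(path, O_RDONLY);
}

// Opens the same file twice, closes the first FD, then opens it again.
// Returns true if the third open() got the number that close() just freed.
bool fd_number_is_reused(const char* path) {
    int first = open_readonly(path);
    if (first < 0) return false;
    int second = open_readonly(path);
    if (second < 0) {
        close(first);
        return false;
    }
    close(first);                      // frees the lowest of the two slots
    int third = open_readonly(path);   // kernel hands the freed slot back
    bool reused = (third == first);
    close(second);
    if (third >= 0) close(third);
    return reused;
}
```

Called as `fd_number_is_reused("/dev/null")`, this should return true: an FD is only a slot number in the process's own table, and a freed slot is handed out again on the next open.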
📂 Understanding FD Behavior During Compilation vs Execution
Let’s now make this mental model more precise using a drawer analogy.
🧱 During Compilation (g++ main.cpp -o myApp)
When we pass main.cpp to the compiler:
We are not accessing the actual file content directly, rather we are using a token (file descriptor) — like putting the file into a drawer, and the drawer gives us a number (FD) that links to the file in the background.
So:
main.cpp is just plain text, not executable.
It is data for the compiler to read.
The compiler asks the OS to open it:
open("main.cpp", O_RDONLY) → FD 3
The compiler reads from FD 3.
Since it’s non-executable data, the OS treats it just like a text file:
It allocates an FD.
That FD links to the file on disk.
The compiler reads through that FD — like reading a piece of paper through a slot in a locked drawer.
✅ Because it's not executable, the OS doesn't do any memory mapping or ELF parsing. It’s just:
"Here's the drawer handle (FD), read your text."
⚙️ During Execution of an Executable (e.g., ./myApp)
Now the behavior changes completely, because we're executing, not just reading data.
Here, the OS:
Opens myApp → gets a handle to it (FD 3, let’s say; in practice the kernel holds this handle internally during execve())
Parses the ELF header to understand how to map it into memory
Uses mmap() or an equivalent to map the binary's sections into virtual memory:
.text → executable code
.data, .bss → global/static variables
Stack and heap are initialized
Closes the FD!
✅ Once the file is mapped, the FD is no longer needed. The mapping keeps its own reference to the file, and pages are loaded from it on demand.
So the summary is:
The file is opened via an FD, mapped into memory, and then the FD is closed (closed, not unlinked: the file stays on disk).
Execution happens from memory, not through the FD.
The FD was only the handle used to load the code, not to run it directly.
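The "map it, then close the FD" idea can be tried from user space. The sketch below is not what the kernel's ELF loader does internally; it only shows the same principle with the real `open()`, `fstat()`, `mmap()`, and `close()` calls. The function name `first_bytes_after_close` is hypothetical.

```cpp
#include <fcntl.h>      // open, O_RDONLY
#include <sys/mman.h>   // mmap, munmap
#include <sys/stat.h>   // fstat
#include <unistd.h>     // close
#include <algorithm>    // std::min
#include <cstddef>
#include <string>

// Maps `path` into memory, closes the FD, and only afterwards reads up to
// `n` bytes through the mapping. Returns an empty string on any failure.
std::string first_bytes_after_close(const char* path, std::size_t n) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return "";

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return "";
    }
    std::size_t len = static_cast<std::size_t>(st.st_size);

    void* base = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping keeps its own reference to the file
    if (base == MAP_FAILED) return "";

    // The FD is gone, but the file contents are still reachable via memory.
    std::string bytes(static_cast<const char*>(base), std::min(n, len));
    munmap(base, len);
    return bytes;
}
```

On a typical Linux system, `first_bytes_after_close("/bin/sh", 4)` should give back the ELF magic bytes (0x7f followed by "ELF"), read after the FD was already closed.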
🎬 Now: What About Video Files and VLC?
Let’s apply the same principle to video files like .mp4.
When you double-click a file like movie.mp4, the following happens:
The desktop environment resolves the default program for .mp4 → maybe VLC.
It forks a child, which runs:
execve("/usr/bin/vlc", ["vlc", "movie.mp4"], ...)
That creates a new process for VLC.
VLC itself does:
int fd = open("movie.mp4", O_RDONLY); // gets FD 3 maybe
FD 3 is now pointing to the raw binary data of the video file.
VLC:
Reads that data using the FD
Decodes the video and audio in memory
Sends output to:
The screen (via GPU)
The speakers (via ALSA or PulseAudio)
✅ The video file is never directly executed: it is read via an FD as data, just like the .cpp source file during compilation.
🧠 Final Perspective: FD as a Drawer Handle
File Type | Treated As | Opened via FD? | Executed via FD? | FD Closed After Use? |
.cpp | Plain data | ✅ Yes | ❌ No | ✅ After read |
Executable | Binary code | ✅ Yes | ❌ Not directly | ✅ After mmap() |
.mp4 | Media data | ✅ Yes | ❌ No | ✅ After playback |
✅ In all these cases, the FD is just the drawer handle — a temporary link to a file, used for reading or loading, but not kept open unnecessarily.
🧠 What Is a Process?
A process is an instance of a running program — it is how the operating system runs, isolates, and manages code execution.
🔧 Components of a Process
When a process is created, the OS assigns and manages several key components:
Component | Description |
PID (Process ID) | Unique identifier for the process. |
Parent PID (PPID) | The PID of the process that created (forked) it. |
Virtual Memory Space | Includes sections for code (.text), data (.data, .bss), heap, stack. |
File Descriptor Table | A table of open file descriptors (FDs like stdin, stdout, stderr). |
Execution Context | CPU state: program counter, registers, etc. |
Environment Variables | User and system-defined vars (e.g., $PATH, $HOME). |
Process Metadata | Priority, state, user ID, resource usage, etc. |
🏁 How Is a Process Created?
Let’s use a real example involving your terminal and g++.
📟 Step 1: You Open a Terminal
Your desktop environment (like GNOME, KDE) is already running.
It has a process running the GUI and background services.
You click the terminal icon (say, GNOME Terminal).
GNOME creates the terminal process using:
fork(); // Spawns a new child process
execve("/usr/bin/gnome-terminal", ...); // In the child: loads the terminal program into memory
This new process now becomes the terminal.
The parent process is the desktop environment (e.g., gnome-shell).
The child inherits:
Environment variables
File descriptors (like for the display, sound)
Working directory
The child’s memory is replaced with the terminal binary using execve().
🧠 execve() = "execute a program, replacing this process’s memory"
The terminal process is now running with its own:
PID (e.g., 2205)
FD table
Allocated memory for the terminal program
🛠️ Step 2: You Run g++ main.cpp -o app
Now you're inside the terminal and you type:
g++ main.cpp -o app
Here’s what happens:
The terminal is running a shell (bash, zsh, etc.).
The shell receives the command and calls:
fork(); // Creates a new child process
execve("/usr/bin/g++", ["g++", "main.cpp", "-o", "app"], ...); // In the child: replaces the shell copy with g++
A new child process is created to run g++.
It gets its own:
PID (e.g., 2210)
FD table (inherits 0, 1, 2: stdin, stdout, stderr)
Virtual memory space
Environment variables
Once g++ finishes compiling:
The child process calls exit().
The shell (the parent process), which was waiting, collects the exit status; only then is the child removed from the process table.
Control returns to the shell.
✅ So now you're back at the shell prompt.
🚀 Step 3: You Run the Compiled Program ./app
Now you execute:
./app
Again:
The shell (bash) calls:
fork(); // Create child
execve("./app", ["./app"], ...); // Replace child memory with app binary
The ./app process is now a new process:
Own PID (e.g., 2215)
Memory layout initialized: .text, .data, .bss, heap, stack
FDs 0, 1, 2 connected to your terminal
ELF binary mapped into memory via mmap()
The handle used for the binary is closed after loading
Once your program finishes:
It calls exit()
The process dies
Control returns to the shell
🧾 Recap of Flow:
Step | Action | Result |
Open Terminal | Parent (GUI) forks & execs | New process: Terminal GUI |
Type g++ ... | Shell forks & execs | New process: g++ compiler runs |
Type ./app | Shell forks & execs | New process: Your app runs |
In all cases:
fork() creates a copy of the parent process, with a new PID and a copy of the parent's FD table
execve() replaces the child’s memory with the new program; the PID and the open FDs stay the same
When the child exits, control goes back to the parent (usually your shell)
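The fork → exec → wait cycle can be sketched in a few lines. This is an illustration of the pattern, not the actual source of any shell. The function name `run_and_wait` is made up, and it uses `execvp()` (which searches `$PATH`) rather than `execve()` so the example does not need a hard-coded path.

```cpp
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>     // fork, execvp, _exit
#include <string>
#include <vector>

// Runs a command the way a shell does: fork a child, exec the program in
// the child, and wait for it in the parent.
// Returns the child's exit status, or -1 if something went wrong.
int run_and_wait(const std::vector<std::string>& args) {
    if (args.empty()) return -1;

    // Build the NULL-terminated argv array that exec expects.
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;            // fork failed
    if (pid == 0) {
        // Child: replace this process image with the requested program.
        execvp(argv[0], argv.data());
        _exit(127);                    // reached only if exec failed
    }

    // Parent: block until the child terminates, then read its exit status.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_and_wait({"sh", "-c", "exit 3"})` returns 3, and `run_and_wait({"g++", "-c", "main.cpp"})` would run the compile step from earlier, assuming g++ is installed and main.cpp exists.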
So far we have discussed sequential processing. Let's now discuss parallel processing.
⚙️ What Is Parallel Processing?
Parallel processing means:
Running multiple processes at the same time, so work gets done concurrently — ideally using multiple CPU cores.
This is useful when:
Tasks are independent
You have multi-core hardware
You want to speed things up (compilation, server requests, file processing)
🧪 Example 1: Compiling Multiple .cpp Files in Parallel
Earlier, you compiled files like this:
g++ -c main.cpp
g++ -c logic.cpp
g++ -c utils.cpp
This ran one after another, wasting time if you had multiple cores.
✅ Using a Makefile with Parallelism
🎯 Makefile
all: main.o logic.o utils.o
	g++ main.o logic.o utils.o -o app
%.o: %.cpp
	g++ -c $< -o $@
🧨 Run It With Parallelism:
make -j3
This means:
Make can spawn up to 3 separate g++ processes at once
These processes run in parallel, each on its own core if available
🔄 How it works internally:
Make forks a child process for each file, and each child execs the compiler:
fork(); execve("/usr/bin/g++", ["g++", "-c", "main.cpp", ...]); // child 1
fork(); execve("/usr/bin/g++", ["g++", "-c", "logic.cpp", ...]); // child 2
fork(); execve("/usr/bin/g++", ["g++", "-c", "utils.cpp", ...]); // child 3
Each process has:
Its own memory
FD table (stdin/out/err, plus source/output files)
PID
The OS schedules these across CPU cores.
✅ Result: Faster builds using real parallelism.
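The key trick behind this kind of parallelism is to start all the children first and only then wait for them. The sketch below is not how make is implemented; it only shows the pattern. The function name `run_in_parallel` is hypothetical, and each command is handed to `/bin/sh -c` to keep the example short.

```cpp
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>     // fork, execl, _exit
#include <string>
#include <vector>

// Starts one child per shell command, then waits for all of them.
// Because every child is started before any wait, they can run at once.
// Returns the exit status of each command in order (-1 on failure).
std::vector<int> run_in_parallel(const std::vector<std::string>& commands) {
    std::vector<pid_t> pids;
    for (const std::string& cmd : commands) {
        pid_t pid = fork();
        if (pid == 0) {
            // Child: let the shell run the command line.
            execl("/bin/sh", "sh", "-c", cmd.c_str(), static_cast<char*>(nullptr));
            _exit(127);  // reached only if exec failed
        }
        pids.push_back(pid);  // stays -1 if fork failed
    }

    std::vector<int> codes;
    for (pid_t pid : pids) {
        int status = 0;
        if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
            codes.push_back(WEXITSTATUS(status));
        } else {
            codes.push_back(-1);
        }
    }
    return codes;
}
```

A call such as `run_in_parallel({"g++ -c main.cpp", "g++ -c logic.cpp", "g++ -c utils.cpp"})` would launch the three compile steps together. Unlike make, this sketch has no job limit and no dependency tracking.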
🌐 Example 2: Web Server Handling Multiple Clients in Parallel
Let’s say you build a simple HTTP server.
🛑 Sequential Model (Not Parallel):
while (true) {
    int client_fd = accept(server_fd, ...);
    handle_request(client_fd); // Blocks: only one client at a time
    close(client_fd);
}
Only one request at a time
Others must wait
Bad for performance
✅ Fork-Based Parallel Model
while (true) {
    int client_fd = accept(server_fd, ...);
    if (fork() == 0) {
        // In child process
        handle_request(client_fd);
        close(client_fd);
        exit(0); // Exit child
    }
    close(client_fd); // Parent closes its copy
}
Each request handled by a new process
OS runs child processes in parallel
Each has its own memory, FD table, stack
✅ Process-based models like this are used by Apache (in its prefork mode) and PostgreSQL (one backend process per connection).
🔗 Example 3: Pipe with Two Processes Running in Parallel
🧵 What Is a Pipe in Unix?
A pipe is a one-way communication channel used to pass data from one process to another, without using intermediate files.
Think of it like this:
🧪 Imagine process A is pouring data into a hose, and process B is drinking from the other end.
📦 A Pipe Connects:
STDOUT (output) of one process
STDIN (input) of another process
This allows:
One process to write data into the pipe
Another process to read that data in real-time
🔧 How Is a Pipe Created?
The shell or program uses the pipe() system call:
int pipefd[2];
pipe(pipefd); // pipefd[0] = read end, pipefd[1] = write end
Then it forks two processes and assigns:
Process 1: writes to pipefd[1]
Process 2: reads from pipefd[0]
🧠 Pipes Work Like:
A shared buffer in the kernel
Data written to the pipe by one process gets buffered
Data is then read by the other process
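A minimal sketch of this, assuming a child process that writes and a parent that reads. The function name `send_through_pipe` is invented for the example; `pipe()`, `fork()`, `read()`, and `write()` are the real calls.

```cpp
#include <sys/types.h>  // pid_t, ssize_t
#include <sys/wait.h>   // waitpid
#include <unistd.h>     // pipe, fork, read, write, close, _exit
#include <cstddef>
#include <string>

// The child writes `message` into a pipe; the parent reads until EOF.
// Returns what the parent received, or an empty string on failure.
std::string send_through_pipe(const std::string& message) {
    int pipefd[2];
    if (pipe(pipefd) != 0) return "";

    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return "";
    }
    if (pid == 0) {
        // Child: the writer. It does not need the read end.
        close(pipefd[0]);
        const char* p = message.data();
        std::size_t left = message.size();
        while (left > 0) {
            ssize_t written = write(pipefd[1], p, left);
            if (written <= 0) break;
            p += written;
            left -= static_cast<std::size_t>(written);
        }
        close(pipefd[1]);  // closing the last write end signals EOF
        _exit(0);
    }

    // Parent: the reader. It must close its copy of the write end,
    // otherwise read() would never see EOF.
    close(pipefd[1]);
    std::string received;
    char buf[256];
    ssize_t n;
    while ((n = read(pipefd[0], buf, sizeof buf)) > 0) {
        received.append(buf, static_cast<std::size_t>(n));
    }
    close(pipefd[0]);
    waitpid(pid, nullptr, 0);
    return received;
}
```

`send_through_pipe("hello\n")` returns `"hello\n"`: the bytes travel from the child to the parent through the kernel's pipe buffer, with no file on disk involved.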
🔗 Example: yes hello | head -n 5
Let’s use this example now to see a pipe in action:
yes hello | head -n 5
🛠 Step-by-Step Breakdown
✅ Step 1: The Shell Sets Up a Pipe
int pipefd[2];
pipe(pipefd); // pipefd[1] is write end, pipefd[0] is read end
✅ Step 2: The Shell Forks Two Child Processes
🔹 Child Process 1 — yes hello
Writes infinite "hello\n" to stdout
Shell redirects its stdout → pipefd[1]
So now, yes is writing into the pipe.
🔹 Child Process 2 — head -n 5
Reads from stdin
Shell redirects its stdin ← pipefd[0]
So now, head is reading from the pipe.
🧪 What Happens During Execution?
yes hello starts spamming:
hello\nhello\nhello\n...
into the pipe.
head -n 5 reads from the pipe, line by line.
It reads:
hello hello hello hello hello
Once head reads 5 lines:
It closes its end of the pipe.
It exits.
yes is still trying to write, but suddenly:
The read end of the pipe is gone.
On its next write, the kernel sends a SIGPIPE signal to yes.
yes is terminated.
✅ What You See on Your Terminal
Only what head prints:
hello
hello
hello
hello
hello
Even though yes was generating infinite lines, only 5 made it through, because head controlled the read and exited after 5.
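The SIGPIPE behavior can be reproduced without the real `yes` and `head` programs. In this sketch the child plays the role of `yes` (an endless writer) and the parent plays the role of `head` (reads a few lines, then closes the pipe). The function name `writer_killed_by_sigpipe` is made up for the illustration.

```cpp
#include <signal.h>     // signal, SIGPIPE, SIG_DFL
#include <sys/types.h>  // pid_t
#include <sys/wait.h>   // waitpid, WIFSIGNALED, WTERMSIG
#include <unistd.h>     // pipe, fork, read, write, close, _exit

// Child: writes "hello\n" forever (like `yes hello`).
// Parent: reads `lines_wanted` lines (like `head -n`), then closes the pipe.
// Returns true if the child was terminated by SIGPIPE.
bool writer_killed_by_sigpipe(int lines_wanted) {
    int pipefd[2];
    if (pipe(pipefd) != 0) return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return false;
    }
    if (pid == 0) {
        signal(SIGPIPE, SIG_DFL);  // make sure the default action applies
        close(pipefd[0]);          // the writer must not hold a read end itself
        const char line[] = "hello\n";
        for (;;) {
            if (write(pipefd[1], line, sizeof line - 1) < 0) _exit(1);
        }
    }

    close(pipefd[1]);
    int lines = 0;
    char c;
    while (lines < lines_wanted && read(pipefd[0], &c, 1) == 1) {
        if (c == '\n') ++lines;
    }
    close(pipefd[0]);  // the reader goes away, like head exiting

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGPIPE;
}
```

`writer_killed_by_sigpipe(5)` should return true. Note the `close(pipefd[0])` in the child: if the writer kept its own copy of the read end open, the pipe would still have a reader and no SIGPIPE would ever arrive.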
🧵 Process vs Thread
Both processes and threads represent independent flows of execution, but they differ in how they manage resources, isolation, and performance characteristics.
🔍 1. What Is a Process?
A process is an independent instance of a running program, managed by the OS.
🧱 Key Characteristics:
Feature | Description |
Own memory space | Completely isolated from other processes |
Own PID, FD table, and stack | Tracked separately by the OS |
Created with fork() | Heavyweight: the child starts as a copy of the parent |
Cost | More expensive to create and switch between than threads |
Safety | Safer: crashing one process doesn’t affect others |
🔍 2. What Is a Thread?
A thread is a lightweight unit of execution within a process. Multiple threads share the same memory.
🧱 Key Characteristics:
Feature | Description |
Shared memory | Shares the address space with other threads in the same process |
Per-thread state | Each thread has its own stack, program counter, and registers |
Creation | pthread_create() in C/C++, or std::thread in C++ |
Cost | Fast to create and context switch |
Safety | Riskier: a bug in one thread can corrupt shared memory, and a crash takes down the whole process |
🧪 Real-World Example: Web Server
Let’s say you build a server to handle client requests.
🔁 Using Processes:
int client = accept(...);
if (fork() == 0) {
    handle_request(client);
    exit(0);
}
close(client); // Parent closes its copy
Each client request spawns a new process
These processes run in parallel
Completely isolated
Expensive in terms of system resources
🧵 Using Threads:
int client = accept(...);
std::thread t(handle_request, client);
t.detach();
Each client handled by a new thread
Shares memory with other threads
Fast and efficient
More complex to handle safely (due to shared data)
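To show what "handling shared data safely" means in practice, here is a small sketch with several threads incrementing one shared counter under a `std::mutex`. The function name `count_with_threads` is invented for the example. It joins the threads instead of detaching them so the result can be read afterwards. On some toolchains you may need to compile with `-pthread`.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Starts `num_threads` threads that each add 1 to a shared counter
// `increments_each` times. The mutex makes each increment atomic with
// respect to the other threads. Returns the final counter value.
long count_with_threads(int num_threads, int increments_each) {
    long counter = 0;   // shared by all threads (same address space)
    std::mutex guard;   // protects `counter`

    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&counter, &guard, increments_each] {
            for (int j = 0; j < increments_each; ++j) {
                std::lock_guard<std::mutex> lock(guard);
                ++counter;
            }
        });
    }
    for (std::thread& t : workers) t.join();  // wait for every thread
    return counter;
}
```

`count_with_threads(4, 10000)` returns 40000. Without the lock, the increments from different threads could interleave and the total could come out lower, which is exactly the kind of bug that separate processes avoid by not sharing memory in the first place.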
📊 Comparison Table
Feature | Process | Thread |
Memory | Separate address space | Shared address space |
Creation cost | High (fork) | Low (pthread_create / std::thread) |
Communication | Via IPC (pipes, sockets, shared memory) | Direct via shared variables |
Crash impact | Isolated – doesn’t affect others | Shared fate – may crash the whole process |
Use case examples | Chrome tabs, microservices, CLI commands | Server threads, GUI responsiveness, AI tasks |
Scheduling | Managed by the OS (context switch = expensive) | Managed by the OS (kernel threads) or by a language runtime (user-level threads) |
🧠 Analogy
🍱 Processes = Bento Boxes
Each box is self-contained.
Opening one doesn’t mess with the others.
More overhead, but safer.
🍜 Threads = Compartments in a Bowl
All in the same bowl (shared memory).
Fast to create new compartments (threads).
But if one spills, the whole bowl is messy.
✅ Summary
When to Use Processes | When to Use Threads |
Need isolation and safety (e.g., running untrusted code) | Need speed and shared memory |
Crashes should be contained | Threads must coordinate carefully (mutexes, locks) |
Example: Browser tabs (separate PIDs) | Example: Game engine threads (render, audio, input) |
Thank you for reading through this article. I'm currently learning and exploring some of these concepts myself, and while going through them, I thought it might be helpful to write things down in a way that’s easy to revisit and understand. If it helped you too, I’m really glad.
That’s all — just wanted to share what I’m learning. Thanks again for taking the time to read ❤️.
