Understanding Linux Process Creation Internals: fork(), exec(), Zombie Processes, and the Role of wait()
In the Linux operating system, process creation is one of the fundamental aspects of how programs execute and interact with the system. Processes are instances of running programs, and understanding how they are created, managed, and terminated is crucial for developers and system administrators. In this blog, we will delve into the internals of process creation in Linux using fork() and exec(), and explore concepts like zombie processes and how they are managed using the wait() system call.
Process Creation: using fork()
The Linux kernel uses the fork() system call to create a new process. When a process calls fork(), the operating system creates a child process that is a nearly identical copy of the parent process. The child process receives a new Process ID (PID) but inherits copies of most of the parent’s resources, including open file descriptors, environment variables, and memory space.
What Happens During a fork() Call?
Memory Space Duplication: Conceptually, the child gets its own copy of the parent’s memory, but Linux uses a technique called Copy-On-Write (COW) to avoid copying it up front. Memory pages are shared between the parent and child processes until one of them writes to a page, at which point a private copy of that page is made for the writer.
Separate Execution: Both the parent and the child continue execution from the point where fork() returned. The return value of fork() distinguishes the two:
In the parent process, fork() returns the PID of the child. The parent is responsible for the child: it needs that PID to check whether the child has finished, to notice if it is taking too long, and to collect its exit status.
In the child process, fork() returns 0. The child does not need to track its parent; it simply carries on with its own work (and can call getppid() if it ever needs the parent’s PID).
If fork() fails, it returns -1 in the calling process and no child is created.
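    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {
            printf("I am the child process\n");   /* fork() returned 0 */
        } else if (pid > 0) {
            printf("I am the parent process\n");  /* pid is the child's PID */
        } else {
            perror("fork failed");                /* fork() returned -1 */
            return 1;
        }
        return 0;
    }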
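The memory duplication described above can be observed from user space. The following is a minimal sketch, not a definitive implementation: the helper name fork_counter_demo and the use of the exit status to pass the child's value back are assumptions made for illustration. It shows only the visible effect (a write in the child never reaches the parent); Copy-On-Write itself is a kernel optimization that a program cannot see directly.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks, lets the child modify a variable, and reports
 * the value each process ends up with. Returns 0 on success, -1 on error. */
int fork_counter_demo(int *parent_value, int *child_value) {
    int counter = 100;

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork failed");
        return -1;
    }
    if (pid == 0) {
        counter += 1;     /* first write: the child now works on its own copy of the page */
        _exit(counter);   /* report the child's value via its exit status (0-255 only) */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    *child_value = WEXITSTATUS(status);
    *parent_value = counter;  /* unchanged: the child's write never touched the parent's memory */
    return 0;
}
```

Calling fork_counter_demo(&p, &c) should leave p at 100 and c at 101, which is the behavior the parent and child would see whether or not the kernel used Copy-On-Write.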
Replacing Process Memory: The exec() Family
After a process is created using fork(), the child often calls one of the functions in the exec() family (execl(), execv(), execvp(), and so on) to replace its memory space with a new program. Unlike fork(), which creates a new process, an exec() call loads a new program into the current process and starts its execution from the entry point. The PID stays the same, and on success the call never returns.
For example, the execl() function might be used to replace the child’s program with another one:
execl("/bin/ls", "ls", "-l", (char *) NULL);
This call replaces the child’s memory with the /bin/ls program. The child remains the same process, but it now executes the ls command instead of the code it inherited from its parent.
Zombie Processes
A zombie process is a process that has completed execution (for example via exit()), but whose exit status has not yet been read by its parent process. When a child process terminates, it enters the "zombie" state until the parent collects its termination status using the wait() or waitpid() system calls.
When a process terminates, Linux must keep some information about it in the process table so that the parent can retrieve the termination status. Once the parent retrieves this information using wait(), the kernel removes the process entry from the process table.
A zombie process is essentially a dead process that still occupies an entry in the process table. Its memory and other resources have already been released and it uses no CPU time, but its process table entry and its PID remain in use, and both are finite. Too many zombie processes can eventually exhaust them, preventing new processes from being created.
The wait() System Call
The wait() system call is used by the parent process to wait for its child process to finish execution and retrieve its exit status. When the parent calls wait(), it blocks until a child process terminates, at which point it reaps the child’s exit status, and the zombie process is removed from the process table.
    pid_t pid = fork();
    if (pid == 0) {
        // Child process
        printf("Child process executing\n");
        exit(0); // Child exits, becoming a zombie
    } else if (pid > 0) {
        // Parent process
        printf("Parent waiting for child to terminate\n");
        wait(NULL); // Parent collects the child’s exit status
        printf("Child has been reaped\n");
    }
In the code above, once the child process exits, the parent calls wait() to reap the child. This prevents the child from lingering as a zombie: once the parent calls wait(), the zombie process is removed from the process table, freeing up its entry. There are two main ways zombie processes get cleaned up:
Using wait() or waitpid(): A parent process should always call wait() (or a related function like waitpid()) to clean up after its child processes. This prevents zombies from accumulating.
Reparenting and Orphan Processes: If a parent process terminates before calling wait(), its orphaned child processes are reparented to the init process (systemd on most modern Linux distributions). The init process automatically calls wait() to clean up these adopted children when they exit, so they do not remain zombies.
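The first method can be combined with the fork-then-exec pattern from earlier in this post into a single spawn-and-reap routine. The following is a rough sketch rather than a definitive implementation: the function name spawn_and_wait is hypothetical, execv() is used in place of execl() so the arguments can be passed as an array, and the 127 and 128+signal return values simply follow the convention shells use.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program at `path` with `argv` in a child
 * process, reaps it, and returns its exit status (-1 on fork/wait errors). */
int spawn_and_wait(const char *path, char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork failed");
        return -1;
    }
    if (pid == 0) {
        execv(path, argv);       /* returns only if the program could not be started */
        perror("execv failed");
        _exit(127);              /* shell convention for "could not run command" */
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1;   /* retry only if interrupted by a signal */
    }
    if (WIFEXITED(status)) return WEXITSTATUS(status);       /* normal exit */
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);  /* killed by a signal */
    return -1;
}
```

For instance, passing "/bin/ls" with the argument array {"ls", "-l", NULL} runs the same command as the execl() example above and returns only after the child has been reaped, so no zombie is left behind.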
Understanding these internals is crucial for developing efficient, reliable, and scalable applications on Linux. By mastering these system calls and process management techniques, you can ensure your applications perform optimally, avoiding the common pitfalls of process creation and termination.
Written by
Aman Kumar
Konnichiwa! I'm Aman Kumar, a passionate developer, competitive programmer, and AI/ML enthusiast currently studying at the Indian Institute of Technology, Indore. Delving into Next.js, I'm broadening my web development horizons. Through regular participation in competitive programming contests, I continuously refine my problem-solving skills. As a web developer, I leverage my expertise to craft innovative solutions and deliver impactful digital experiences.