Unix IPC Deep Dive: Understanding fork() and Signals

While thinking about the Producer and Consumer from the previous article, I realized I'm not that familiar with the various ways processes can communicate. Luckily, I found a great resource: Beej's Guide to Interprocess Communication. I recommend reading through it as it's a great guide with lots of useful information, written in a witty way.

This series will be a simplified version of the guide with some extra notes I added to improve my understanding.

All the code in the series is available as a GitHub repository.

Before diving deeper into various communication techniques, it's useful to understand fork() and signals, so let's start with that.

fork()

A Unix process is a running instance of a program that has a process ID (PID) and a reserved part of memory called an address space.

A process can create another process, called a child process, by using a fork() system call. The child process is a copy of the parent but with its own PID and address space. The data in the child process is duplicated from the parent. The child process starts executing from the point where the fork() call was made in the parent process.

fork() returns an integer value that is used to differentiate between the parent and child process:

In the parent process, the call returns the PID of the child if it is successful, and -1 otherwise (with errno set to indicate the error).
In the child process, the call returns 0.
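To make the "copy" semantics concrete, here is a minimal sketch. The function name parent_value_after_child_write is made up for this illustration, and it uses waitpid() only to make sure the child has finished. The child changes a variable and exits; the parent has its own address space, so it still sees the original value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the parent's copy of `value` after the
// child has modified its own copy, or -1 if fork() fails.
int parent_value_after_child_write(void)
{
    int value = 1;
    pid_t pid = fork();

    if (pid == -1)
        return -1;              // fork() failed, no child was created

    if (pid == 0)
    {
        value = 100;            // changes only the child's copy
        _exit(0);               // end the child right here
    }

    waitpid(pid, NULL, 0);      // make sure the child has finished
    return value;               // still 1 in the parent
}
```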

Typically, the parent waits for the child process to exit by using wait() or waitpid() system calls. That way, the parent can also obtain the exit code.

Other system calls that are interesting in this context are getpid() which returns the PID of the current process and getppid() which returns the PID of the parent process.

In this demo, a parent spawns a child process and waits for it to exit.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

void fork_demo(void)
{
    int status_argument;
    pid_t pid = fork();
    switch (pid)
    {
    case -1:
        // fork() failed, no child was created
        perror("fork");
        exit(1);
    case 0:
        // This branch runs only in the child
        printf("CHILD: PID: %d, parent's PID: %d\n", getpid(), getppid());
        exit(42);
    default:
        // This branch runs only in the parent; pid holds the child's PID
        printf("PARENT: PID %d, child's PID: %d\n", getpid(), pid);
        printf("PARENT: Waiting for child to exit()...\n");
        wait(&status_argument);
        // WEXITSTATUS() extracts the exit code from the status argument,
        // i.e. the low-order 8 bits of the value the child passed to exit()
        int exit_code = WEXITSTATUS(status_argument);
        printf("PARENT: Child's exit status: %d\n", exit_code);
        printf("PARENT: Full exit number: %d\n", status_argument);
    }
}
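The demo above assumes the child exited normally. As a hedged sketch of a slightly more careful variant, the hypothetical helper below (child_exit_status is not part of the original code) waits for one specific child with waitpid() and checks WIFEXITED() before reading the exit code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a child that exits with `code` and returns
// the exit status the parent observes, or -1 on any failure.
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                    // child: terminate immediately

    if (waitpid(pid, &status, 0) == -1) // parent: wait for this specific child
        return -1;

    if (WIFEXITED(status))              // child called exit() or _exit()
        return WEXITSTATUS(status);     // only the low 8 bits of `code` survive
    return -1;                          // child was ended by a signal instead
}
```

Calling child_exit_status(42) returns 42, while child_exit_status(257) returns 1, because the exit status is truncated to 8 bits.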

When the child finishes executing, the kernel sends the SIGCHLD signal to notify the parent, and the child becomes a defunct ("zombie") process: it no longer runs, but its entry stays in the process table so that the parent can still read its exit status. Once the parent calls wait() or waitpid(), the child is removed completely.

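A parent process that isn't interested in the exit status of its children can ignore the SIGCHLD signal. On systems that follow POSIX.1-2001, such as Linux, its children are then removed as soon as they terminate instead of becoming zombies: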

signal(SIGCHLD, SIG_IGN);

A child process that ends up living longer than its parent, for example because the parent died before calling wait(), is typically reparented to the init process (PID 1). init calls wait() for the processes it adopts, so they don't stay defunct once they terminate.
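Here is a small sketch of what ignoring SIGCHLD looks like from the parent's side, assuming the POSIX.1-2001 behavior found on Linux. The helper name wait_reports_no_children is made up for this example. Since the child leaves no zombie behind, wait() has nothing to collect and fails with ECHILD.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if wait() reports "no children" (ECHILD)
// after a child exited while SIGCHLD was ignored, 0 otherwise, and -1 if
// fork() fails.
int wait_reports_no_children(void)
{
    int result;
    pid_t pid;

    signal(SIGCHLD, SIG_IGN);           // children will not become zombies

    pid = fork();
    if (pid == -1)
    {
        signal(SIGCHLD, SIG_DFL);
        return -1;
    }
    if (pid == 0)
        _exit(0);                       // child: exit right away

    // wait() blocks until all children are gone, then fails with ECHILD
    result = (wait(NULL) == -1 && errno == ECHILD);

    signal(SIGCHLD, SIG_DFL);           // restore the default disposition
    return result;
}
```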

Signals

A signal is a way for one process to send a message to another. It's usually used to request something of a process or to notify it about something. A signal that you're most likely familiar with is SIGINT which is sent when you press CTRL+C. Programs typically handle that signal by exiting. Commonly used signals are:

SIGINT - interrupt signal, raised by pressing CTRL+C in a terminal.
SIGSTOP - suspend the process.
SIGCONT - continue execution of the process.
SIGTERM - request termination of the process.
SIGKILL - forceful termination of the process.
SIGCHLD - sent to the parent process when the child terminates or stops.
SIGUSR1 and SIGUSR2 - available for custom handling.
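Signals can also be sent from code with the standard kill() call, which, despite its name, sends any signal. The sketch below is only an illustration (terminate_child_with_sigterm is a made-up name): the parent sends SIGTERM to a sleeping child and then uses WIFSIGNALED() and WTERMSIG() to find out which signal ended it.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: starts a child that sleeps until a signal arrives,
// sends it SIGTERM, and returns the number of the signal that ended it
// (or -1 if something went wrong).
int terminate_child_with_sigterm(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
    {
        for (;;)
            pause();                    // child: sleep until a signal arrives
    }

    if (kill(pid, SIGTERM) == -1)       // parent: request termination
        return -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    if (WIFSIGNALED(status))            // child was ended by a signal
        return WTERMSIG(status);
    return -1;
}
```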

Every signal has a default action provided by the operating system, which is taken when the process hasn't set up any handling of its own. For example, the default action for the SIGINT signal is to terminate the process.

A process can also have a custom handler for a specific signal. To set up a custom handler, we need to use sigaction, both a function and a struct with that name.

struct sigaction {
    void     (*sa_handler)(int);
    void     (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t sa_mask;
    int      sa_flags;
    void     (*sa_restorer)(void);
};

sa_handler specifies the action to perform when a signal is received. It's a function that takes only the signal number as a parameter. There are two predefined values that we can pass here instead of our own function: SIG_DFL for using the default action and SIG_IGN for ignoring the signal.
sa_sigaction can be used instead of sa_handler if the SA_SIGINFO flag is set in the sa_flags field. With this signature, we get more information about the signal via the extra parameters.
sa_mask allows us to block other signals while we're processing this one. There are functions available to manipulate this set:
- sigemptyset() - initializes the signal set to empty set, excluding all signals.
- sigfillset() - initializes the set to full, including all signals.
- sigaddset() - add a signal to the set.
- sigdelset() - remove a signal from the set.
- sigismember() - test whether a signal is a member of the set.
sa_flags is used to set flags that modify the behavior of the signal, for example:
- SA_SIGINFO enables the usage of sa_sigaction function handler instead of the default sa_handler.
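- SA_RESTART ensures that certain system calls are restarted automatically when a signal handler interrupts them, instead of failing with EINTR. This includes, for example, the read() call that fgets() makes under the hood.
sa_restorer is not intended for application use.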

The function is much simpler as it makes use of the struct.


int sigaction(
    int sig,                     // A signal to catch, e.g. SIGINT
    const struct sigaction *act, // Struct with details about signal handling
    struct sigaction *oact);     // Old handler, in case of temporary changes

sig is an identifier of a signal we're handling, e.g. SIGINT.
act points to an instance of the sigaction struct described above, with all the details.
oact, if it's not NULL, receives the action that was in place before the call. It's useful when we want to introduce temporary changes, as we can pass it to a later sigaction() call to revert to the previous handling approach.

It's important to note that some signals can't have custom handlers, such as SIGSTOP and SIGKILL.

In this example, we create a custom handler for the SIGINT signal and use the SA_RESTART flag to restart the fgets() call in case it gets interrupted.

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

void sigint_handler(int sig)
{
    (void)sig; // The signal number is not needed here
    // write() is safe to call from a signal handler, unlike printf()
    write(STDOUT_FILENO, "\nI got interrupted!\n", 20);
}

void signals_demo(void)
{
    char s[200];
    struct sigaction sa = {
        .sa_handler = sigint_handler,
        .sa_flags = SA_RESTART, // Restart the interrupted read
    };
    sigemptyset(&sa.sa_mask); // Don't block any extra signals in the handler

    sigaction(SIGINT, &sa, NULL);

    printf("Enter a string and press ENTER to exit.\n");
    printf("Press CTRL+C during typing to restart it.\n");

    fgets(s, sizeof s, stdin);
    printf("You entered: %s\n", s);
}

When making system calls, it's a standard practice to check the return value for errors, for example:

if (sigaction(SIGINT, &sa, NULL) == -1)
{
    perror("sigaction");
    exit(1);
}

I decided to omit such checks from the samples to make them easier to read.

That's it for the intro. In the next article, we'll start with the simplest form of inter-process communication: pipes. Stay tuned!

Mladen Drmac
