When you type gcc main.c on the command line, you start a process that transforms human-readable code into something a computer can execute. But what exactly happens during this process? I'll do my best to explain it step by step.

What is GCC?

First, let's unpack the acronym. GCC stands for GNU Compiler Collection, a set of compilers that takes source code (like your main.c file) and turns it into machine code that a computer can run. When you type gcc main.c, you are asking the compiler to translate your C code into an executable file. Because the command gives no output name, that file is called a.out by default on Linux and other Unix-like systems. This is the first step in making a program work.

Why do we need to compile code?

Computers don't directly understand programming languages like C or C++. They only understand machine code, which is a series of numbers (binary) that represent instructions the processor can follow. Compilation bridges the gap between human-readable code and machine-readable code. Without compilation, the computer wouldn't know how to execute your program.

Stages of compilation

When you run the command gcc main.c, the compilation process goes through four stages behind the scenes. To see them, let’s run each stage ourselves:

1. Preprocessing: Expanding Macros and Include Directives

The first thing GCC does is preprocess your code. This includes removing comments, expanding macros (shortcuts defined with #define), and pulling in the files named by #include directives. The result is a single, plain C file that is ready for the next stage.

gcc -E main.c -o main.i

Here’s what happens:

#include <stdio.h> is replaced with the actual contents of stdio.h (a header file).
#define macros are expanded.

2. Compilation: Converting C to Assembly

Now GCC translates the preprocessed code into assembly language. This is still human-readable, but it is much closer to machine code. It's a lower-level representation of your code that's specific to the architecture of the target machine (for example, x86 or ARM).

gcc -S main.i -o main.s

Here’s what happens:

main.i is translated into main.s, a text file of assembly language (low-level instructions).

3. Assembly: Turning Assembly into Machine Code

Next, the assembler takes the assembly code and converts it into machine code, stored in an object file. This file contains binary instructions that the computer can understand, but it isn’t ready to run yet, because references to code defined elsewhere (such as library functions) have not been resolved.

gcc -c main.s -o main.o

Here’s what happens:

The assembler converts main.s into main.o, a binary object file.

4. Linking: Producing the Final Executable Program

Finally, GCC invokes the linker to combine all object files (yours and any from external libraries you're using) into a final executable program. This step connects your program with the system libraries, such as the C standard library that provides input/output functions, so that every function you call has an actual implementation behind it.

gcc main.o -o main

Here’s what happens:

The linker connects main.o with system libraries.
The final executable main is created.

Now, you can run your program by typing:

./main

When do you compile?

You need to recompile your code every time you change it. For instance, if you edit main.c, you'll have to re-run the gcc main.c command to produce an updated executable. In larger projects, developers often use a build tool such as Make, driven by a Makefile, to automate this process so that only the files that changed are recompiled.

I hope you enjoyed reading this and now have some understanding of what happens under the hood when you type gcc main.c.

Share your thoughts with me! Did I miss anything, explain something incorrectly, or is there something you'd like me to elaborate on?

Until then, happy learning!

What happens when you type gcc main.c

Sunny Pritchard
