Magic of Malloc - Behind the Scenes
Dynamic memory allocation in C is a fundamental concept, and most of us are familiar with the malloc function. However, did you know that malloc isn't a system call itself but a library function which, in the GNU C Library (glibc), is built on top of the brk() and mmap() system calls? In this blog post, we'll delve into the internals of malloc, exploring how it operates and how the allocator avoids making a system call on every request when malloc is called repeatedly.
Optimizing System Calls:
One intriguing aspect of malloc is its ability to minimize system calls when allocating memory multiple times. Rather than invoking a system call for each request, the allocator inside the C library asks the kernel for memory in larger chunks and then hands out pieces of that memory entirely in user space.
Compile the program below (gcc mallocTest.c -o mallocTest):
// mallocTest.c
#include <stdio.h>
#include <stdlib.h>

#define FORTY_KB (40 * (1 << 10))

int main(void) {
    // Loop 10 times
    for (int i = 0; i < 10; ++i)
    {
        // Allocate 40KB of memory (never freed, so the heap has to keep growing)
        void *memory = malloc(FORTY_KB);
        if (memory == NULL)
        {
            fprintf(stderr, "Memory allocation failed\n");
            return 1; // Exit with an error code
        }
        printf("Allocated 40KB of memory, iteration: %d\n", i + 1);

        // Pause so the system calls made so far can be inspected in strace
        printf("Press Enter to continue...");
        fflush(stdout);
        getchar();
    }
    return 0; // Exit successfully
}
The program above allocates 40KB of memory in each of 10 loop iterations.
Here getchar acts like a breakpoint in every iteration of the for loop, so that we can see which system calls are made at each malloc call.
After you have compiled the program (gcc mallocTest.c -o mallocTest), simply run strace on it:
>> strace ./mallocTest
What you will notice is that malloc does not invoke the brk() system call every single time.
This is because the first time you request 40KB of memory, malloc extends the heap by more than the requested size: glibc adds padding on top of the request, 128KB by default. Subsequent calls to malloc are served from this extra space and do not trigger the brk() system call until it is exhausted. With 40KB requests, the brk() system call is observed to occur approximately every 3-4 iterations.
This optimization lives in the C library's allocator, not in the kernel, and it exists to minimize costly system calls and enhance overall program efficiency.
This 128KB is only a default value and can be changed as well; in glibc it is the M_TOP_PAD parameter.
To further fine-tune memory management, developers can explore the malloc tuning parameters provided by the GNU C Library, which can be set from code with mallopt() or from the environment through the GLIBC_TUNABLES variable. These parameters allow you to customize how malloc interacts with the underlying system calls, providing greater control over memory allocation and deallocation. Understanding and experimenting with these parameters can lead to improved performance tailored to your application or your embedded system.
Written by
Harsh Agarwal