Memory Allocation: Crafting malloc() from Scratch

So, say I want to manage the allocation and deallocation of memory in my system—it's actually a really interesting topic! The languages we generally use—Java, Go, Python, JavaScript—don’t require us to worry about memory allocation, right? But it’s still something we should be aware of, and it's definitely important.

Here’s an amazing talk by Benjamin Feng:
https://youtu.be/vHWiDx_l4V0

By definition, a memory allocator is a component of a programming language or runtime system responsible for managing the allocation and deallocation of memory during program execution. It becomes especially important in languages that support dynamic memory allocation, where memory can be requested and released as needed during runtime.

Let’s build a basic memory allocator!

Since this will be really simple, we’ll just implement malloc(), realloc(), free(), and calloc(). First, let’s get familiar with the memory layout of a program:

Text section: Contains the binary instructions to be executed by the processor.
Data section: Contains non-zero initialized static data.
BSS (Block Started by Symbol): Contains zero-initialized static data. Static data that is uninitialized in the program is initialized to 0 and placed here.
Heap: Contains dynamically allocated memory.
Stack: Contains automatic variables, function arguments, copies of the base pointer, etc.

Generally, the data, BSS, and heap segments are collectively referred to as the data segment. Now, if we want to add more memory to the heap, we ask the system to increase the program break (via brk). Conversely, when adding data to the stack, the break is reduced (since stack grows downward).
BTW, brk stands for program break—it's the end of the process's data segment.

`malloc()`

Also known as the memory allocation function, malloc() allocates a specified number of bytes and returns a pointer to the allocated memory.

In the code I'm working on, I’ve used sbrk() with a given size—this adds that many bytes to the heap. To use sbrk(), we’ll have to implement our own custom version. The tricky part is freeing memory, because in the current scheme, we aren't storing the size of the allocated block anywhere, so releasing memory isn’t straightforward.

Also, heap memory managed by the OS is contiguous, and we can only release memory that’s at the end of the heap. If we want to free memory from the middle of the heap, we need to make a distinction between:

Freeing memory — marking a block as free so it can be reused later (by malloc()).
Releasing memory — actually returning memory to the OS (only possible at the end of the heap).

Freeing is preferred in most cases, as it avoids modifying memory boundaries or OS-level structures.

So now, for every allocated memory block, we need to store two things:

The size of the block
A flag indicating whether the block is free or in use

This information is stored in the header of each memory block.

struct header_t {
    size_t size;
    unsigned is_free;
};

The idea is simple: when a program requests size bytes of memory, we calculate the total required space as: total_size = header_size + size

Then we call sbrk(total_size). The memory returned by sbrk() is used to store both the header and the actual memory block.

The header is managed internally by our allocator and is completely hidden from the calling program. From the program’s perspective, it only sees a pointer to the usable memory block—not the header that precedes it.

So, the structure of a single memory block looks like this:

Now we’ll organize our memory blocks in a linked list so that malloc can keep track of them efficiently. This allows us to traverse the list to find a free block that fits the requested size (also useful for free() and realloc() operations).

To enable this, we’ll modify the header of each memory block to also store a pointer to the next block in the list.

So now, the header will look like this:

struct header_t {
    size_t size;
    unsigned is_free;
    struct header_t *next;
};

And now the memory block looks like this:

Now we can go one step further and wrap the entire header in a union, along with a stub variable of 16 bytes. The reason for using a union is that its size will be the size of its largest member, which guarantees that the header ends at a memory address aligned to 16 bytes.

typedef char ALIGN[16];

union header {
    struct {
        size_t size;
        unsigned is_free;
        union header *next;
    } s;
    ALIGN stub;
};
typedef union header header_t;

Now to prevent two or more threads from concurrently accessing memory, we will put a basic locking mechanism in place and let’s use a global lock for now, which will have to be accessed before accessing any memory action and once done, can release the lock!

This is how our malloc looks:

void *malloc(size_t size)
{
    size_t total_size;
    void *block;
    header_t *header;
    if (!size)
        return NULL;
    pthread_mutex_lock(&global_malloc_lock);
    header = get_free_block(size);
    if (header) {
        header->s.is_free = 0;
        pthread_mutex_unlock(&global_malloc_lock);
        return (void*)(header + 1);
    }
    total_size = sizeof(header_t) + size;
    block = sbrk(total_size);
    if (block == (void*) -1) {
        pthread_mutex_unlock(&global_malloc_lock);
        return NULL;
    }
    header = block;
    header->s.size = size;
    header->s.is_free = 0;
    header->s.next = NULL;
    if (!head)
        head = header;
    if (tail)
        tail->s.next = header;
    tail = header;
    pthread_mutex_unlock(&global_malloc_lock);
    return (void*)(header + 1);
}

header_t *get_free_block(size_t size)
{
    header_t *curr = head;
    while(curr) {
        if (curr->s.is_free && curr->s.size >= size)
            return curr;
        curr = curr->s.next;
    }
    return NULL;
}

free(): so it will first determine if the block-to-be-freed is at the end of the heap. If it is, we can release it to the OS. Otherwise, all we do is mark it ‘free’, hoping to reuse it later.

This is how it will look:

void free(void *block)
{
    header_t *header, *tmp;
    void *programbreak;

    if (!block)
        return;
    pthread_mutex_lock(&global_malloc_lock);
    header = (header_t*)block - 1;

    programbreak = sbrk(0);
    if ((char*)block + header->s.size == programbreak) {
        if (head == tail) {
            head = tail = NULL;
        } else {
            tmp = head;
            while (tmp) {
                if(tmp->s.next == tail) {
                    tmp->s.next = NULL;
                    tail = tmp;
                }
                tmp = tmp->s.next;
            }
        }
        sbrk(0 - sizeof(header_t) - header->s.size);
        pthread_mutex_unlock(&global_malloc_lock);
        return;
    }
    header->s.is_free = 1;
    pthread_mutex_unlock(&global_malloc_lock);
}

First we get the header of the block we want to be free by getting the pointer behind the block by a distance equal to the size of hthe eader! We cast a block to a header pointer type and move it behind by 1 unit.

calloc(): This function allocates memory for an array of num elements of n size bytes and returns a pointer to allocated memory! Additionally memory is all set to zeroes!

Here’s how it looks:

void *calloc(size_t num, size_t nsize)
{
    size_t size;
    void *block;
    if (!num || !nsize)
        return NULL;
    size = num * nsize;
    /* check mul overflow */
    if (nsize != size / num)
        return NULL;
    block = malloc(size);
    if (!block)
        return NULL;
    memset(block, 0, size);
    return block;
}

realloc(): It basically changes the size of the given memory block to the size given!

We first get the block’s header and see if the block already has the size to accommodate the requested size if the current block does not have the requested size, then we call malloc() to get a block of requested size, and relocate the contents to a newer, bigger block!

Here is how it looks:

void *realloc(void *block, size_t size)
{
    header_t *header;
    void *ret;
    if (!block || !size)
        return malloc(size);
    header = (header_t*)block - 1;
    if (header->s.size >= size)
        return block;
    ret = malloc(size);
    if (ret) {

        memcpy(ret, block, header->s.size);
        free(block);
    }
    return ret;
}

So this was all about the memory allocator would love to hear if I had made any mistake!

Behind the Scenes of malloc(): Building a Memory Allocator from Scratch

`malloc()`

Subscribe to my newsletter

Kartik

Kartik