Introduction

In previous post , We implemented basic allocator which can be used in production enviornment for very faster memory allocation queries and we took the first step towards your first billion $$$. It was perfect but lets add more code to turn your billion dollars into millions. Let’s change the strategy to improve fragmentation by adding more code (more code -> more bugs -> more loss). Hence no-code is the future.

There are two types of fragmentation, internal fragmentation, if 100 bytes are being asked to allocate, malloc returns block which size is 104. 4 bytes extra. Keep asking 100 bytes, 100 times, you will get 400 bytes extra in total but spread across the blocks in 4 bytes chunk. External fragmentation is if variable sized blocks are requested but there are none of the free blocks are in asked size. We’ll go deeper in later..

Splitting

In previous post, we used to any block that would be larger than or equal size of the requested size. size <= currernt->size. Now, We’ll check that if block can be split into two blocks. New blocks’ sizes will be size and current->size - size. To Split, block size should be larger than the usable size. Usable size is summation of block metadata (sizeof Block) and minimum block size (8). Therefore block should only be splited if its size is greater than (or equal to) summation of requested size, block metadata and minimum block size. If leftover space after splitting is less than the block metadata + minimum block size then its not worth splitting, it would create an unsusable fragment.

Let’s visualize it. We have block size of 100 bytes.

+------------------------------+
|         Block Header         |  <-- current (size = 100)
+------------------------------+
|                              |
|         Free Memory          |
|      (size = 100 bytes)      |
|                              |
+------------------------------+

Now, requested size is 40 bytes, so we need minimum 40 + Block Size (16) + Minimum split size (8) = 64 bytes of the block for able to split without unusable fragment.

+------------------------------+------------------------------+
|       Block Header A         |       Block Header B         |
+------------------------------+------------------------------+
|  Allocated to user (40 B)    |  Remaining free (44 B)       |
|                              |                              |
+------------------------------+------------------------------+

Block A total size: 40 (user) + 16 (header) = 56
Block B total size: 44 (free) = 44 ≥ MIN_SPLIT_SIZE

If block size would have been 50, it’s no use to split as it would create 10B of unusable fragment.

Post split, new block should be added to the free list.

In code, we need to make changes as below in malloc,

#define MINIMUM_SPLIT_SIZE 8

void *malloc(size_t size)
{
    Block *prev = NULL;
    Block *current = free_list;

    size = (size + 7) & ~7; // Align to 8 bytes

    // Search free_list only (not all blocks)
    while (current)
    {
        if (current->size >= size)
        {
            // Check eligibilty of splitting.
            if (current->size >= (size + BLOCK_SIZE + MINIMUM_SPLIT_SIZE))
            {
                // Current block size would be reduced to requested size.
                // new_block will be splited free block
                Block *new_block = (Block *)((char *)current + BLOCK_SIZE + size);
                new_block->size = current->size - size - BLOCK_SIZE;
                new_block->next = current->next;

                current->size = size;
                if (prev)
                {
                    prev->next = new_block;
                }
                else
                {
                    free_list = new_block;
                }
            } else  {
                // Remove from free list
                if (prev)
                {
                    prev->next = current->next;
                }
                else
                {
                    free_list = current->next;
                }
            }
            return (void *)(current + 1);
        }
        prev = current;
        current = current->next;
    }

   ...
}

Address ordered free_list

As we are splitting the blocks, we need to coalesce the block as well to create a big block that can be splitted if required or satisfy the request of the similar size. From previous post, calling free is adding the freed block to the free_list’s head as shown below,

              +---------+     +---------+     +---------+     +---------+
free_list --> | Block A | --> | Block D | --> | Block B | --> | Block C | --> NULL
              +---------+     +---------+     +---------+     +---------+

From random list, it will be additional effort to sort the list based on their addresses and then run coalesce logic. We can improve free to add free block in address order (sorted by memory address) so we can coalesce the consucutive block into one free block.

void free(void *ptr)
{
    if (!ptr)
        return;     // Silently return, not failing not. It should fail.

    Block *block = (Block *)ptr - 1;    // ptr points to payload not the Block Metadata.

    // Insert at correct position in adddress order for coalescing.
    if (!free_list || block < free_list)
    {
        // Free list is either empty or block is located
        // earlier in the memory than current head of free_list.
        block->next = free_list;
        free_list = block;
    }
    else
    {
        Block *curr = free_list;
        while (current->next && current->next < block)
        {
            current = current->next;
        }
        block->next = current->next;
        current->next = block;
    }

    // Call Coalescing logic?
    coalescing_free_list()
}

free_list will remain sorted by memory address.

Coalescing

There’s an option to call coalescing logic after the block is placed at its address ordered place. Coalescing everytime there’s new free block on the list seems overly aggressive as same block might colesced and splitted repeatedly and not coalescing everytime we might end up not using the free block avaialble despite the availibility. In Glibc, there are multiple lists called bins. Bins categorize free memory chunks by size and optimize their reuse. There are fastbins, smallbins and largebins. Whether coalescing occurs immediately or is deferred depends on which bin the chunk is placed in.

Fastbins is for the smaller allocation and lazily coalesced where smallbins and largebins are for medium to large allocations and immediately coalesced on free.

For simplicity, we’ll do coalescing each time we free the block.

void coalescing_free_list()
{
    Block *curr = free_list;

    while (current && current->next)
    {
        // Char *, it moves N bytes instead of sizeof Block. Raw memory layout math!!
        // This byte address right after the current block ends which is where next adjacent block would begin.

        /**
         *      +------------------+-----------------------------+
         *      | Block Metadata   |    Usable memory (size B)   |
         *      +------------------+-----------------------------+
         *      ^ current          ^ current + BLOCK_SIZE        ^ curr_end
         */

        char *curr_end = (char *)current + BLOCK_SIZE + current->size;

        if (curr_end == curr->next)
        {
            // Adjacent: Let's merge
            current->size = current->size + BLOCK_SIZE + current->next->size;
            current->next = current->next->next;
        }
        else
        {
            current = current->next;
        }
    }
}

It’s the basics of how basic memory allocator works.