QNX From The Board Up #14 - free()

Another quick detour to talk about free() and releasing memory.

Welcome to the blog series "From The Board Up" by Michael Brown. In this series we create a QNX image from scratch and build upon it stepwise, all the while looking deep under the hood to understand what the system is doing at each step.

With QNX for a decade now, Michael works on the QNX kernel and has in-depth knowledge and experience with embedded systems and system architecture.


free()

One more detour, because I started yammering on about how we could write a malloc() implementation that was nothing but a thin wrapper around mmap() with MAP_PRIVATE and MAP_ANON, but I didn't get into the implementation of free(), which "should be pretty easy".

Since you can ask for memory from the OS with mmap(), it makes sense that you can give it back with munmap(). Let's take a look at this function:

int munmap(void* addr, size_t len);

The first param is easy: it's the pointer to the memory we want to free, i.e. the address returned by a successful mmap(), which is what our malloc(), my_malloc(), handed back to the caller.

But, that second param is getting a little tricky, because we did not record the amount of memory we asked for with the mmap(). The info was passed into malloc(), but, we kinda threw it away. And now we need it again.

We could ask the person who called malloc() to keep track of it, but, how would they pass it into free(), which has the signature

void free(void* ptr);

?

Shoot. This was supposed to be the easy part. It seems our approach of skipping requirement capture and The Thinking has already painted us into a corner.

Requirements Requirements Requirements

At this point, I will say that requirement elucidation can be extremely helpful here.

Many moons ago I worked on an embedded image capture system. The device was either powered on and running, or unpowered, and power was only ever removed abruptly.

We started using one small lib for something, I forget what, and it needed to call malloc() during initialization because ... Whatever. Until we started using this lib, our code had never called malloc(), calloc() et al. All memory had static or automatic storage duration. Because we linked the C library statically, its heap management code never got hauled in. But with this lib calling malloc(), it was, and the binary grew so much that we couldn't fit it onto the device anymore.

Instead of looking at optimizing for size, we just said "Let's provide our own malloc()" which:

  • returned a pointer to a buffer (static storage duration),
  • incremented an index into the buffer, taking alignment into account,
  • asserted if called more than the 3 (IIRC) times malloc() was called by this lib.
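The three bullets above can be sketched roughly as follows. This is a reconstruction under assumptions, not the original project's code: the name bump_malloc() (chosen so it does not clash with the C library's malloc()), the pool size, and the use of C11 max_align_t for alignment are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4096u  /* hypothetical: big enough for the lib's requests */
#define MAX_CALLS 3u     /* the lib was only expected to allocate 3 times */

_Static_assert(POOL_SIZE % _Alignof(max_align_t) == 0,
               "pool size must be a multiple of the alignment");

/* The buffer has static storage duration and worst-case alignment. */
static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t   pool_used;  /* index of the next free byte */
static unsigned calls;      /* how many times we have been called */

void *bump_malloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);

    /* More calls than expected means our assumptions about the lib
     * are wrong, so stop right here. */
    assert(calls < MAX_CALLS);
    calls++;

    if (size > POOL_SIZE - pool_used) {
        return NULL;  /* not enough room left in the pool */
    }

    void *p = &pool[pool_used];

    /* Round the request up so the next pointer stays aligned. */
    pool_used += (size + align - 1) & ~(align - 1);
    return p;
}
```

There is no bookkeeping per allocation at all, which is exactly why nothing in here could ever support a real free().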

Since this lib also had code that called free(), we had to provide a stub for that too. But, we never "shut down" gracefully (power just disappeared) therefore we never got to the path in this lib that caused free() to be called. So, our implementation of free() became

#include <assert.h>
#include <stdbool.h>

void free(void* p)
{
    (void)p;        /* unused */
    assert(false);  /* we never expect to get here */
}

It did the job.

Now, if you have a process that only ever:

  • during initialization:
    • grabs memory up front
  • during execution:
    • uses that memory
  • during shutdown:
    • frees memory
    • terminates

a free() that does nothing at all could be a perfectly good implementation for you, too.

"But it will be a leak!"

No, because the OS recovers all the memory given to the process when the process terminates. In this scenario, it's actually a bit of a waste to call free() for every single allocation when all you're going to do next is terminate the process, whereupon the OS is going to say "I'm taking all this memory back en masse."

Math quiz: How many times can you successfully run this program on a computer with 16 GB of RAM, on an OS like QNX, with no other programs running?

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    void* p = malloc(1u * 1024 * 1024);
    printf("%p\n", p);
    return (NULL == p) ? EXIT_FAILURE : EXIT_SUCCESS;
}

Answer: Infinity times.

So, why do you often hear "Make sure you call free()!"?

Let's look at a different use case where you're:

  • constantly malloc()ing and free()ing, and
  • running a very long time.

In this case, forgetting to free() will mean memory is acquired and acquired and acquired... until you've reached some limit: the amount your process is allowed to have (RLIMIT_AS), the largest contiguous block the malloc()/free() implementation can provide, or the amount of RAM free in the system.

History has shown that big programs often start as little programs, so, for the sake of scalability, free()ing what you've acquired is a good habit.

Ensuring something is free()d also helps you keep in mind who has ownership of the memory, i.e. who is responsible for making sure that

  • the memory is free()d, and
  • nobody else needs it or is using it when it is free()d.

So, how about that free() impl?

Oh, right, uh, let's put a struct at the front of the allocation which keeps track of the size and also takes alignment requirements into account. Left as an exercise for the reader.

There is no shortage of ways to solve this problem. What's best depends on your requirements. What are your allocation patterns? What's the histogram of sizes? Single-threaded, multi-threaded, or a mix? If multi-threaded, how many threads and CPUs, and does memory get malloc()ed on one thread and free()d on another? Do all threads have the same patterns and histograms? Alignment requirements? Heavy use of realloc() (typical when parsing)? And so on.

The C library is there for the most common use cases. Over time, some use cases have become more common. In QNX 8, we changed our implementation of malloc() etc. to be much better with multi-threaded scenarios. When your average "embedded system" has 8+ CPUs, multi-threaded processes are not uncommon.

Coming Up...

Next, we're back to mmap() and a peek at using MAP_PHYS to get access to specific physical address ranges.