A collection of custom C++ allocators.
mkdir build && cd build
cmake ..
# build the static library (.a)
cmake --build . --config Release
# build and run performance benchmarks
cmake --build . --target run-benchmarks
# build and run individual tests
cmake --build . --target test-arena
# available:
# test-arena, test-pool, test-stack,
# test-free, test-track, test-rbtree,
# test-tree, test-buddy, test-slab,
# test-seg, test-allCustom allocators are tightly coupled with certain data structures, access patterns and object lifetimes. Just like choosing the right data structure, picking the correct allocation strategy can grant noticeable performance gains. Each allocator comes with its own quirks: different time complexities, memory overheads and optional constraints. It is crucial to pick the right tool for the job. The overview available below can help you do exactly that.
| Type | alloc_raw |
free_raw |
clear |
Overhead | Best for | Constraints |
|---|---|---|---|---|---|---|
| Arena | O(1) |
N/A | O(1) |
0B | Bulk allocations | No individual frees |
| Stack | O(1) |
O(1)* |
O(1) |
~8B | Temporary data | Strict LIFO |
| Pool | O(1) |
O(1) |
O(1) |
0B | Identical objects | Fixed sizes |
| Free List | O(n) |
O(n) |
O(1) |
~16B | General purpose | Slow search / coalesce |
| Free Tree | O(log n) |
O(log n) |
O(1) |
~32B | General purpose | High block overhead |
| Segregated | O(1) |
O(1) |
O(1) |
~16B | General purpose | Complex to tune |
| Buddy | O(1) |
O(1) |
O(1) |
~1-8B | OS memory | Fragmentation (in) |
| Slab | O(1) |
O(1) |
O(1) |
0B | Object caching | Small objects |
*Freeing a specific block also frees all allocations made after it.
Note: The time complexities above reflect the underlying raw memory algorithms (alloc_raw / free_raw). The type-safe make and destroy templates call these methods internally with near-zero overhead.
All allocators in this library inherit from the IAllocator base class.
It defines a simple virtual interface for raw memory management and provides C++20 templates for type-safe object construction and destruction.
class IAllocator {
public:
// ...
virtual void* alloc_raw(std::size_t size, std::size_t align) = 0;
virtual void free_raw(void* ptr) = 0;
virtual void clear() = 0;
virtual std::size_t capacity() const = 0;
virtual bool owns(void* ptr) const = 0;
// Handles object construction (placement-new)
// and destruction. Calls 'alloc_raw'/'free_raw' internally.
template<typename T, typename... Args>
requires (!std::is_array_v<T>)
T* make(Args&&... args);
template<typename T>
requires std::is_unbounded_array_v<T>
std::remove_extent_t<T>* make(std::size_t count);
template<typename T>
void destroy(T* ptr);
template<typename T>
void destroy(T* ptr, std::size_t count);
};You can find the full definition here.
The simplest allocator. It allocates a large, contiguous block on construction and uses m_offset to track the next available address. It is exceptionally fast but does not support individual free_raw operations.
Each time alloc_raw is called, size of said allocation is added to the m_offset.
Minimal fragmentation is achieved due to the sequential allocation - the only space wasted is used for alignment.
class ArenaAllocator: public IAllocator {
private:
void* m_start_ptr;
std::size_t m_total_size;
std::size_t m_offset;
public:
explicit ArenaAllocator(std::size_t size);
~ArenaAllocator();
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override { /* NOOP */ }
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
};When alloc_raw is called, it calculates the memory alignment, adds the requested size plus the alignment to the m_offset, and returns the previous address.
This makes alloc_raw an instant O(1) operation.
Because it only tracks a single forward-moving offset, you can't free individual objects.
You can only clear the entire arena, which is done by resetting m_offset to zero, resulting in a O(1) clear time complexity.
Arena allocator after two allocations. The offset points to the start of the free space.
Arena allocator after clear() is called. It holds no data.
As the name suggests, it treats the memory as a stack.
It builds upon the Arena allocator.
Contrary to the Arena, it supports individual free_raw operations, but due to the strict Last-In First-Out (LIFO) restrictions, you can only "pop" the most recently allocated element.
Its definition is practically identical to the Arena allocator, but it has an implemented method free_raw.
What differs is that each allocated block has a header before it, which stores the previous m_offset.
class StackAllocator: public IAllocator {
private:
void* m_start_ptr;
std::size_t m_offset;
std::size_t m_total_size;
public:
explicit StackAllocator(std::size_t size);
~StackAllocator();
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override;
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
};When alloc_raw is called, the allocator calculates the required padding for alignment, reserves space for both the header and the requested size, and writes the previous m_offset into the header just behind the returned pointer. This makes alloc an instant O(1) operation.
On free_raw, the allocator simply reads the header immediately before the given pointer and restores m_offset to the value stored in it, hence deallocating the block. Since we don't need to search for the data, it's also an O(1) operation.
Stack allocator after two allocations. The hidden headers (h1, h2) are placed immediately before the user data.
After calling free() on data 2, the allocator reads h2 and instantly snaps m_offset back to the end of data 1.
It's a very limited, but exceptionally fast and simple allocator. It works by dividing up its entire memory block into equally sized segments, reffered to as chunks.
To track free memory, it uses a singly linked list. To achieve zero memory overhead, it uses a intrusive list. It stores the "next" pointer directly inside the free chunks themselves. When a chunk is free, it acts as a list node. When it is allocated, that pointer is overwritten by the users data.
class PoolAllocator: public IAllocator {
private:
std::size_t m_chunk_size;
std::size_t m_chunk_align;
void* m_start_ptr;
std::size_t m_total_size;
void* m_free_list_head;
public:
explicit PoolAllocator(std::size_t chunk_size, std::size_t chunk_align, std::size_t chunk_count);
~PoolAllocator();
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override;
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
};When alloc_raw is called, it pops the m_free_list_head, updates the head to point to the next free chunk in the list, and returns the popped address.
Since there is no list traversal required, this is an instant O(1) operation.
On free_raw, it casts the passed ptr into a list node, sets its "next" pointer to the current m_free_list_head, and then updates m_free_list_head to point to ptr. This frees the chunk in O(1) time.
Pool allocator in a fragmented state. The free chunks store "next" pointers directly inside their empty space.
After calling free_raw() on data 1, it gets pushed to the front of the intrusive linked list. The head pointer is updated in O(1) time.
It's a specific type of general-purpose dynamic allocator, widely used under the hood for heap memory management. General-purpose allocators gained popularity due to having essentially no restrictions on allocation size or order. However, their performance varies greatly based on the underlying data structure used to track the free blocks:
- Free List: Achieves O(n) time complexity using a singly-linked or doubly-linked list.
- Free Tree: Achieves O(log n) time complexity using a bin search tree (e.g., Red-black tree).
- Segregated Fits: Achieves (mostly) O(1) time complexity using an array of size-segregated free lists (buckets).
The implementation below details the singly-linked free list.
The list is sorted by memory address (ascending), and is intrusive, just like in the Pool allocator.
What differs is that compared to the Pool allocator, the free block stores more data.
It stores the size of the free data, and the address to the next free block.
When a chunk of memory is allocated, it is prefixed with a AllocHeader, which holds the size and pad, required to align the data.
class FreeListAllocator: public IAllocator {
private:
struct FreeBlock {
std::size_t size;
FreeBlock *next;
};
struct AllocHeader {
std::size_t size;
std::size_t pad;
};
void* m_start_ptr;
std::size_t m_total_size;
FreeBlock* m_free_list_head;
public:
explicit FreeListAllocator(std::size_t size);
~FreeListAllocator();
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override;
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
};For clarity purposes, the helpers/handlers are not shown above. You can find the full definition here.
When alloc_raw is called, it traverses through the list of free blocks, picking one matching the wanted size the closest (best-fit), or the first that can contain the memory to allocate (first-fit), and allocates the data with the hidden header before it.
Due to the required list traversal, the time complexity is O(n).
When free_raw is called, reads the size and pad from the hidden header, and casts the memory back into a FreeBlock.
To prevent external fragmentation, it then coalesces neighboring free memory blocks.
Coalescence is achieved by checking if the end of one free block perfectly touches the start of the next.
If they touch, their sizes are summed together, and the second block is physically removed from the linked list.
To prevent pointer invalidation bugs, it is best practice to always coalesce a block with its next neighbor before coalescing with its previous neighbor.
Because finding the correct insertion point in the address-sorted list requires traversal, free_raw also carries an O(n) time complexity.
Free list in a highly fragmented state.
After calling alloc_raw() for 20B, the allocator uses "first-fit" to slice 50B (header + data 3) off free 1. The remainder of free 1 shrinks to 70B.
After calling free_raw() on data 2, the allocator detects that free 2, the newly freed block, and free 3 are adjacent. It coalesces them into a single 250B block.
While the Free list allocator is highly memory-efficient, traversing a linked list to find a suitable block results in an O(n) time complexity. The Free tree allocator solves this by replacing the linked list with a Red-black tree.
Instead of free blocks storing a simple next pointer, they store the metadata required to act as a node within a self-balancing binary search tree.
This drops the time complexity for finding a best-fit block down to O(log n).
To achieve O(1) neighbor coalescence without list traversal, this allocator implements boundary tags (used also e.g. in dlmalloc) around every block.
It also utilizes a hidden back pointer stored in the alignment padding, allowing it to locate a block's metadata instantly during a free_raw operation.
class FreeTreeAllocator: public IAllocator {
private:
struct BlockMetadata {
std::size_t size_and_flags; // encodes size and allocated flag
// ... helper methods ...
};
using AllocHeader = BlockMetadata;
using AllocFooter = BlockMetadata;
void* m_start_ptr;
std::size_t m_total_size;
internal::RBTree* m_free_tree;
public:
explicit FreeTreeAllocator(std::size_t size);
~FreeTreeAllocator();
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override;
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
};For clarity purposes, the helpers/handlers are not shown above. You can find the full definition here.
When alloc_raw is called, it searches the red-black tree for the best-fitting block in O(log n) time.
If the block is significantly larger than requested, it is split, and the remainder is re-inserted into the tree.
A pointer is then aligned, and a hidden back pointer is written into the padding exactly behind the returned address.
When free_raw is called, it steps back exactly sizeof(void*)B to read the back pointer, instantly locating the AllocHeader.
It then uses the boundary tags (header and footer) to check the left and right neighbors in O(1) time, coalesces them if they are free, removes them from the tree, and inserts the newly merged block into the tree.
Allocated blocks hide a back pointer in the alignment padding. Free blocks overwrite the user data to store red-black tree navigation pointers.
When the target data is freed, the allocator reads the footer to its left and the header to its right. Because the right block is a epilogue, it acts as a wall, preventing out-of-bounds memory access.
After coalescing, the target block and the left free block are merged into a single space and inserted into the red-black tree as a new node.
The Segregated allocator is a high-performance, general-purpose memory manager. It solves the O(n) traversal problem of the standard Free list allocator by maintaining an array of distinct free lists, called buckets. Each bucket strictly holds memory blocks of a specific size class (in this case, powers of two).
By combining size segregation with an intrusive doubly-linked list and a buddy-style splitting algorithm, it guarantees a strict O(1) time complexity for both allocation and deallocation, making it ideal for real-time systems like game engines.
To avoid the "floating header" problem and ensure instant O(1) coalescing, it anchors a 1-byte AllocHeader at the physical start of the block, and drops a 1-byte back-pointer offset exactly behind user's aligned memory.
class SegregatedAllocator: public IAllocator {
private:
struct AllocHeader {
std::uint8_t order : 6;
std::uint8_t is_free : 1;
std::uint8_t is_prev_free : 1;
};
struct AllocFooter {
std::uint8_t order;
};
struct FreeBlock {
FreeBlock* prev;
FreeBlock* next;
};
static constexpr std::uint8_t NUM_BUCKETS = 25;
static constexpr std::uint8_t MIN_BUCKET_ORDER = 5;
void* m_start_ptr;
std::size_t m_total_size;
std::array<FreeBlock*, NUM_BUCKETS> m_buckets;
std::uint32_t m_active_buckets; // each bit is a bucket: 1 if free, 0 if not
public:
explicit SegregatedAllocator(std::size_t size);
~SegregatedAllocator() override;
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override;
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
};For clarity purposes, the helpers/handlers are not shown above. You can find the full definition here.
When alloc_raw is called, it calculates the required size and uses a bitwise operation (std::countr_zero) to find the target bucket.
Rather than iterating through the array to find an available block, it queries m_active_buckets - a 32b integer acting as a bitmask.
It locates the first available larger block in one CPU cycle via std::countr_zero.
It then recursively splits this block down in a buddy-style cascade until the target bucket is populated.
It then pops the block from the intrusive list in O(1) time, aligns the data pointer, and writes the offset back pointer right behind the data.
When free_raw is called, it steps back exactly one byte from the data pointer to read the offset, instantly jumping to the start of the block to grab the AllocHeader.
In a standard segregated allocator, every call to free_raw would halt the execution to coalesce.
While it is mathematically O(1), it causes a lot of cache misses.
By deferring this process, free_raw skips neighbor checking entirely and simply pushes the block to the front of its respective bucket list.
This prevents cache thrashing and allows the allocator to handle chaotic, fragmented scenarios over 3 times faster than std::malloc.
The trade-off is fragmentation.
If an allocation fails because no contiguous blocks are logically available, the allocator pauses to trigger coalesce_all. It's a full heap sweep that uses bitwise XOR math to locate buddies, merge them together, and allow the allocation. In a moreso production environment, this sweep can be called manually e.g. during loading screens in video games.
To serve a request, a 256B block (order = 8) is split. The left buddy is claimed, and the right buddy is pushed to the Order-7 free list.
After calculating alignment padding, a 1-byte back-pointer offset is dropped exactly behind the user's payload to allow O(1) backtracking.
On free_raw, the allocator steps back to read the physical header and checks its neighbors. It finds its original buddy from step 1 and unlinks it.
Unlike the previous allocators, the Tracking allocator does not manage memory directly. Instead, it acts as a wrapper around any other existing allocator. Its primary purpose is debugging.
It intercepts calls to alloc_raw and free_raw, records memory metrics, and then forwards the request to the allocator. It is a great tool for profiling memory usage, detecting leaks, and measuring peak memory consumption during app runtime.
class TrackingAllocator: public IAllocator {
private:
IAllocator& m_base_allocator;
std::unordered_map<void *, std::size_t> m_active_allocs;
std::size_t m_curr_alloced_bytes = 0;
std::size_t m_peak_alloced_bytes = 0;
public:
explicit TrackingAllocator(IAllocator& base_allocator);
~TrackingAllocator();
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override;
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
std::size_t curr_bytes() const { return m_curr_alloced_bytes; }
std::size_t peak_bytes() const { return m_peak_alloced_bytes; }
std::size_t active_allocs() const { return m_active_allocs.size(); }
};When alloc_raw is called, it increments its internal counters, calculates if a new m_peak_alloced_bytes has been reached, and then simply returns the result of m_base_allocator->alloc().
When free_raw is called, it decrements m_curr_alloced_bytes and forwards the pointer to m_base_allocator->free(). Because it only performs basic arithmetic before delegating the actual work, the overhead is extremely minimal, maintaining the time complexity of the underlying allocator.
The Buddy allocator is a highly efficient memory manager frequently used at the OS level (e.g. in the Linux kernel). It manages memory by dividing it into partitions to try to satisfy a memory request as suitably as possible.
It works strictly with block sizes that are powers of two. To keep track of available memory, it uses an array of intrusive doubly-linked lists, where each order represents a specific power-of-two block size. Because it rounds every allocation up to the nearest power of two, it completely eliminates external fragmentation at the cost of internal fragmentation.
Its biggest advantage is its very fast coalescence, achieved through bitwise arithmetic rather than list traversal.
class BuddyAllocator: public IAllocator {
private:
struct AllocHeader {
std::uint8_t data; // MSB - is_free, rest - order
// ... helper methods ...
};
struct FreeBlock {
AllocHeader header;
FreeBlock* prev;
FreeBlock* next;
};
std::size_t MIN_BLOCK_SIZE = 32;
std::uint8_t MAX_ORDER = 32;
void* m_start_ptr;
std::size_t m_total_size;
std::array<FreeBlock*, MAX_ORDER> m_free_lists;
public:
explicit BuddyAllocator(std::size_t size);
~BuddyAllocator();
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override;
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
};For clarity purposes, the helpers/handlers are not shown above. You can find the full definition here.
When alloc_raw is called, it calculates the total required size (including the alignment padding and the header) and determines the target order (power of two).
It then checks the free list for that specific order.
If no blocks are available, it finds the next largest available block and recursevily splits it in half. Each split creates two "buddies".
The left buddy is used for the allocation, and the right buddy is pushed onto the lower-order free list.
To support strict alignment, a header is placed exactly behind the aligned data pointer, storing the offset back to the physical block start.
Because the max order is a constant, this operation is of O(1) time complexity.
When free_raw is called, it steps back to read the hidden header and uses the stored offset to locate the physical start of the block.
To coalesce, it calculates the address of its "buddy" using a simple bitwise XOR operation on its relative memory address.
If the buddy is also free and of the same order, they are instantly merged into a single, larger block.
This process repeats recursively up the hierarchy.
This mathematical approach allows coalescence in O(1) time without traversing any lists or relying on headers/footers.
Buddy allocator beginning a split cascade. To serve a small allocation, a 128B block is halved into two 64B buddies.
After splitting again, a 32B block is reserved. A header is placed before the aligned data, leaving the 32B and 64B buddies in free lists.
After calling free_raw, the allocator reads the hidden header. It uses a bitwise XOR operation to instantly locate the adjacent 32B buddy.
The allocator recursively merges the free buddies. The two 32B blocks merge into 64B, and the two 64B blocks zip back into a single 128B block.
The Slab allocator is a highly efficient object-caching memory manager. It is designed to completely eliminate internal fragmentation for small objects and avoid the overhead of storing metadata next to every single allocation.
Instead of managing memory byte by byte, it requests large, page-aligned blocks of memory (slabs) from a base allocator (usually the Buddy allocator) and divides them into fixed-size chunks.
It uses an array of CacheManager objects, where each cache handles a specific power-of-two size class.
Because it operates on fixed-size slots within page-aligned boundaries, it requires zero per-object metadata.
class SlabAllocator: public IAllocator {
private:
struct SlabHeader {
SlabHeader* prev;
SlabHeader* next;
void* free_list_head;
std::uint16_t used;
std::uint16_t capacity;
std::uint16_t cache_idx;
std::uint16_t id;
};
struct CacheManager {
std::size_t object_size;
SlabHeader* empty_slabs;
SlabHeader* partial_slabs;
SlabHeader* full_slabs;
void init(std::size_t size) {
// ...
}
};
std::uint8_t NUM_CACHES = 9;
std::uint8_t MIN_CACHE_ORDER = 3;
std::uint16_t SLAB_ID = 0x51AB;
std::size_t m_total_size;
std::size_t m_page_size;
IAllocator* m_base_allocator;
std::array<CacheManager, NUM_CACHES> m_caches;
public:
explicit SlabAllocator(std::size_t size);
~SlabAllocator();
void* alloc_raw(std::size_t size, std::size_t align) override;
void free_raw(void* ptr) override;
void clear() override;
std::size_t capacity() const override;
bool owns(void* ptr) const override;
};For clarity purposes, the helpers/handlers are not shown above. You can find the full definition here.
When alloc_raw is called, the allocator first checks if the request exceeds its maximum cache size.
If it does, the request is routed directly to the m_base_allocator.
If the size is small, it determines the correct CacheManager and looks for an available slot in the partial_slabs, later checking the empty_slabs lists.
If a slab is found, it pops the head of the intrusive free list in O(1) time, increments the used counter, and promotes the slab to the full_slabs list if capacity is reached.
If no slabs are available, it requests a new page from m_base_allocator, formats it with a SlabHeader, and carves the rest into new slots.
When free_raw is called, the allocator faces a unique problem - determining whether the pointer belongs to a Slab cache or is a huge block managed by m_base_allocator.
This issue is commonly solved by passing the size of an allocation as an argument, and checking if it exceeds the maximum cache size.
The implementation in this codebase however, doesn't pass size as an argument, so it requires a different approach.
Because slab pages are strictly aligned to m_page_size, the allocator applies a bitmask to the pointer to instantly find the start of memory page.
It then checks the id field of the struct sitting there.
If the ID matches the predefined SLAB_ID (0x51AB, standing for 'slab' obviously), it is a slab block, and the pointer is pushed back into the slab's internal free list.
The slab is transitioned between the full/partial/empty lists as needed.
If the ID does not match, the allocator immediately delegates the free_raw operation to m_base_allocator.
A 4KB page is requested from the base allocator. A header is placed at the front, and the remaining space is divided into chunks linked via an intrusive list.
After two allocations, the objects are cached tightly together with zero metadata overhead. The slab's free list head pointer simply shifts to the next available slot.
On free_raw, the allocator applies a bitmask to the pointer to instantly jump back to the page-aligned header. It checks the ID to confirm it belongs to a slab.
If a memory request exceeds the maximum cache size, the slab structure is completely bypassed, and the allocation is routed directly to the base allocator.
While oo-alloc is meant for educational purposes, the standard library contains a great production-grade alternative.
This is how oo-alloc maps to std::pmr:
oo-alloc concept |
std::pmr equivalent |
Description |
|---|---|---|
IAllocator |
std::pmr::memory_resource |
The abstract base class representing a memory pool. |
alloc_raw/free_raw |
allocate/deallocate |
Public interface for raw byte management. |
make<T>/destroy<T> |
std::pmr::polymorphic_allocator<T> |
The mechanism that bridges raw bytes to the C++ object lifecycle. |
#include <memory_resource>
// create the raw buffer memory resource
std::pmr::monotonic_buffer_resource resource(1024);
// create a typed wrapper bound to the resource
std::pmr::polymorphic_allocator<Thing> alloc(&resource);
// allocate raw memory
Thing* thing = alloc.allocate(1);
// construct the object
alloc.construct(thing, 4562);
// destroy and deallocate
alloc.destroy(thing);
alloc.deallocate(thing, 1);#include <oo_alloc/ArenaAllocator.hpp>
// create the allocator
oo_alloc::ArenaAllocator arena(1024);
// allocate and construct
Thing* thing = arena.make<Thing>(4562);
// destroy and deallocate
arena.destroy(thing);All measurements are single-threaded and run on google/benchmark.
std::malloc is included as a reference baseline; note that it is thread-safe,
which the allocators in this library are not — the comparison reflects raw
algorithmic cost under specific workloads, not equivalent guarantees.
Numbers below are from one machine (ARM M4), your results may vary.
Mixed alloc/free workload designed to fragment the heap. std::malloc and the Free tree slow down as the heap fragments — malloc because internal coalescing scans grow with allocation count, the Free tree because of rebalancing overhead. The Segregated allocator runs ~5× faster than std::malloc on this workload thanks to per-size-class free lists and deferred coalescing.
Frees objects in random order, forcing the Free list to repeatedly traverse the address-sorted free list — quadratic in the number of live blocks. The Buddy and Segregated allocators stay close to Pool/Slab throughput here because their coalescing is constant-time (XOR-based buddy lookup), independent of free-list length.
Allocation latency as a function of heap fragmentation. The Free list scales linearly with the number of free blocks (search cost), the Free tree logarithmically (RBTree lookup), and the Segregated allocator stays flat — the bucket lookup via std::countr_zero is constant-time and independent of how fragmented the heap is.
Sequential allocation throughput with no frees. Arena and Stack set the upper bound — both reduce to a pointer bump. The Segregated allocator reaches ~180M allocations/second, within the same order of magnitude as the bump allocators because its hot path is a bitmask lookup and a free-list pop.
Throughput on uniformly random block sizes. The Buddy allocator spends most of its time recursively splitting and coalescing power-of-two blocks. The Segregated allocator defers coalescing until an allocation fails, avoiding the per-call overhead and maintaining higher throughput on this mix.
- Handle thread safety.
- Add a benchmark/performance README section.
- Implement a size segregated free list allocator.
- Add
ownsmethod. - Update readme after changes
- Move
alloctoalloc_rawand make a inline templatemake - remove
realloccompletely - move
initlogic to constructors - Move from hand-written tests to proper stress testing.
- Implement a slab allocator.
- Implement a pow-of-two buddy allocator.
- Move benchmarks to google/benchmark.
- Add realloc description in README.
- Refactor and clean up API.
- Move free-list coalescing to a helper function.
- Implement a red-black tree free list allocator.
- Move from malloc to mmap/VirtualAlloc.
- Add realloc functionality.
- Add a build README section.
- Write a description for each allocator.
- Implement a explicit free-list allocator.
- Linear Allocator from Nicholas Frechette's Blog
- Writing My Own Malloc in C by Tsoding
- mtrebi's memory-allocators repository
- gingerBill's Memory Allocation Strategies - Article Series
- CS107 - Explicit Free List Allocator, Stanford University
- MallocInternals from glibc wiki
- Slab allocation Wikipedia page
- Red-black tree Wikipedia page
- Data Structures and Algorithms - Red-black trees, University of Michigan
- Red-Black Trees chapter from Introduction to Algorithms by Cormen, Leiserson, Rivest and Stein
- CS0330 - Malloc, Brown University