The Hoard Memory Allocator

gregburd · on March 26, 2012

Hoard is fine, but if we're going to start talking about various scalable memory allocators designed for concurrent, multi-core/cpu systems then...

libumem is the user space slab memory allocator first available in Solaris 9 (SunOS 5.4) now the default allocator on Solaris (and Illumos, SmartOS, OpenIndiana, etc.). There is a fork of libumem that has been ported to other popular operating systems, such as Linux, Windows and *BSD systems (including Darwin/OSX) by OmniTI (https://labs.omniti.com/labs/portableumem). I maintain a fork of portable libumem (https://github.com/gburd/libumem) that includes changes made by Joyent as part of their ongoing work to improve SmartOS.

I have deployed this allocator to dozens of production systems to improve the performance of highly concurrent memory-intensive applications (such as Riak) and found it to be an excellent, stable and fast allocator.

In addition to fast allocations it includes excellent statistics and memory leak detection (https://blogs.oracle.com/pnayak/entry/finding_memory_leaks_w...) as well as a few different allocator heap-fit algorithm choices.

It is licensed under the CDDL.

vgnet · on March 26, 2012

There's a very interesting comparison between libumem (mtmalloc) and Hoard at http://www.oracle.com/technetwork/articles/servers-storage-d... (HN link: https://qht.co/item?id=3755621). How do you view this class of allocators to tcmalloc, jemalloc, ptmalloc3, etc., specially in the concurrent multi-cpu scenario?

Here's what I'll be reading on allocators tonight, any further suggestions (following all the links in comments is part of the deal)?

http://en.wikipedia.org/wiki/C_dynamic_memory_allocation

http://www.facebook.com/notes/facebook-engineering/scalable-...

http://locklessinc.com/benchmarks_allocator.shtml

http://blog.reverberate.org/2009/02/20/one-malloc-to-rule-th...

scott_s · on March 26, 2012

Any writeups on the design and algorithms used?

TwoBit · on March 26, 2012

We've looked into Hoard. It's fast only if you have a large number of threads that are -constantly- allocating memory, and even then it wastes more memory than a regular heap. The large majority of real-world applications don't do this, but some server applications might. I think Hoard could be useful for that. On the other hand, if performance was critical to you then you'd change your code to not allocate memory so much.

A practical problem we have with Hoard is that it lacks diagnostics. By diagnostics I mean extensive heap reports, fragmentation information, free space, what's the largest block I can safely allocate, give me a callback for every malloc so I can do stats tracking, etc.

scott_s · on March 26, 2012

On the other hand, if performance was critical to you then you'd change your code to not allocate memory so much.

This was a problem we ran into with the Streamflow project (http://people.cs.vt.edu/~scschnei/streamflow/). There was no benchmark suite for multithreaded allocators, so we had to pull from HPC benchmark suites, but what's the first optimization many people do when improving an application's performance? Remove dynamic memory allocation.

We were motivated to do our work because of runtimes for multithreaded programming that need to dynamically allocate memory to represent parallel tasks. And I think that motivation remains, but that pain wasn't felt until recently.

onemoreact · on March 26, 2012

I don't think the parent post was suggesting you completely avoid dynamic allocation just minimize it. Because it's not a question of dynamic vs static memory but # of allocations / second / thread. And often it's a question of delayed optimization where you suspect what your doing needs to change, but you avoid optimizing until you have some performance data.

scott_s · on March 26, 2012

I was more talking about the difficulty of testing the performance of a multithreaded dynamic memory allocator. Because moving dynamic memory allocation off the critical path of an application is a common first optimization, it was difficult to find interesting applications to use as benchmarks.

scott_s · on March 26, 2012

I published a paper about a lock-free allocator back in 2006 that performed very well against Hoard and TCMalloc (http://goog-perftools.sourceforge.net/doc/tcmalloc.html), you can find the paper and source code here: http://people.cs.vt.edu/~scschnei/streamflow/

I think that our techniques worked very well, but TCMalloc is a better engineered allocator. That is, it's production quality code. People have used our allocator in their research projects when they wanted a low-latency, highly scalable allocator, but I'm not aware of anyone using it for production - and I no longer maintain it, so that's probably for the best.

There was recently a paper in PACT last year that extended on our design. Myself and my advisor would like to release a tech report with a more full description of our algorithms and designs, but both of us are busy with our current work.

dnda · on March 26, 2012

Unfortunately their performance comparison does not include modern memory allocators like tcmalloc or jemalloc.

_delirium · on March 26, 2012

It appears to have made a splash around 2000-2003 or so, which is when most of the publications and testimonials/comparisons are from. Looks like it's been mostly in maintenance mode since then, with relatively minor performance and compatibility releases around once per year, which may be why they haven't done a re-assessment against more recent competitors.

willvarfar · on March 26, 2012

Hoard is very nice and this kind of programming really interests me.

An alternative is tcmalloc by Google:

http://goog-perftools.sourceforge.net/doc/tcmalloc.html

maqr · on March 26, 2012

I see the license is both GPLv2 and proprietary for commercial use. I was unaware that licensing could be done that way. Is that a valid use of GPL?

pgeorgi · on March 26, 2012

If all copyright holders agree, then yes (see mysql).

The main issue with that is that you have some problems with accepting external contributions - you need a copyright assignment process, which is difficult to do internationally and a hurdle for contributors to overcome.

mkup · on March 26, 2012

It's a valid use of GPL in dual licensing context. Other examples of dual licensing are: VirtualBox, MongoDB, BerkeleyDB.

All patch contributors who initially chose GPL side of dual licensing model should sign copyright disclaimer for their patches, otherwise their code won't be accepted into the main repository. Alternatively, patchmaker may execute his/her GPL rights and fork the project (e.g. Percona).

ilikejam · on March 26, 2012

Sure. If you wrote it, you can license it any way you like. The Qt framework is probably the most famous example of dual licensed (GPL and commercial) software. http://qt.nokia.com/products/licensing

merijnv · on March 26, 2012

A license is just that, a license to use the code. The copyright holder of a piece of code (or really any other copyrighted data) may license it to whomever they want, however they want. The GPL is just a blanket license giving permission to everyone provided they follow the restrictions put forth by the GPL.

If you are the sole copyright holder (because you are the only person who worked on a project, or because all others assigned their rights to you) then you are allowed to provide any additional licenses besides the GPL. A lot of open source projects (see Qt, MySQL and a bunch of others) work this way, only accepting patches from the community if the patches come with a copyright assignment. This allows them to charge for closed source use of the library while at the same time allowing the open source community to benefit.

ZeroGravitas · on March 26, 2012

It's actually GPLv2 or proprietary for "commercial use" at the choice of the user which is reasonably common.

Since "commercial use" can include cases where GPLv2 is acceptable to the user (either because you don't care about handing out the source code e.g. because you're selling expensive hardware or consulting, or because you're running it on a server so the GPL doesn't apply to you) you can't forbid those uses and offer GPL, but you can offer things beyond the GPL (i.e. keeping modifications closed, which is what they're trying to imply with "commercial use").

simonster · on March 26, 2012

Jason Evans's benchmarks (http://www.facebook.com/notes/facebook-engineering/scalable-..., 3/4 of the way down the page) don't look good for Hoard, but they may be a little biased.

scott_s · on March 26, 2012

That's similar to what I found. (See my paper referenced elsewhere in the thread.)