libhtp
[rfc] Memory wrappers and accounting v4
Not for merge.
In my quest to find where Suricata spends its memory, I've created this branch to track libhtp's memory allocations and frees. It creates wrapper functions for malloc, calloc and friends: htp_malloc, htp_calloc, etc.
Each of those then updates a global counter (using gcc atomics to be thread safe), which indicates the current amount of memory in use by libhtp.
A simple new function htp_memory_get_memuse() exposes this value to the application.
To make this work, all calls to 'htp_free' and 'htp_realloc' had to be changed to take the current size of the memory as an argument.
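For illustration, the accounting described above could look roughly like the following. This is a minimal sketch, not the code from the branch: it uses C11 `<stdatomic.h>` as a stand-in for the gcc atomics, and the exact signatures (in particular the position of the size arguments to `htp_realloc` and `htp_free`) are assumptions.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Bytes currently allocated through the wrappers (global, thread safe). */
static atomic_size_t htp_memuse;

void *htp_malloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr != NULL) atomic_fetch_add(&htp_memuse, size);
    return ptr;
}

void *htp_calloc(size_t nmemb, size_t size) {
    void *ptr = calloc(nmemb, size);
    /* calloc fails on overflow, so nmemb * size is safe on success. */
    if (ptr != NULL) atomic_fetch_add(&htp_memuse, nmemb * size);
    return ptr;
}

/* The caller passes the current size; the C allocator does not expose
 * it portably, which is why every call site had to change. */
void htp_free(void *ptr, size_t size) {
    if (ptr == NULL) return;
    free(ptr);
    atomic_fetch_sub(&htp_memuse, size);
}

void *htp_realloc(void *ptr, size_t old_size, size_t new_size) {
    if (new_size == 0) {
        /* realloc(ptr, 0) is implementation-defined; treat it as a free. */
        htp_free(ptr, old_size);
        return NULL;
    }
    void *new_ptr = realloc(ptr, new_size);
    if (new_ptr != NULL) {
        /* On failure the old block is untouched, so only adjust on success. */
        atomic_fetch_sub(&htp_memuse, old_size);
        atomic_fetch_add(&htp_memuse, new_size);
    }
    return new_ptr;
}

size_t htp_memory_get_memuse(void) {
    return atomic_load(&htp_memuse);
}
```

A wrong size passed to `htp_free` or `htp_realloc` silently skews the counter, which is the "easy to mess up" part.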
I think the wrappers themselves make sense, although the size arguments to free and realloc are a bit painful and easy to mess up. The counter implementation is probably a problem. Maybe we could have a callback or a set of callbacks that is called for each allocation. The memuse counter could then move into the application:
    void *htp_malloc(size_t size) {
        void *ptr = malloc(size);
        if (ptr && htp_malloc_callback)
            htp_malloc_callback(size);
        return ptr;
    }
My callback (in Suricata) would then be:
    void HtpMallocUpdate(size_t size) {
        SC_ATOMIC_ADD(libhtp_memuse, size);
    }
Or we could do a more generic update callback that pushes a signed int, so that both growth and shrinkage of memory use go through the same callback.
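A rough sketch of that generic variant, under stated assumptions: the callback type `htp_memuse_update_fn`, the setter `htp_set_memuse_update_callback` and the application-side `app_memuse_update` are hypothetical names, and a C11 atomic stands in for Suricata's `SC_ATOMIC_ADD`.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback type: delta > 0 on growth, delta < 0 on shrink. */
typedef void (*htp_memuse_update_fn)(ptrdiff_t delta);

static htp_memuse_update_fn htp_memuse_update_callback;

/* Hypothetical registration function; NULL disables notifications. */
void htp_set_memuse_update_callback(htp_memuse_update_fn fn) {
    htp_memuse_update_callback = fn;
}

static void htp_memuse_notify(ptrdiff_t delta) {
    if (htp_memuse_update_callback != NULL && delta != 0)
        htp_memuse_update_callback(delta);
}

void *htp_malloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr != NULL) htp_memuse_notify((ptrdiff_t)size);
    return ptr;
}

void htp_free(void *ptr, size_t size) {
    if (ptr == NULL) return;
    free(ptr);
    htp_memuse_notify(-(ptrdiff_t)size);
}

void *htp_realloc(void *ptr, size_t old_size, size_t new_size) {
    if (new_size == 0) {
        /* Treat a zero-size realloc as a free to keep the accounting exact. */
        htp_free(ptr, old_size);
        return NULL;
    }
    void *new_ptr = realloc(ptr, new_size);
    if (new_ptr != NULL)
        htp_memuse_notify((ptrdiff_t)new_size - (ptrdiff_t)old_size);
    return new_ptr;
}

/* Application side: one callback handles both directions. */
static atomic_llong app_memuse;

static void app_memuse_update(ptrdiff_t delta) {
    atomic_fetch_add(&app_memuse, (long long)delta);
}
```

With a single signed delta, a realloc becomes one notification instead of a shrink followed by a grow.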
Have you considered tracking memory usage on a per-connection basis? That way you don't have to deal with multithreading (and the locking overhead), and you could have a mechanism to "kill" a connection if it's using too much memory.
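To make the per-connection idea concrete, here is a minimal sketch. The `conn_mem_t` struct and the `conn_*` functions are hypothetical and not libhtp's actual connection type or API. The counter is a plain `size_t`, on the assumption that a connection is only ever touched by one thread at a time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-connection accounting state. */
typedef struct {
    size_t memuse; /* bytes currently attributed to this connection */
    size_t memcap; /* per-connection budget chosen by the application */
} conn_mem_t;

void *conn_malloc(conn_mem_t *cm, size_t size) {
    void *ptr = malloc(size);
    if (ptr != NULL) cm->memuse += size;
    return ptr;
}

void conn_free(conn_mem_t *cm, void *ptr, size_t size) {
    if (ptr == NULL) return;
    free(ptr);
    cm->memuse -= size;
}

/* The application can poll this and tear the connection down if true. */
bool conn_over_budget(const conn_mem_t *cm) {
    return cm->memuse > cm->memcap;
}
```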
I think I would prefer a combination of per-connection and global tracking. The per-connection tracking is logic that needs to live in libhtp, as each allocation needs to be tied to a connection.
The global tracking could be in the form of allocation/free callbacks, registered by the application. Then the logic of tracking, locking, etc. could be handled by the application. If no callbacks are registered, libhtp could just use malloc and friends.
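One way the registered-callback approach might look, as a sketch only: `htp_mem_callbacks_t` and `htp_register_mem_callbacks` are hypothetical names, and the `app_*` functions stand in for whatever the application would supply. The example counter is deliberately not atomic; making it thread safe would be the application's job.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical set of application-supplied allocation callbacks. */
typedef struct {
    void *(*malloc_cb)(size_t size);
    void *(*realloc_cb)(void *ptr, size_t old_size, size_t new_size);
    void (*free_cb)(void *ptr, size_t size);
} htp_mem_callbacks_t;

static htp_mem_callbacks_t htp_mem_callbacks; /* zeroed: nothing registered */

/* Hypothetical registration function; NULL restores the defaults.
 * Should be called before the library makes any allocation. */
void htp_register_mem_callbacks(const htp_mem_callbacks_t *cbs) {
    if (cbs != NULL) htp_mem_callbacks = *cbs;
    else htp_mem_callbacks = (htp_mem_callbacks_t){ NULL, NULL, NULL };
}

void *htp_malloc(size_t size) {
    if (htp_mem_callbacks.malloc_cb != NULL)
        return htp_mem_callbacks.malloc_cb(size);
    return malloc(size); /* no callback: plain malloc */
}

void *htp_realloc(void *ptr, size_t old_size, size_t new_size) {
    if (htp_mem_callbacks.realloc_cb != NULL)
        return htp_mem_callbacks.realloc_cb(ptr, old_size, new_size);
    return realloc(ptr, new_size);
}

void htp_free(void *ptr, size_t size) {
    if (htp_mem_callbacks.free_cb != NULL) htp_mem_callbacks.free_cb(ptr, size);
    else free(ptr);
}

/* Example application side: tracking lives outside the library. */
static size_t app_memuse;

static void *app_malloc(size_t size) {
    void *ptr = malloc(size);
    if (ptr != NULL) app_memuse += size;
    return ptr;
}

static void *app_realloc(void *ptr, size_t old_size, size_t new_size) {
    void *new_ptr = realloc(ptr, new_size);
    if (new_ptr != NULL) app_memuse = app_memuse - old_size + new_size;
    return new_ptr;
}

static void app_free(void *ptr, size_t size) {
    if (ptr == NULL) return;
    free(ptr);
    app_memuse -= size;
}
```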
Does that make sense to you?
Yes, I think some combination would be best. Would you prefer to monitor and enforce memory consumption in real time (e.g., when new memory is requested) or by periodically monitoring memory consumption across all connections (e.g., in a monitoring thread that runs periodically)? I think the latter approach would be faster, at an acceptable loss of absolute control over memory consumption (i.e., it'd be possible to go over the limit until the thread kicks in).
I would want it to be a real-time hard limit. This is how we keep limits for other subsystems in Suricata as well. Pretty much the allocation functions just return NULL once the limit is reached.
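As a sketch of what such a hard limit could look like on the application side (this is not Suricata's actual code; `app_memcap`, `app_set_memcap`, `app_malloc` and `app_free` are hypothetical names), the allocation first reserves its size against the cap with a compare-and-swap loop and returns NULL if the reservation would exceed it:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static atomic_size_t app_memuse; /* bytes currently reserved */
static atomic_size_t app_memcap; /* hard limit in bytes */

void app_set_memcap(size_t cap) {
    atomic_store(&app_memcap, cap);
}

/* Atomically reserve `size` bytes; fail if that would exceed the cap. */
static bool app_memuse_reserve(size_t size) {
    size_t cap = atomic_load(&app_memcap);
    size_t cur = atomic_load(&app_memuse);
    do {
        if (cur > cap || size > cap - cur) return false;
        /* On CAS failure `cur` is reloaded and the check runs again. */
    } while (!atomic_compare_exchange_weak(&app_memuse, &cur, cur + size));
    return true;
}

void *app_malloc(size_t size) {
    if (!app_memuse_reserve(size)) return NULL; /* limit reached */
    void *ptr = malloc(size);
    if (ptr == NULL) atomic_fetch_sub(&app_memuse, size); /* undo reservation */
    return ptr;
}

void app_free(void *ptr, size_t size) {
    if (ptr == NULL) return;
    free(ptr);
    atomic_fetch_sub(&app_memuse, size);
}
```

Reserving before allocating means concurrent threads cannot collectively overshoot the cap, at the cost of one atomic operation per allocation.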