libqb icon indicating copy to clipboard operation
libqb copied to clipboard

maps have a terrible memory management model

Open chrissie-c opened this issue 3 years ago • 12 comments

All of the libqb maps leave 'ownership' of the memory stored in them to the caller. So the claler can easily free things that are stored in the maps without libqb being aware of it.

This happens most easily when iterators are used. libqb adds a ref to items that are used in iterators so that they do not get removed from the map when qb_map_rm() is called. HOWEVER, the caller is quite likely to free any malloced memory used by th item after callimg qb_map_rm() leaving dangling pointers.

The maps have a special callback that tells the application when it is safe to free items, but apart from being unnecessarily messy (and easy to miss or forget), it means that calling qb_map_rm() on an object doesn't necessarily remove it from the map, which could be a cause of race conditions.

Fixing this is an API change so would need a soname bump, of course.

chrissie-c avatar Jun 03 '21 08:06 chrissie-c

My first thought for fixing this is for libqb to allocate its own version of the key and data, so it would be easy for it to free them when they are finished with. Add to which we need a flag or some way of marking a entry as 'deleted' after qb_map_rm() is called so that it wouldn't appear in other iterators or to qb_map_get() calls.

This would also need adding the length of the value to qb_map_put(), and, maybe, passing it back from qb_map_get() and the iterators. Though the latter might not be necessary if we assume that callers know what they put in there ... debatable.

chrissie-c avatar Jun 03 '21 08:06 chrissie-c

glib has a similar situation with hash tables. Their solution is that tables can be created with either g_hash_table_new(), which leaves memory management to the caller, or g_hash_table_new_full(), which takes key free and value free functions as arguments. Then g_hash_table_destroy() will free either just the hash table object, or each key/value entry first then the object, depending on how it was created.

kgaillot avatar Jun 09 '21 21:06 kgaillot

But this (g_hash_table_new_full) is what is currently implemented in libqb (the QB_MAP_NOTIFY_FREE callback). Without this callback libqb behaves like g_hash_table_new. Sadly without callback it's not possible to safely delete item from hash table and free data/key until some iterator exists (I have no idea how glib solves this problem).

The main problem with libqb callback is how easy is to forget that it's absolutely needed for safe operation. Basically API is not intuitive.

jfriesse avatar Jun 10 '21 06:06 jfriesse

glib just documents it and leaves it to the caller: "Note that neither keys nor values are copied when inserted into the GHashTable, so they must exist for the lifetime of the GHashTable. This means that the use of static strings is OK, but temporary strings ... should be copied with g_strdup() before being inserted."

BTW Pacemaker doesn't use libqb maps, but we do use glib hash tables extensively, both with and without free functions. Using a table without free functions is useful for example when a map relates two existing data aggregates with the same lifetime as the map (e.g. lists A and B, and a map where A entries are keys and B entries are values, and all of them are members of the same struct).

I think it would be fine to just stress the issue more heavily in the documentation. Alternatively if you want to be stricter, you could make all functions return an error unless the map has a free function registered (an aware caller could make that function no-op, but it would have to be a conscious choice). That could still be seen as breaking ABI compatibility, though.

kgaillot avatar Jun 10 '21 14:06 kgaillot

I think it's the fact that I've been doing quite a bit of Rust recently that makes be balk (or worse) when I see pointers thrown around semi-randomly with little thought for ownership or lifetimes. IMHO it's the sort of thing that gives C a bad name for security and reliability.

chrissie-c avatar Jun 11 '21 06:06 chrissie-c

Yup, such comment would be useful. Also we may add something like g_hash_table_new_full which would add callback - that would make a job of clarity. And honestly I'm not so sure if there is better generic solution in C.

jfriesse avatar Jun 11 '21 06:06 jfriesse

Yep, this sort of thing is an endless source of memory mischief in C. Depending on what you're looking to accomplish I see a few options:

  1. Chrissie's suggestion (take pointer+length args and duplicate the memory when adding). This would have to be a shallow copy, so it could only handle contiguous blocks of memory, not e.g. a struct that contains pointers to dynamically allocated items.
  2. Honza's suggestion (replace the existing creation function with one that takes a free notification handler as an argument).
  3. A variation on Chrissie's suggestion, define a new "libqb memory object" with a void* and a reference count, and only accept these as map entries. It could still only hold contiguous memory, but it would avoid copying -- callers would allocate and dereference them with libqb functions that would keep the memory until there were no references.
  4. A variation on Honza's suggestion, make the existing map functions return an error if a free notification handler hasn't been registered yet. (This is the only option that would require an immediate soname bump, the rest could add new functions and deprecate the old ones to delay the bump until a convenient time.)
  5. Deprecate the entire map code, and switch callers to glib or similar. That wouldn't necessarily eliminate the problem, but makes it not libqb's problem. :)
  6. Rewrite the cluster stack in Rust. Not a bad idea, but a bit intimidating. :)

kgaillot avatar Jun 11 '21 15:06 kgaillot

I am very much in favour of 5 or 6 ;-)

chrissie-c avatar Jun 14 '21 06:06 chrissie-c

I have one question (before trying to find it in rust/glib docs).

In corosync we use prefix callbacks (ICMAP_TRACK_PREFIX which is converted to libqb QB_MAP_NOTIFY_RECURSIVE) intensively (historically reason is objdb which allowed similar functionality, just on tree/subtree object, not on arbitrary string). By prefix callback I mean feature to register callback for string prefix and callback is delivered if anything in prefix changes. So for example we register "nodelist.", "quorum.", "totem." and if anything changes (so for example key "quorum.two_node") callback is triggered.

This is quite unique feature and not sure if either rust or glib is implementing that.

jfriesse avatar Jun 14 '21 06:06 jfriesse

I have one question (before trying to find it in rust/glib docs).

In corosync we use prefix callbacks (ICMAP_TRACK_PREFIX which is converted to libqb QB_MAP_NOTIFY_RECURSIVE) intensively (historically reason is objdb which allowed similar functionality, just on tree/subtree object, not on arbitrary string). By prefix callback I mean feature to register callback for string prefix and callback is delivered if anything in prefix changes. So for example we register "nodelist.", "quorum.", "totem." and if anything changes (so for example key "quorum.two_node") callback is triggered.

This is quite unique feature and not sure if either rust or glib is implementing that.

Good point. glib doesn't have that, or any change callbacks at all.

It would be straightforward for corosync to implement glib wrappers that would check for prefixes, if we don't mind making the capability private to corosync.

kgaillot avatar Jun 14 '21 13:06 kgaillot

I have one question (before trying to find it in rust/glib docs). In corosync we use prefix callbacks (ICMAP_TRACK_PREFIX which is converted to libqb QB_MAP_NOTIFY_RECURSIVE) intensively (historically reason is objdb which allowed similar functionality, just on tree/subtree object, not on arbitrary string). By prefix callback I mean feature to register callback for string prefix and callback is delivered if anything in prefix changes. So for example we register "nodelist.", "quorum.", "totem." and if anything changes (so for example key "quorum.two_node") callback is triggered. This is quite unique feature and not sure if either rust or glib is implementing that.

Good point. glib doesn't have that, or any change callbacks at all.

Thats good to know

It would be straightforward for corosync to implement glib wrappers that would check for prefixes, if we don't mind making the capability private to corosync.

straightforward yes, but probably pretty slow ;) QB trie is really nicely implemented in that respect (it's not "linked" list walking but trigger is stored in the element).

jfriesse avatar Jun 14 '21 13:06 jfriesse

It would be straightforward for corosync to implement glib wrappers that would check for prefixes, if we don't mind making the capability private to corosync.

straightforward yes, but probably pretty slow ;) QB trie is really nicely implemented in that respect (it's not "linked" list walking but trigger is stored in the element).

True, libqb has the advantage here :)

kgaillot avatar Jun 14 '21 14:06 kgaillot