How to implement hashmap_filter
Thanks for making this software, it's very useful.
Is it safe to call hashmap_delete while using hashmap_iter (or hashmap_scan)? Would this ever result in invalid pointers, or in some items being skipped by the iterator?
If it's not safe, how would you suggest implementing a "filter" function? Could it be added as a feature?
Would this ever result in invalid pointers, or in some items being skipped by the iterator?
Yes: https://github.com/tidwall/hashmap.c/blob/master/hashmap.c#L459-L461
If it's not safe, how would you suggest implementing a "filter" function? Could it be added as a feature?
If by "filter" you mean extracting a subset of a hashmap, you could iterate the hashmap and build another hashmap (or any other data structure) with the items matching your criteria instead of altering the source hashmap.
Here I am using the counter reset method. This is less efficient because it restarts the iteration every time after a delete, until the whole iteration runs without matches. But it's an option.
By "filter" I mean deleting every item that matches some condition. I don't need to extract them.
Resetting the loop each time might be the best method for now (and yours is a nice tidy way of doing it compared to what I would've written.) It would be nice to have this as a feature though.
The problem with deleting during iteration is that when the current item is deleted, that deletion operation may cause the items in the buckets to the right to move over by one. This would mean that iterator index may be off by one, thus skipping the next item. It won't always be the case, but it will happen often enough to disappoint.
One idea is having a specialized delete function that is only intended for iteration. Perhaps.... it could take the iteration index as a parameter instead of a key. Using that index it could look at the last item provided to the user, delete it and then move the index appropriately.
size_t iter = 0;
void *item;
while (hashmap_iter(map, &iter, &item)) {
const struct user *user = item;
if (!user->activated) {
hashmap_iter_delete(map, &iter);
}
}
Yeah, that'd work for me. The only other option I could think of would be if, internally, you could do multiple deletions at once without moving anything, them do all the moving-around at the end. That would potentially be more efficient as you'd never need to move any item more than once.
Have you considered the option of building a data structure additively from matching items instead of deleting items from the original? To me it seems more straightforward and less error-prone, but I don't know your use case.
The only case in which it wouldn't work is if you have a very large hashmap that you can't keep in memory along with its (partial) copy.
That would incur at least one or two new heap allocations, which doesn't seem necessary here. Your loop-resetting solution would be more efficient than that.