ebpf
ebpf copied to clipboard
introducing Map.Drain API for traversing maps while removing entries
Dear community, with this PR I would like to introduce the Map.Drain
API to traverse a map while also removing its entries.
The current MapIterator.Next
traverse a map without removing the entries. For some use cases, it might be useful to have this API so that maps such as Hash can be freed during the lookup.
In addition, the MapIterator.Next
would normally fail with maps of type Queue or Stack, for two main reasons:
- these maps do not have kernel primitives to traverse their content w/o removing, only support for LookupAndDelete (Lookup for Queue/Stack acts as a peek method and return the head);
- these maps are keyless, hence need separate support.
The proposed solution reuses the MapIterator
structure and as much code as possible: the Map.Drain
API return an iterator with a special flag set, specifying that the retrieved entries must be deleted as well.
To support different kernel versions in which the support for LookupAndDelete
system call is not equal among the different maps (e.g., queues and hash support for this has been introduced in different commit),
this implementation introduces a fallback method that would leverage the sequential Lookup
->Delete
operations in case LookupAndDelete
for that specific map is not supported.
Please any feedback is really welcome, don't hesitate ping me :smiley:
Thank you very much for putting in the work. Though I would like to request some changes.
Many thanks again Dylan for the feedback!
Thanks to all the reviewers for the feedback. I guess the only thing left to decide is whether to proceed with one of the two proposed solutions:
- Creating a simple method
pop
that would be called instead ofnext
, as for the solution proposed in the previous commit. - Creating an interface to support multiple MapIterators, as for the current commit under review.
Have you had the time to look into my questions in https://github.com/cilium/ebpf/pull/1349#pullrequestreview-1951908908 ? I think they will help us decide what the implementation looks like.
Have you had the time to look into my questions in #1349 (review) ? I think they will help us decide what the implementation looks like.
@lmb I apologize, I missed the message.
- According to https://docs.kernel.org/bpf/map_queue_stack.html and some quick tests of the code, Queues/Stack maps support the following operations.
- lookup (peek): returns the value at the head without removing it. If no value exists,
ErrKeyNotExist
is returned. - update (push)
- delete (pop)
In addition,
batch operations
are not supported. Also,NextKey
is not supported: it either returnscan't marshal key: <type> doesn't marshal to 0 bytes
ornext key: invalid argument
when called withnil
(these maps have a key of size 0).
- lookup (peek): returns the value at the head without removing it. If no value exists,
- On the other hand, other maps like
Array
andHash
supportLookupAndDelete
, and the method requires a non-nullkey
as a parameter.
Queues/Stack maps support the following operations.
Clarification, for stacks/queues:
BPF_MAP_LOOKUP_ELEM
-> peek
BPF_MAP_LOOKUP_AND_DELETE_ELEM
-> pop
BPF_MAP_UPDATE_ELEM
-> push
On the other hand, other maps like Array and Hash support LookupAndDelete, and the method requires a non-null key as a parameter.
Only hash maps(and variants: LRU, PER_CPU, LRU_PER_CPU) have support for BPF_MAP_LOOKUP_AND_DELETE_ELEM
since v5.14. Array maps don't seem to support it since you can't delete from an array map.
Also interesting is that BPF_MAP_LOOKUP_AND_DELETE_BATCH
which is the batch version is only supported by hash maps and not by stacks or queues and was added before the non-batch version was supported in v5.6.
Are there ones which support passing a Key argument?
Yes, the hashmap supports passing a key value. Actually, even before v5.14, a valid pointer must be provided to the key field. The kernel always checks it and copies it into kernel memory, just to discard it. After v5.14 it of course starts to actually get passed to the map implementation.
Okay! That was very helpful, thanks to both of you. The peek operations only return the top of the stack / first item in the queue? And currently doing .Iterate()
on a queue or stack doesn't work at all since we pass a key?
I'm currently leaning towards adding a new method Map.Drain() *MapIterator
. Initially this could only support queues / stacks but later on we may be able to extend it to hash maps. The benefit of a new method is that it makes it clearer that we're actually deleting items here. We can put the implementation into the same MapIterator
struct for now as Timo suggested. I think that we should take MapIterator.Next(nil, &value)
when draining (and not ignore the key argument as currently done.)
The peek operations only return the top of the stack / first item in the queue? And currently doing .Iterate() on a queue or stack doesn't work at all since we pass a key?
Yes, peeking only returns the top of the stack. Any key provided will be ignored as far as I understand. But as @s41m0n mentioned the BPF_MAP_GET_NEXT_KEY
op would return an error if using the normal iterator.
Thanks for the info-gathering and brainstorming. I just pushed a possible implementation of what has been discussed. Let me know your opinions :)