weave
weave copied to clipboard
[WIP] Workaround to fix Kernel bug related ipset entry deletion
if the kernel version is in affected range of Kernels, then resync the entries to expected set of entries.
https://bugzilla.netfilter.org/show_bug.cgi?id=1119
ipset (v6.30, v6.29, v6.25.1, but not v6.21.1) hash code is sometimes evicting (or bumping) as a
side-effect other entries in the set upon entry deletion (ipset del). The symptom of this is that you
get the error "ipset v6.30: Element cannot be deleted from the set: it's not added" when deleting an
entry that has not yet been legitimately deleted. The problem happens with a large number of entries;
in some cases I've seen it with under 700 entries but typically it takes over 1,000 entries. I've seen
one, two, even three or four entries evicted on a deletion.
Fixes #3296 failed: ipset v6.32: Element cannot be deleted from the set: it's not added
Kernel versions are a poor indicator as to what functionality/bugs are present, since some distros aggressively backport changes.
Kernel versions are a poor indicator as to what functionality/bugs are present, since some distros aggressively backport changes.
Ok, i will check if there is more reliable way. Nevertheless adding this workaround will add little overhead if Kernel is already has the fix and will not cause any side-affects.
If I read it right, you are triggering off the error message we saw in #3296, but the bug will hit earlier. We need to check after every delete - maybe not straight after, maybe have a timer so we batch up multiple deletes to the same set.
Is it still WIP (as the tittle suggests) or is it ready for a review?
@brb Key logic to detect if the Kernel has the bug is not reliable. As pointed out Kernel version is not the best way to figure if it has ipset bug. Do you have any better way to reliably figure if Kernel is affected with ipset issue?
I wouldn't bother with checking the kernel version, and I would enable the safe-delete for all.
Also, as we already bookkeeping all ipset elements in NPC, wouldn't it better to compare against the ones in NPC instead of ipset list <...>?
I wouldn't bother with checking the kernel version, and I would enable the safe-delete for all.
@brb agree, its little bit of overhead but I can not think of any other way to find if the kernel is affected with the bug
Also, as we already bookkeeping all ipset elements in NPC, wouldn't it better to compare against the ones in NPC instead of ipset list <...>?
Could you point me where bookkeeping of ipset elements is done? My intent is since its helper function, we don't know who the consumers are if they do any bookkeeping etc. Ideally would like to contain the changes with in the helper function without taking any help from the consumer of this utility library.
E.g. https://github.com/weaveworks/weave/blob/master/npc/selector.go#L170. I'm suggesting to always sync with the tracked by NPC ipsets as these are the source of truth.