FlushSet is unreliable, occasionally elements remain in the set

Open marten-seemann opened this issue 9 months ago • 2 comments

I'm running into occasional errors when working with sets and maps. I narrowed this down to the FlushSet function, which sometimes doesn't actually flush the set: the set is sometimes still non-empty afterwards.

It usually works as expected, but in roughly 5-10% of runs the existing elements remain in the set.

I observed this behavior with both v0.3.0 and the current state of the master branch. Kernel information:

❯ uname -a
Linux linode-sg 5.15.0-131-generic #141-Ubuntu SMP Fri Jan 10 21:18:28 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Here's a minimal example to reproduce the bug. It does the following:

  1. It creates a table and a set.
  2. Adds some elements to the set.
  3. Flushes the set using FlushSet.
  4. Checks that there are 0 elements remaining.

package loadbalancer

import (
	"testing"

	"github.com/google/nftables"
	"github.com/google/nftables/binaryutil"

	"github.com/stretchr/testify/require"
)

func TestFlushSet(t *testing.T) {
	conn := &nftables.Conn{}
	table := &nftables.Table{
		Name:   "test_table",
		Family: nftables.TableFamilyIPv4,
	}
	conn.AddTable(table)

	t.Cleanup(func() {
		conn.DelTable(table)
		require.NoError(t, conn.Flush(), "failed to delete table")
	})

	set := &nftables.Set{
		Name:    "test_set",
		Table:   table,
		IsMap:   false,
		KeyType: nftables.TypeInetService, // 2-byte key (e.g., ports)
	}
	require.NoError(t, conn.AddSet(set, nil), "failed to add set")

	elements := []nftables.SetElement{
		{Key: binaryutil.BigEndian.PutUint16(80)},   // Port 80
		{Key: binaryutil.BigEndian.PutUint16(443)},  // Port 443
		{Key: binaryutil.BigEndian.PutUint16(8080)}, // Port 8080
	}
	require.NoError(t, conn.SetAddElements(set, elements), "failed to add elements")
	require.NoError(t, conn.Flush(), "failed to flush added elements")
	conn.FlushSet(set)
	require.NoError(t, conn.Flush(), "failed to flush the flush")

	// check that the set is empty
	elems, err := conn.GetSetElements(set)
	require.NoError(t, err, "failed to get set elements after flush")
	require.Empty(t, elems, "set not empty after flush")
}

marten-seemann avatar Mar 07 '25 03:03 marten-seemann

cc @twitchyliquid64 / @twitchy-jsonp who originally contributed set support back in 2019

stapelberg avatar Mar 20 '25 15:03 stapelberg

I had a look into this but wasn't able to reproduce the issue. I am running kernel 6.12.10-76061203-generic but I also tried Ubuntu 22 (5.15.0-143-generic) on Vagrant.

I recently came across this discussion, so I am wondering whether it's related?

To test this theory I wrote a test that creates a large set and flushes it repeatedly (100 times in total). The test took about 22 minutes to run on my machine, but I still wasn't able to trigger a failure.

nickgarlis avatar Jul 27 '25 10:07 nickgarlis