
ads: support envoy filter local ratelimit.

Open yuanqijing opened this issue 1 year ago • 4 comments

What type of PR is this? support local rate limit.

What this PR does / why we need it:

Which issue(s) this PR fixes: Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


yuanqijing avatar Sep 15 '24 10:09 yuanqijing

Codecov Report

Attention: Patch coverage is 88.57143% with 4 lines in your changes missing coverage. Please review.

Project coverage is 52.17%. Comparing base (1f2940b) to head (6876096). Report is 130 commits behind head on main.

Files with missing lines                           Patch %             Lines
pkg/controller/ads/cache.go                        0.00%               4 Missing ⚠️

Files with missing lines                           Coverage Δ
pkg/controller/ads/extensions/local_ratelimit.go   100.00% <100.00%>   (ø)
pkg/utils/test/bpf_map.go                          34.04% <100.00%>    (-3.11%) ↓
pkg/controller/ads/cache.go                        47.94% <0.00%>      (+0.39%) ↑

... and 69 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update e5b8028...6876096.

codecov[bot] avatar Sep 15 '24 10:09 codecov[bot]

First we need a proposal to better understand this implementation.

Hi, I have created a proposal PR in #873.

yuanqijing avatar Sep 19 '24 13:09 yuanqijing

@hzxuzhonghu Hi, rate limit uses the same way to cut TCP connections as the circuit breaker, so it can be merged after #570.

yuanqijing avatar Sep 19 '24 14:09 yuanqijing

lgtm, and your test report (Google doc) can be pasted here to show your test results @yuanqijing

tacslon avatar Oct 11 '24 08:10 tacslon

lgtm, and your test report (Google doc) can be pasted here to show your test results @yuanqijing

Hi, I have added a Google doc here. This PR might only take effect after #570 is merged. Currently, the corresponding log output can be observed in the code. I manually merged the disconnection handling capability from #570 and tested it locally.

yuanqijing avatar Oct 16 '24 08:10 yuanqijing

@yuanqijing Does it support HTTP rate limit?

hzxuzhonghu avatar Oct 17 '24 01:10 hzxuzhonghu

@yuanqijing Does it support HTTP rate limit?

No, it only supports TCP connection rate limiting, as described in proposal #873.

yuanqijing avatar Oct 17 '24 04:10 yuanqijing

@tacslon Ptal

hzxuzhonghu avatar Oct 18 '24 02:10 hzxuzhonghu

lgtm, please resolve the conflict @yuanqijing

tacslon avatar Oct 21 '24 01:10 tacslon

Go test:

bpf_map.go:40: Failed to mount /mnt/kmesh_cgroup2/: device or resource busy

This caused TestServer_getAndSetBpfLevel to fail, which was weird.

yuanqijing avatar Oct 21 '24 06:10 yuanqijing

@tacslon Conflict addressed. One note: I added bpf.CleanupBpfMap() in InitBpfMap and the tests passed. Should I create another PR to fix it?

func InitBpfMap(t *testing.T, config options.BpfConfig) (CleanupFn, *bpf.BpfLoader) {
+	bpf.CleanupBpfMap()

	err := os.MkdirAll("/mnt/kmesh_cgroup2", 0755)
        ...
}
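
Roughly, the cleanup has to undo what a previous run left behind, along these lines (illustrative only; the real bpf.CleanupBpfMap may do more):

import (
	"os"
	"syscall"
)

// cleanupBpfEnv tears down what an earlier InitBpfMap run left behind, so the
// cgroup2 mount does not fail with "device or resource busy" on the next run.
func cleanupBpfEnv() {
	_ = syscall.Unmount("/mnt/kmesh_cgroup2", 0) // ignore the error if nothing is mounted
	_ = os.RemoveAll("/mnt/kmesh_cgroup2")       // remove the mount point so MkdirAll starts clean
}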

yuanqijing avatar Oct 21 '24 10:10 yuanqijing

@tacslon Conflict addressed. One note: I added bpf.CleanupBpfMap() in InitBpfMap and the tests passed. Should I create another PR to fix it?

func InitBpfMap(t *testing.T, config options.BpfConfig) (CleanupFn, *bpf.BpfLoader) {
+	bpf.CleanupBpfMap()

	err := os.MkdirAll("/mnt/kmesh_cgroup2", 0755)
        ...
}

No problem for me, ptal @hzxuzhonghu

tacslon avatar Oct 22 '24 03:10 tacslon

We also need to delete kmesh_ratelimit whenever a pod is deleted, otherwise it causes a memory leak.

hzxuzhonghu avatar Oct 23 '24 01:10 hzxuzhonghu

We also need to delete kmesh_ratelimit whenever a pod is deleted, otherwise it causes a memory leak.

Thanks, you're right.

Should I create a global var (option 1) or modify the NewCache function signature (option 2), like this commit does?

// GlobalBpfAds is the shared singleton instance; named so it does not clash with the BpfAds type.
var (
	GlobalBpfAds *BpfAds
	once         sync.Once
)

// NewBpfAds creates or returns the existing BpfAds instance, ensuring that only one instance is created (singleton).
func NewBpfAds(cfg *options.BpfConfig) (*BpfAds, error) {
	var err error
	once.Do(func() {
		GlobalBpfAds = &BpfAds{}
		err = GlobalBpfAds.SockConn.NewBpf(cfg)
		if err != nil {
			GlobalBpfAds = nil
		}
	})
	return GlobalBpfAds, err
}

Option 1: easy to implement and no impact on existing code, but exposing a global variable might introduce risks. Option 2: affects the interface. Alternatively, add a new function, like NewCacheWithConfig(cfg *CacheConfig), to avoid breaking changes.
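
For illustration, option 2 could look roughly like this (a sketch: CacheConfig and the bpfAds field are placeholders; Cache, NewCache and BpfAds are the existing types):

type CacheConfig struct {
	BpfAds *BpfAds // handle later used to clean kmesh_ratelimit entries when a pod is deleted
}

// NewCacheWithConfig leaves the existing NewCache() untouched, so only callers
// that need the BPF handle have to change.
func NewCacheWithConfig(cfg *CacheConfig) *Cache {
	c := NewCache()
	if cfg != nil {
		c.bpfAds = cfg.BpfAds // hypothetical field on Cache
	}
	return c
}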

yuanqijing avatar Oct 23 '24 05:10 yuanqijing

We also need to delete kmesh_ratelimit whenever a pod is deleted, otherwise it causes a memory leak.

Thanks, you're right.

Should I create a global var (option 1) or modify the NewCache function signature (option 2), like this commit does?

// GlobalBpfAds is the shared singleton instance; named so it does not clash with the BpfAds type.
var (
	GlobalBpfAds *BpfAds
	once         sync.Once
)

// NewBpfAds creates or returns the existing BpfAds instance, ensuring that only one instance is created (singleton).
func NewBpfAds(cfg *options.BpfConfig) (*BpfAds, error) {
	var err error
	once.Do(func() {
		GlobalBpfAds = &BpfAds{}
		err = GlobalBpfAds.SockConn.NewBpf(cfg)
		if err != nil {
			GlobalBpfAds = nil
		}
	})
	return GlobalBpfAds, err
}

Option 1: easy to implement and no impact on existing code, but exposing a global variable might introduce risks. Option 2: affects the interface. Alternatively, add a new function, like NewCacheWithConfig(cfg *CacheConfig), to avoid breaking changes.

Introducing a global variable may not be a good choice; option 2 (adding a new function) should be better.

tacslon avatar Oct 23 '24 08:10 tacslon

We also need to delete kmesh_ratelimit whenever a pod is deleted, otherwise it causes a memory leak.

@hzxuzhonghu Maybe this PR can be merged first, and then @yuanqijing creates another PR to fix this issue?

tacslon avatar Oct 24 '24 11:10 tacslon

@yuanqijing I think both @nlgwcy and I are concerned that this is a per-node rate limit, while with Istio-generated xDS it is per pod. https://github.com/kmesh-net/kmesh/pull/859#discussion_r1810778940

hzxuzhonghu avatar Oct 28 '24 02:10 hzxuzhonghu

@yuanqijing I think both @nlgwcy and I are concerned that this is a per-node rate limit, while with Istio-generated xDS it is per pod. #859 (comment)

Yes, this is a per-node rate limit; it does not limit connection rates per pod. It rate limits downstream pods to the upstream service. The document says:

Envoy will call the rate limit service for every new connection on the listener where the filter is installed. The configuration specifies a specific domain and descriptor set to rate limit on.

It seems the rate limit does not specify which downstream pods should be processed; I don't get it. I understand that traditional rate limiting can restrict each pod, which makes sense. However, this might not align with Envoy's documentation.

yuanqijing avatar Oct 28 '24 05:10 yuanqijing

@yuanqijing I think both @nlgwcy and I are concerned that this is a per-node rate limit, while with Istio-generated xDS it is per pod. #859 (comment)

Regarding this point

while with Istio-generated xDS it is per pod

I don't quite understand which rate limit configuration case the statement refers to. Could you provide a specific example for discussion?

yuanqijing avatar Oct 28 '24 05:10 yuanqijing

In Kmesh we are a little different: we can only rate limit on the client side for now. So take the example here, though it is HTTP.

https://istio.io/latest/docs/tasks/policy-enforcement/rate-limit/#local-rate-limit

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-local-ratelimit-svc
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      app: productpage
  configPatches:
    - applyTo: NETWORK_FILTER
      match:
        context: SIDECAR_OUTBOUND

It should be applied to each client instance, not to all the instances on a node.

hzxuzhonghu avatar Oct 28 '24 06:10 hzxuzhonghu

In Kmesh we are a little different: we can only rate limit on the client side for now. So take the example here, though it is HTTP.

https://istio.io/latest/docs/tasks/policy-enforcement/rate-limit/#local-rate-limit

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-local-ratelimit-svc
  namespace: istio-system
spec:
  workloadSelector:
    labels:
      app: productpage
  configPatches:
    - applyTo: NETWORK_FILTER
      match:
        context: SIDECAR_OUTBOUND

It should be applied to each client instance, not to all the instances on a node.

Understood, so the key point here is whether instances on a node share a token bucket. Option 1: client_A->Service_C: 5 tokens, client_B->Service_C: 6 tokens. Option 2: Service_C: 11 tokens.
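
A toy sketch of the two keying schemes (all names here are made up for the example):

type bucketKey struct {
	clientNetns uint64 // set per downstream pod in option 1, zero in option 2
	dstService  string
}

func tokenBucketOptions() (perClient, shared map[bucketKey]int) {
	// Option 1 (per client): client_A and client_B each hold their own tokens.
	perClient = map[bucketKey]int{
		{clientNetns: 0xA, dstService: "service-c"}: 5,
		{clientNetns: 0xB, dstService: "service-c"}: 6,
	}
	// Option 2 (shared): every client of service C drains the same bucket of 11 tokens.
	shared = map[bucketKey]int{
		{dstService: "service-c"}: 11,
	}
	return perClient, shared
}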

Envoy uses option 2, but in Kmesh we use option 1? Is my understanding correct?

yuanqijing avatar Oct 28 '24 06:10 yuanqijing

Envoy uses option 2, but in Kmesh we use option 1? Is my understanding correct?

Envoy can use both; in Kmesh we can only use option 1.

hzxuzhonghu avatar Oct 28 '24 07:10 hzxuzhonghu

Envoy uses option 2, but in Kmesh we use option 1? Is my understanding correct?

Envoy can use both; in Kmesh we can only use option 1.

I have added a new field to distinguish different pods:

struct ratelimit_key {
    union {
        struct {
            __u64 netns;  /* Network namespace. */
            __u32 ipv4;   /* Destination IPv4 address. */
            __u32 port;   /* Destination port. */
            __u32 family; /* Address family (e.g., AF_INET) */
        } sk_skb;
    } key;
};
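
With a key like this, the memory-leak concern above turns into deleting every entry that carries the deleted pod's netns. A rough user-space sketch using cilium/ebpf (the pin path, value layout and helper name are assumptions, not the actual implementation):

import (
	"log"

	"github.com/cilium/ebpf"
)

// ratelimitKey mirrors the ratelimit_key struct above.
type ratelimitKey struct {
	Netns  uint64 // network namespace of the downstream pod
	IPv4   uint32 // destination IPv4 address
	Port   uint32 // destination port
	Family uint32 // address family
	_      uint32 // padding so the size matches the kernel-side struct
}

// ratelimitValue is a placeholder layout; the real token-bucket value may differ.
type ratelimitValue struct {
	LastTopup uint64
	Tokens    uint64
}

// cleanupRatelimitForPod removes every bucket created by the pod with the given netns.
func cleanupRatelimitForPod(pinPath string, netns uint64) error {
	m, err := ebpf.LoadPinnedMap(pinPath, nil)
	if err != nil {
		return err
	}
	defer m.Close()

	var (
		key   ratelimitKey
		value ratelimitValue
		stale []ratelimitKey
	)
	iter := m.Iterate()
	for iter.Next(&key, &value) {
		if key.Netns == netns {
			stale = append(stale, key) // collect first, delete after iterating
		}
	}
	if err := iter.Err(); err != nil {
		return err
	}
	for _, k := range stale {
		if err := m.Delete(&k); err != nil {
			log.Printf("delete ratelimit entry: %v", err)
		}
	}
	return nil
}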

yuanqijing avatar Oct 28 '24 08:10 yuanqijing

@hzxuzhonghu Hi, I think there are 2 pending issues:

  1. When cleaning up the listener, also clean up the ratelimit configuration. Can I create another PR to fix this?
  2. The concurrent issue here may not have been fully discussed yet.

yuanqijing avatar Oct 28 '24 09:10 yuanqijing

  1. When cleaning up the listener, also clean up the ratelimit configuration. Can I create another PR to fix this?

Yes, I think it's OK.

  2. The concurrent issue here may not have been fully discussed yet.

@nlgwcy suggested using atomic updates

hzxuzhonghu avatar Oct 28 '24 09:10 hzxuzhonghu

  1. When cleaning up the listener, also clean up the ratelimit configuration. Can I create another PR to fix this?

Yes, I think it's OK.

  2. The concurrent issue here may not have been fully discussed yet.

@nlgwcy suggested using atomic updates

Suggested atomic updates are all addressed: https://github.com/kmesh-net/kmesh/pull/859#discussion_r1810771130

yuanqijing avatar Oct 28 '24 10:10 yuanqijing

/lgtm

nlgwcy avatar Oct 31 '24 11:10 nlgwcy

New changes are detected. LGTM label has been removed.

kmesh-bot avatar Oct 31 '24 12:10 kmesh-bot

/retest

lec-bit avatar Oct 31 '24 12:10 lec-bit

/approve

nlgwcy avatar Oct 31 '24 12:10 nlgwcy