deployments-k8s

Re-calculate limits with k8s/vertical-pod-autoscaler

Open denis-tingaikin opened this issue 2 years ago • 1 comments

Our first limits were set in https://github.com/networkservicemesh/deployments-k8s/pull/727

That was a good first step, but it no longer holds up: we still see problems with the limits on different systems.

Solution

Re-calculate limits with https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler
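A minimal sketch of how recommendation-only VPA objects might be generated, one per NSM component. The `autoscaling.k8s.io/v1` fields come from the VPA API; the workload names and kinds, the `nsm-system` namespace, and the helpers `build_vpa` and `render_all` are assumptions for illustration, not taken from this repository. `updateMode: "Off"` tells VPA to compute recommendations without evicting or resizing pods.

```python
import json

# Assumed workloads to observe; adjust names and kinds to the actual deployment.
TARGETS = [
    ("nsmgr", "DaemonSet"),
    ("forwarder-vpp", "DaemonSet"),
    ("registry-k8s", "Deployment"),
]


def build_vpa(name, kind, namespace="nsm-system"):
    """Return a VerticalPodAutoscaler manifest that only records recommendations."""
    return {
        "apiVersion": "autoscaling.k8s.io/v1",
        "kind": "VerticalPodAutoscaler",
        "metadata": {"name": f"{name}-vpa", "namespace": namespace},
        "spec": {
            "targetRef": {"apiVersion": "apps/v1", "kind": kind, "name": name},
            # "Off" = recommend only, never evict or resize pods.
            "updatePolicy": {"updateMode": "Off"},
        },
    }


def render_all():
    """Wrap one VPA per target into a single List object."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [build_vpa(name, kind) for name, kind in TARGETS],
    }


if __name__ == "__main__":
    print(json.dumps(render_all(), indent=2))
```

The JSON output can be piped to `kubectl apply -f -` on a cluster where the VPA components are installed; the recommendations then show up in each object's `status` once enough usage has been observed.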

denis-tingaikin avatar Apr 14 '22 17:04 denis-tingaikin

@edwarnicke Can we schedule this for v1.4.0?

denis-tingaikin avatar Apr 18 '22 20:04 denis-tingaikin

This is still very relevant. Many users run into limits that are too low.

/cc @glazychev-art

denis-tingaikin avatar Dec 04 '22 23:12 denis-tingaikin

Subtasks

  • [ ] Check NSMgr limits
  • [ ] Check Forwarder limits
  • [ ] Check Registry limits

NikitaSkrynnik avatar Dec 13 '22 10:12 NikitaSkrynnik

@edwarnicke

We have started working on this because many customers report limits that are too low on some components.

denis-tingaikin avatar Dec 13 '22 15:12 denis-tingaikin

Example of the vertical-pod-autoscaler at work:

image

denis-tingaikin avatar Dec 13 '22 15:12 denis-tingaikin

Subtasks

  • [x] Check find request performance in registry-k8s ~ 3h
  • [x] Check NSM component limits on 10 NSC and NSE ~ 2h
  • [x] Check NSM component limits on 30 NSC and NSE ~ 2h
  • [x] Check NSM component limits on 60 NSC and NSE ~ 2h
  • [x] Test ping time from NSC to NSE for 10, 30, 60 NSC and NSE ~ 2h
  • [x] Test 10, 30, 60 NSC and NSE with disabled dataplane healing ~ 3h
  • [x] Test NSCs and NSEs on one node ~ 2h

NikitaSkrynnik avatar Dec 22 '22 02:12 NikitaSkrynnik

Recommendations from VPA for 10, 20, 30, 40 NSCs and NSEs (Kernel2Ethernet2Kernel example, dataplane healing disabled):

image
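One way such recommendations could be turned into manifest values, as a rough sketch: read the VPA object (for example from `kubectl get vpa <name> -o json`) and map each container's `target` to `requests` and its `upperBound` to `limits`. That mapping is an assumed policy, not something decided in this issue, `suggest_resources` is a hypothetical helper, and the sample numbers are made up rather than taken from the measurements here.

```python
def suggest_resources(vpa):
    """Map VPA container recommendations to requests/limits suggestions.

    Assumed policy: requests = target, limits = upperBound.
    """
    recs = (
        vpa.get("status", {})
        .get("recommendation", {})
        .get("containerRecommendations", [])
    )
    return {
        rec["containerName"]: {
            "requests": dict(rec["target"]),
            "limits": dict(rec["upperBound"]),
        }
        for rec in recs
    }


# Made-up sample in the shape of a VPA status; NOT real measurements.
sample_vpa = {
    "status": {
        "recommendation": {
            "containerRecommendations": [
                {
                    "containerName": "nsmgr",
                    "lowerBound": {"cpu": "100m", "memory": "64Mi"},
                    "target": {"cpu": "200m", "memory": "100Mi"},
                    "upperBound": {"cpu": "400m", "memory": "200Mi"},
                }
            ]
        }
    }
}

if __name__ == "__main__":
    print(suggest_resources(sample_vpa))
```

A VPA that has not yet produced a recommendation has no `status.recommendation` field, in which case the function returns an empty dict instead of failing.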

40 NSCs and NSEs is the largest setup in which all 40 pings succeed. If we deploy 50 NSCs and NSEs, some pings stop working even though all requests were successful.

@edwarnicke Should we investigate the issue with 50 NSCs and NSEs now?

Here are logs from NSM with 50 NSCs and NSEs: log.zip

NikitaSkrynnik avatar Dec 27 '22 10:12 NikitaSkrynnik