kubernetes icon indicating copy to clipboard operation
kubernetes copied to clipboard

Reuse the http request object for http probes

Open SergeyKanzhelev opened this issue 2 years ago • 20 comments

Today we construct the request object for every http request for the http probes. This involves concatenating some strings, setting headers, etc.

Probes are simple http get requests that are executed synchronously (for a single probe) and has no request body.

The optimization may be to reuse the request object for all probe executions. This likely save some cpu and memory for the kubelet process.

Some code references:

  • DoProbe is executed today on each probe execution.
  • New request are being created for each execution.
  • New object will need to be stored in worker that for http probes will hold the http request object. This object will need to encapsulate the logic of runProbe method of the prober.

I'm marking this issue as a good first issue, as the issue doesn't require deep knowledge of Kubernetes. However it is not an easy issue, a few parts of code will be affected and change needs to be done very carefully.

Please review code and make sure you understand the task before self-assigning the issue.

Implementation notes

  1. If you can send the small benchmark results on probes with and without this optimization, it will be a good starting point. Note, for the benchmark, each probe can run once a second, not more often. But the number of probes is virtually unlimited since we are not defining the limit on number of containers per Pod. Realistically we are talking about 110 pods (max) with maybe 2 containers each, with 3 probes each (660 probes).
  2. Code change may be big. So make sure to help reviewers to review the code. One idea may be to split implementation into multiple commits to simplify the review. First, creating the object encapsulating the logic of runProbe, second storing and initializing this object in worker, and third, actual reusing of the request object.

/sig node /good-first-issue /help-wanted /area kubelet

SergeyKanzhelev avatar Feb 21 '23 23:02 SergeyKanzhelev

@SergeyKanzhelev: This request has been marked as suitable for new contributors.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

In response to this:

Today we construct the request object for every http request for the http probes. This involves concatenating some strings, setting headers, etc.

Probes are simple http get requests that are executed synchronously (for a single probe) and has no request body.

The optimization may be to reuse the request object for all probe executions. This likely save some cpu and memory for the kubelet process.

Some code references:

  • DoProbe is executed today on each probe execution.
  • New request are being created for each execution.
  • New object will need to be stored in worker that for http probes will hold the http request object. This object will need to encapsulate the logic of runProbe method of the prober.

I'm marking this issue as a good first issue, as the issue doesn't require deep knowledge of Kubernetes. However it is not an easy issue, a few parts of code will be affected and change needs to be done very carefully.

Please review code and make sure you understand the task before self-assigning the issue.

Implementation notes

  1. If you can send the small benchmark results on probes with and without this optimization, it will be a good starting point. Note, for the benchmark, each probe can run once a second, not more often. But the number of probes is virtually unlimited since we are not defining the limit on number of containers per Pod. Realistically we are talking about 110 pods (max) with maybe 2 containers each, with 3 probes each (660 probes).
  2. Code change may be big. So make sure to help reviewers to review the code. One idea may be to split implementation into multiple commits to simplify the review. First, creating the object encapsulating the logic of runProbe, second storing and initializing this object in worker, and third, actual reusing of the request object.

/sig node /good-first-issue /help-wanted /area kubelet

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Feb 21 '23 23:02 k8s-ci-robot

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Feb 21 '23 23:02 k8s-ci-robot

Hi @SergeyKanzhelev! I'm a new contributor and would like to contribute to the project.

Since this is a good first issue, I suppose this would be a good starting point for contributing?

I would like to take this up.

Could you help me out in getting started with this if so?

XDRAGON2002 avatar Feb 22 '23 05:02 XDRAGON2002

Hello @SergeyKanzhelev @thockin I'm a beginner to open-source and want to work on this good-first-issue Can you please guide me on how to proceed with this I've gone through the Contributing.md guidelines and have basic programming knowledge in Golang

VishalPraneeth avatar Feb 23 '23 05:02 VishalPraneeth

@XDRAGON2002, @VishalPraneeth thank you for your interest. Please look at code references I posted in original issue for a good starting points to understand the task.

SergeyKanzhelev avatar Feb 23 '23 07:02 SergeyKanzhelev

I completely understand the issue. And you have explained it very well. Till now I have contributed in 4 prs in kubernetes in which 2 were merged. I know golang but still I am afraid that I dont know how can I start? Can you please help me with that.

Sajiyah-Salat avatar Feb 23 '23 10:02 Sajiyah-Salat

/assign

rayowang avatar Feb 23 '23 14:02 rayowang

Just recently, I checked kubelet-related issues and saw this code as well. I will use the object pool to complete the reuse of related objects. @SergeyKanzhelev

rayowang avatar Feb 23 '23 15:02 rayowang

Hi @SergeyKanzhelev I tweaked the prober and did some CPU-memory profiling and compared them, overall saved 32.MB of RAM and 2.27sec CPU time by running 500 probers concurrently for 2 minutes and probing every second.

Benchmarking function: https://github.com/daman1807/kubernetes/blob/11f6565bf092128deaf9e3b080e4e9e4b92dc07b/pkg/probe/http/benchmark_test.go

Modification in prober for this benchmarking: https://github.com/daman1807/kubernetes/blob/11f6565bf092128deaf9e3b080e4e9e4b92dc07b/pkg/probe/http/http.go#L65-L93

cpu_diff mem_diff

aroradaman avatar Feb 23 '23 20:02 aroradaman

overall saved 32.MB of RAM and 2.27sec CPU time by running 500 probers concurrently for 2 minutes and probing every second

very nice. I think it's reasonable to go ahead with this implementation than. 65K per probe saving with the reasonable straightforward refactoring is a very good saving

SergeyKanzhelev avatar Feb 23 '23 20:02 SergeyKanzhelev

@aojea also did some work here and had notes about HTTP in particular.

Xref https://github.com/kubernetes/kubernetes/pull/115143

On Fri, Feb 24, 2023, 1:58 AM Daman Arora @.***> wrote:

@SergeyKanzhelev https://github.com/SergeyKanzhelev can we maintain a cache with the existing prober structure?

type cache struct { httpRequest *http.Request }

type prober struct { exec execprobe.Prober http httpprobe.Prober tcp tcpprobe.Prober grpc grpcprobe.Prober runner kubecontainer.CommandRunner cache map[kubecontainer.ContainerID]cache recorder record.EventRecorder }

And workers, on stopping, will invalidate the cache.

func (w *worker) stop() { select { case w.stopCh <- struct{}{}: default: // Non-blocking. }

// clear cache delete(w.probeManager.prober.cache, w.containerID) }

— Reply to this email directly, view it on GitHub https://github.com/kubernetes/kubernetes/issues/115939#issuecomment-1443384892, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABKWAVCWKS4XLPTXYDY5MJLWZCA4FANCNFSM6AAAAAAVDTZAIM . You are receiving this because you were mentioned.Message ID: @.***>

thockin avatar Feb 25 '23 16:02 thockin

@aojea also did some work here and had notes about HTTP in particular.

This issue is only about golang objects reuse. No connections keep alive or anything like this.

SergeyKanzhelev avatar Feb 26 '23 08:02 SergeyKanzhelev

Hi @SergeyKanzhelev I tweaked the prober and did some CPU-memory profiling and compared them, overall saved 32.MB of RAM and 2.27sec CPU time by running 500 probers concurrently for 2 minutes and probing every second.

@daman1807 can you publish the difference of the benchmarks before and after your change using benchstat https://pkg.go.dev/golang.org/x/perf/cmd/benchstat

EDIT

I see, the benchmark is not using golang benchmark tooling

and I understand now Sergeys point, it seems that the NewRequestForHTTPGetAction does a lot of things, and caching its result can indeed save cycles

aojea avatar Feb 26 '23 11:02 aojea

@aojea I tried benchstat to compare perf but the variance between samples is too high to judge anything, I ran the tests for 2 minutes with 100 and 500 pods doing probes every second. The results also look pretty weird to me, maybe the GC didn't run for the test 🤯

goos: darwin
goarch: amd64
pkg: k8s.io/kubernetes/pkg/kubelet/prober
cpu: VirtualApple @ 2.50GHz
                  │  old.txt   │             new.txt              │
                  │   sec/op   │   sec/op    vs base              │
Prober/pods_100-8   120.0 ± 0%   120.0 ± 0%       ~ (p=1.000 n=6)
Prober/pods_500-8   120.0 ± 0%   120.0 ± 0%  +0.00% (p=0.041 n=6)
geomean             120.0        120.0       +0.00%

                  │      old.txt       │                new.txt                │
                  │        B/op        │     B/op       vs base                │
Prober/pods_100-8   10973330.2Ki ± 17%   367.9Ki ± 44%  -100.00% (p=0.002 n=6)
Prober/pods_500-8    53956.745Mi ±  6%   1.812Mi ± 24%  -100.00% (p=0.002 n=6)
geomean                  23.48Gi         826.3Ki        -100.00%

                  │      old.txt      │               new.txt                │
                  │     allocs/op     │  allocs/op    vs base                │
Prober/pods_100-8   140414.314k ± 17%   3.126k ± 34%  -100.00% (p=0.002 n=6)
Prober/pods_500-8    707079.79k ±  6%   16.00k ±  8%  -100.00% (p=0.002 n=6)
geomean                  315.1M         7.072k        -100.00%

aroradaman avatar Feb 28 '23 09:02 aroradaman

let me work on this issue now please /assign

ashutosh887 avatar Mar 15 '23 03:03 ashutosh887

/assign @aman1807

let me work on this issue now please

as #116058 is already in process.

pacoxu avatar Apr 26 '23 08:04 pacoxu

@pacoxu: GitHub didn't allow me to assign the following users: aman1807.

Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide

In response to this:

/assign @aman1807

let me work on this issue now please

as #116058 is already in process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Apr 26 '23 08:04 k8s-ci-robot

/assign

aroradaman avatar Apr 26 '23 08:04 aroradaman

What needs to be done here @SergeyKanzhelev

ashutosh887 avatar Sep 01 '23 03:09 ashutosh887

Hi @SergeyKanzhelev! I'm new to Kubernetes and want to contribute to the project. It looks like there have been several PRs made already for this issue, but they haven't been closed yet. I was wondering if I could assign myself to this issue and work on it? I noticed that this issue has been labeled as "maybe fixed by other PR." Thank you!

igh9410 avatar Apr 28 '24 04:04 igh9410

/assign

Taoup avatar Jul 22 '24 09:07 Taoup

Hi Team, Can team confirm whether the issue is still open?

BeElectronicSakshi avatar Sep 10 '24 03:09 BeElectronicSakshi

/assign

BeElectronicSakshi avatar Sep 10 '24 03:09 BeElectronicSakshi