datadog-agent [CWS] optimize security agent usage of remote workload meta

Open paulcacheux opened this issue 4 months ago • 1 comment

What does this PR do?

This PR optimizes the stream operation of the remote workload meta collector by letting Recv block on its own and removing the DoWithTimeout call that wrapped it.

Each DoWithTimeout call creates a new goroutine and a new timer, then selects on them, which incurs a significant cost on every Recv call.

DoWithTimeout is fine for infrequent calls, for example in the tagger, where there is a real request/response model. For a stream where we expect a lot of events to go through, it becomes a bottleneck. Moreover, we don't actually want a 10 min timeout: if the connection is still up and the stream is flowing, we can continue receiving events.
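A rough sketch of the resulting receive loop, assuming a simplified stream type: `eventStream`, `consume` and `sliceStream` are hypothetical names introduced here, and the real collector works with a generated gRPC stream client instead. The point is only that Recv is called directly and the loop ends when Recv itself reports an error, which for a gRPC stream includes a closed connection or a cancelled stream context.

```go
package main

import (
	"errors"
	"io"
)

// eventStream is a hypothetical, minimal view of a streaming client.
type eventStream interface {
	Recv() (string, error)
}

// consume calls Recv directly: no extra goroutine, timer or select per event.
// It returns nil on a clean end of stream and the error otherwise.
func consume(s eventStream, handle func(string)) error {
	for {
		ev, err := s.Recv()
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return err
		}
		handle(ev)
	}
}

// sliceStream is a fake stream used only to exercise consume in this sketch.
type sliceStream struct{ events []string }

func (s *sliceStream) Recv() (string, error) {
	if len(s.events) == 0 {
		return "", io.EOF
	}
	ev := s.events[0]
	s.events = s.events[1:]
	return ev, nil
}
```

With this shape, a quiet but healthy stream simply blocks in Recv instead of being torn down by an arbitrary deadline.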

Notebook showing the improvement https://ddstaging.datadoghq.com/notebook/7655823/paul-mar-12-2024-09-52

Motivation

Additional Notes

Possible Drawbacks / Trade-offs

Describe how to test/QA your changes

Mar 11 '24 20:03 paulcacheux
