Make Vector aware of available memory

A note for the community

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

We have had discussions floating around this in various issues, but I couldn't find a reference issue for it, so I am creating this one.

Users have had issues running Vector in memory-constrained environments, where it would be better for Vector to apply back-pressure than to increase memory usage when it is close to its cap, thus avoiding an OOM kill.

Attempted Solutions

No response

Proposal

Make Vector "memory capacity aware" so that it applies back-pressure rather than allocating when it is at risk of being OOM killed.
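
To make the idea concrete, here is a minimal sketch of what "apply back-pressure rather than allocate" could look like. It is not how Vector works today: `MemoryBudget` and its methods are hypothetical names, and the sketch assumes components can estimate the size of what they are about to buffer. A producer that calls `reserve` blocks until enough bytes have been released, which is the back-pressure.

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical process-wide byte budget; not an existing Vector type.
pub struct MemoryBudget {
    cap: usize,
    used: Mutex<usize>,
    freed: Condvar,
}

impl MemoryBudget {
    pub fn new(cap: usize) -> Self {
        Self { cap, used: Mutex::new(0), freed: Condvar::new() }
    }

    /// Non-blocking: reserves `bytes` if they fit under the cap, otherwise returns false.
    pub fn try_reserve(&self, bytes: usize) -> bool {
        let mut used = self.used.lock().unwrap();
        // `used` never exceeds `cap`, so the subtraction cannot underflow.
        if bytes <= self.cap - *used {
            *used += bytes;
            true
        } else {
            false
        }
    }

    /// Blocks the caller (back-pressure) until `bytes` fit under the cap.
    /// A request larger than the whole cap is rejected so it cannot block forever.
    pub fn reserve(&self, bytes: usize) -> bool {
        if bytes > self.cap {
            return false;
        }
        let mut used = self.used.lock().unwrap();
        while self.cap - *used < bytes {
            used = self.freed.wait(used).unwrap();
        }
        *used += bytes;
        true
    }

    /// Returns `bytes` to the budget and wakes any blocked producers.
    pub fn release(&self, bytes: usize) {
        let mut used = self.used.lock().unwrap();
        *used = used.saturating_sub(bytes);
        self.freed.notify_all();
    }

    /// Bytes currently reserved.
    pub fn used(&self) -> usize {
        *self.used.lock().unwrap()
    }
}
```

In this sketch a source or buffer would call `reserve` before holding on to an event and `release` once the event has been flushed downstream. A real implementation would need an async-aware primitive and a way to derive the cap from the environment (for example a container memory limit), both of which are left out here.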

References

https://github.com/vectordotdev/vrl/issues/82
https://github.com/vectordotdev/vector/issues/11770#issuecomment-1068853439
https://github.com/vectordotdev/vector/issues/17123

Version

vector 0.20.0 (x86_64-apple-darwin 2a706a3 2022-02-10)

Mar 22 '22 20:03 jszwedko

Is there no solution to this problem?

Apr 03 '24 02:04 baiyibing123

No response?

Apr 09 '24 08:04 baiyibing123

@baiyibing123 We have the same concerns, as we often hit OOM death issues with Vector. However, I'll say that the best solution for this would be contributions of code to solve the issue.

Apr 09 '24 18:04 johnhtodd

Agreed, this is likely to be a very large and invasive project that we unfortunately haven't been able to prioritize just yet. I realize it would be very useful.

Apr 09 '24 20:04 jszwedko

Perhaps the first thing to do would be awareness in components, to make this a more manageable process. Is it possible for each component to understand the memory that it is using? Can this be exposed in an internal (Prometheus) metric easily, or would that require significant work? I would theorize that each aggregation at least could understand its memory space usage, since each metric has an easily understood size. Same for enrichments, which have fixed sizes once indexed. Buffers in sinks. Lua. Reduce? I am not sure what other types of components would take up significant memory other than the memory required to thread N individual processing pipelines. But I am guessing without knowing the code base at all.

I think the biggest item would be aggregations (cardinality) and buffers in sinks - maybe start there?
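
As a rough illustration of the per-component accounting idea, here is a sketch in which each component keeps a byte gauge that is rendered in Prometheus text format. Everything in it is an assumption made for illustration: `MemoryRegistry`, the metric name `component_estimated_memory_bytes`, and the component IDs are hypothetical and are not part of Vector.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical registry of per-component memory estimates; not a Vector API.
#[derive(Default)]
pub struct MemoryRegistry {
    gauges: Mutex<BTreeMap<String, Arc<AtomicUsize>>>,
}

impl MemoryRegistry {
    /// Returns the gauge for `component_id`, creating it at zero on first use.
    pub fn gauge(&self, component_id: &str) -> Arc<AtomicUsize> {
        let mut gauges = self.gauges.lock().unwrap();
        gauges.entry(component_id.to_string()).or_default().clone()
    }

    /// Renders all gauges in Prometheus text exposition format, sorted by component ID.
    /// Label values are not escaped here; this assumes IDs contain no quotes or backslashes.
    pub fn render(&self) -> String {
        let gauges = self.gauges.lock().unwrap();
        let mut out = String::from("# TYPE component_estimated_memory_bytes gauge\n");
        for (id, bytes) in gauges.iter() {
            out.push_str(&format!(
                "component_estimated_memory_bytes{{component_id=\"{}\"}} {}\n",
                id,
                bytes.load(Ordering::Relaxed)
            ));
        }
        out
    }
}
```

An aggregation would call `fetch_add` on its gauge when it stores a new series and `fetch_sub` when it flushes one, and a sink would do the same around its buffers. The estimates are only as good as each component's own size accounting.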

Apr 09 '24 20:04 johnhtodd

We did actually take one stab at exposing allocations per-component in https://vector.dev/blog/tracking-allocations/. It's still experimental, currently.

I agree each component could make an attempt at managing its own memory but without a framework in place for this in Vector it may be a bit fraught to do it per-component. I think your sense is roughly right though: sinks tend to use memory creating concurrent requests and memory buffers, some transforms like aggregate maintain state, etc.

Apr 09 '24 21:04 jszwedko

Thanks for the note on the per-component beta framework. I didn't know that existed, though the "~20% less throughput" comment is concerning. It might be useful for debugging, but in places where we're at the edge of performance (hence the debugging) it may be a little troublesome.

Apr 09 '24 22:04 johnhtodd
