libdatadog icon indicating copy to clipboard operation
libdatadog copied to clipboard

perf(profiling): datadog-alloc based StringTable

Open morrisonlevi opened this issue 1 year ago • 4 comments

What does this PR do?

Creates a StringTable where the individual strings are allocated in an arena allocator. This uses the Chain, Linear, and Virtual Allocators provided by datadog-alloc to create a chain of 4 MiB chunks, each of which is virtually allocated.

Also changes the chain allocator to handle large allocations, in case we encounter a large string. Previously it would error if the allocation size was larger than the node size.

Motivation

This separates the StringTable from the system allocator. Since strings are variable length, separating them from the system allocator has more benefits than many other items.

This also is a bit faster. This is probably due to the simpler allocation strategy and better data locality. See this for details: https://github.com/DataDog/libdatadog/pull/404#discussion_r1600836784

Additional Notes

This has been a long-time coming. I'm excited for it to finally merge!

How to test the change?

As long as you weren't using string table implementation details, nothing should change. The apis for FFI users is unchanged, for instance.

morrisonlevi avatar Apr 25 '24 22:04 morrisonlevi

Codecov Report

Attention: Patch coverage is 99.50980% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 66.26%. Comparing base (1ecd94c) to head (0d4bfba). Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #404      +/-   ##
==========================================
+ Coverage   66.17%   66.26%   +0.08%     
==========================================
  Files         187      189       +2     
  Lines       23171    23650     +479     
==========================================
+ Hits        15334    15672     +338     
- Misses       7837     7978     +141     
Components Coverage Δ
crashtracker 19.34% <ø> (ø)
datadog-alloc 98.43% <97.05%> (+0.06%) :arrow_up:
data-pipeline 51.45% <ø> (ø)
data-pipeline-ffi 0.00% <ø> (ø)
ddcommon 81.52% <ø> (+0.29%) :arrow_up:
ddcommon-ffi 74.93% <ø> (ø)
ddtelemetry 53.72% <ø> (ø)
ipc 81.27% <ø> (ø)
profiling 77.52% <100.00%> (+0.68%) :arrow_up:
profiling-ffi 60.05% <ø> (ø)
serverless 0.00% <ø> (ø)
sidecar 31.75% <51.32%> (+2.07%) :arrow_up:
sidecar-ffi 0.00% <0.00%> (ø)
spawn-worker 54.98% <ø> (ø)
trace-mini-agent 69.12% <ø> (ø)
trace-normalization 97.79% <ø> (ø)
trace-obfuscation 95.74% <ø> (ø)
trace-protobuf 25.64% <ø> (ø)
trace-utils 68.85% <ø> (ø)

codecov-commenter avatar May 04 '24 15:05 codecov-commenter

Just to clarify, is the idea not to merge this PR until we merge [...]

Yes, although that is merged now and doesn't matter.

I think it's in pretty good state, but since a draft I'm not sure what plans you have and if I should approve as-is or wait for next steps, let me know! 🙏

I think the code is overall in a good state. I've updated the PR description to match the state of current. There are still a few things like benchmarking the implementation vs the previous, as well as profiling the code to see if there is low hanging fruit. In particular, I wonder about IndexMap vs using a Vec + HashMap.

morrisonlevi avatar May 13 '24 16:05 morrisonlevi

In particular, I wonder about IndexMap vs using a Vec + HashMap.

Vec + HashMap performs worse than using IndexMap, which is good (you would hope a structure meant to solve this, does it well).

I added the old implementation for other people to A/B test if they want. Import whichever StringTable you want to use, and then do:

cargo bench -p datadog-profiling

Performance is improved. I used perf to look for low hanging fruit, there was nothing obvious to me there. We're ready for final review.

morrisonlevi avatar May 15 '24 03:05 morrisonlevi

One GitLab job failed due to an internal error when doing setup. It passed when I restarted it.

morrisonlevi avatar May 15 '24 15:05 morrisonlevi