opentelemetry-cpp icon indicating copy to clipboard operation
opentelemetry-cpp copied to clipboard

Context and shared ptr improvements

Open Reneg973 opened this issue 2 months ago • 5 comments

Fixes # (issue) non, just improvements

Changes

non-API changes in Context and nostd::shared_ptr

Please provide a brief description of the changes here.

  • may reduce sizeof(nostd::shared_ptr) and therefore sizeof(Context), if sizeof(std::shared_ptr) == 16 instead of 32
  • prevent dynamic allocation in case of short keys (SSO)
  • may perform faster because of removed vtable functions

Please provide a brief description of the changes here.

For significant contributions please make sure you have completed the following items:

  • [ ] CHANGELOG.md updated for non-trivial changes
  • [ ] Unit tests have been added
  • [ ] Changes in public API reviewed

Reneg973 avatar Oct 24 '25 11:10 Reneg973

missing header <string> for self sufficiency in context.h, will update it later

Reneg973 avatar Oct 24 '25 11:10 Reneg973

Codecov Report

:x: Patch coverage is 92.00000% with 2 lines in your changes missing coverage. Please review. :white_check_mark: Project coverage is 90.01%. Comparing base (d882c6b) to head (1a8b433).

Files with missing lines Patch % Lines
api/include/opentelemetry/nostd/shared_ptr.h 75.00% 2 Missing :warning:
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3713      +/-   ##
==========================================
+ Coverage   90.00%   90.01%   +0.01%     
==========================================
  Files         225      225              
  Lines        7099     7091       -8     
==========================================
- Hits         6389     6382       -7     
+ Misses        710      709       -1     
Files with missing lines Coverage Δ
api/include/opentelemetry/context/context.h 100.00% <100.00%> (ø)
api/include/opentelemetry/nostd/shared_ptr.h 96.83% <75.00%> (-3.17%) :arrow_down:

... and 1 file with indirect coverage changes

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Oct 24 '25 20:10 codecov[bot]

We could implement some changes only in ABI_VERSION >= 2, but this is not worth the added complexity in my opinion.

Well, if you decide not to integrate it, you can close this PR, which is ok, even if I disagree. The API and ABI currently add superfluous and also noticeable runtime overhead, especially for more lightweight C++ apps or on CPUs like a Raspi Zero. On heavily multithreaded apps: you don't use a custom private heap, which means every dynamic alloc/free blocks other threads using the global heap. Also there's a lot of shared_ptr usage. Especially the virtual inside the shared_ptr privates just adds superfluous indirections.

an library compiled with opentelemetry-cpp API 1.23.0 or older another library compiled with opentelemetry-cpp API 1.24.0 or newer

Does this mean you want to allow use cases where an object is created with API 1.23 and pushed into 1.24? Otherwise I don't see that they influence each other.

This was just a first proposal to slim down the overhead without losing functionality.

Reneg973 avatar Nov 24 '25 05:11 Reneg973

We could implement some changes only in ABI_VERSION >= 2, but this is not worth the added complexity in my opinion.

Well, if you decide not to integrate it, you can close this PR, which is ok, even if I disagree.

@Reneg973

The problem is as follows:

For OPENTELEMETRY_ABI_VERSION_NO == 1, the current code is as it is, and we can not change it if this breaks the abi.

For OPENTELEMETRY_ABI_VERSION_NO >= 2, the code can be changed.

I agree it would improve the final product, but it also creates a significant effort in maintenance in the mean time, with potential regressions if things are different between abi 1 and 2. This is what I meant with added complexity.

The API and ABI currently add superfluous and also noticeable runtime overhead, especially for more lightweight C++ apps or on CPUs like a Raspi Zero.

No contest here. The question is whether this is fixable without breaking existing applications, and without too much burden.

On heavily multithreaded apps: you don't use a custom private heap, which means every dynamic alloc/free blocks other threads using the global heap.

The API code has to be header only, which excludes using a custom allocator, memory pool, or anything like it.

The SDK and exporters have more implementation freedom, but this is another area.

Also there's a lot of shared_ptr usage.

For example, see #828

This has been known for a long time, and is not trivial to fix, due to the breaking change label.

Especially the virtual inside the shared_ptr privates just adds superfluous indirections.

That sounds fixable in ABI_VERSION 2.

an library compiled with opentelemetry-cpp API 1.23.0 or older another library compiled with opentelemetry-cpp API 1.24.0 or newer

Does this mean you want to allow use cases where an object is created with API 1.23 and pushed into 1.24? Otherwise I don't see that they influence each other.

An application or a library is instrumented by invoking the API only. Say a library is compiled against API 1.23. Calls to create a span (to instrument traces) can land either in the no-op implementation, or in the SDK.

The SDK is what is configured in the main() for the overall application. It could be SDK 1.25, which implements the API.

The problem here is that SDK 1.25 has to work not just with API 1.25, which is in the same code base in the same git label, but also with API 1.23. There is no CI build to verify this, so every time any code under api/include is touched, it needs extra review.

Long story short, if a library is instrumented (with API 1.23) and ships a binary package, linking this library in an application that uses the SDK 1.25 should not require whoever provided the library to produce and ship again another package (built with API 1.25).

This was just a first proposal to slim down the overhead without losing functionality.

And this is appreciated.

Overall, ABI 1 and 2 are already different, and CI is already covering both, so having a few more ifdef to fix existing issues in ABI 2 only is possible.

Ultimately, this work is needed, and the sooner ABI 2 can be cleaned up and ready, the sooner we can deprecate ABI 1.

The pitfall to avoid is to start making changes, then get distracted and delayed, to end up with ABI 1 still in use, ABI 2 very different, better but still not complete, with diverging code.

Considering all the cleanup needed anyway due to recent clang-tidy findings in #3762, it looks like opentelemetry-cpp can not postpone this cleanup forever, so we might as well start it.

Do you think the PR can be adjusted to change only ABI_VERSION 2, and pass all ci ? It will be considered.

marcalff avatar Nov 25 '25 08:11 marcalff

Your explanations are very appreciated, thanks.

Do you think the PR can be adjusted to change only ABI_VERSION 2, and pass all ci ? It will be considered.

ok, that's the kind of statement I waited for (getting a hint, what to do that this PR could be approved). I'll prepare one, maybe not tomorrow, but in the next days.

Reneg973 avatar Nov 25 '25 19:11 Reneg973