semantic-conventions
semantic-conventions copied to clipboard
[general, attributes] Make the `thread.id` attribute a string
Context: https://opentelemetry.io/docs/specs/semconv/general/attributes/#general-thread-attributes
In C++, there is no guaranteed integral conversion of std::this_thread::get_id() (https://github.com/googleapis/google-cloud-cpp/pull/13151/files#r1397504258)
So we are storing the attribute as a string in our implementation. Can we update the spec to make the attribute value a string or an int?
cc @open-telemetry/cpp-approvers
Friendly ping @open-telemetry/cpp-approvers
In C++, there is no guaranteed integral conversion of std::this_thread::get_id()
This is correct, and applicable to both C++ and Rust. The core issue is that there isn't a universally safe and portable method to convert thread identifiers into a numeric (integer) form due to concerns around type safety. This contrasts with higher-level languages that operate atop a managed runtime environment. These higher-level languages can afford to abstract away the complexities and offer a more simplified, integer-based representation of thread identifiers.
For applications written in C++ or Rust, the realistic strategies boil down to a couple of workarounds:
- Transforming the thread identifier into a hash and using that as a unique identifier, though this comes with the usual caveats about potential hash collisions.
- Setting up a custom mapping between std::thread::id and integers, safeguarded by a mutex for thread safety.
However, both of these methods introduce extra complexity and potential performance hits, making them less than ideal solutions.
Better approach would be of representing thread IDs as strings for semantic purposes.
cc @open-telemetry/rust-maintainers @open-telemetry/cpp-maintainers
@lalitb what do thread ids look like in C++/Rust?
@AlexanderWert @trisch-me have you run into this in ECS? it looks like ECS process.thread.id is also a number type
what do thread ids look like in C++/Rust?
In C++, thread id is an abstraction over the internal platform-specific representation or implementation of a thread, and it may or may not be an integer. The C++ Standard does not mandate std::thread::id to be a numeric value; as long as it is unique, can be converted to string for debugging purposes, and can be compared. As an example, the below stuff runs successfully for posix/gcc, but is not guaranteed to succeed for other compilers/platforms - https://godbolt.org/z/Eje4eWvcj
#include <thread>
#include <sstream>
#include <string>
#include <iostream>
int main() {
std::thread::id threadId = std::this_thread::get_id();
std::ostringstream oss;
oss << threadId;
std::string threadIdStr = oss.str();
std::cout << "ThreadId as string: " << threadIdStr << std::endl;
// below statement can throw exception for certain compilers/platforms if conversion fails
std::cout << "ThreadId as long :" <<stol(threadIdStr);
}
o/p:
ThreadId as string: 140550312396608
ThreadId as long :140550312396608
Rust being more secure, unlike C++ it doesn't even provide a built-in, straightforward way to imply a numeric conversion from a ThreadId.
The C++ Standard does not mandate std::thread::id to be a numeric value; as long as it is unique, can be converted to string for debugging purposes, and can be compared.
Just to clarify, sometimes it will be a string that represents a number, e.g. "140550312396608", and sometimes it will be a string that doesn't represent a number, e.g. "bZUPbvaY4d"?
Just to clarify, sometimes it will be a string that represents a number, e.g. "140550312396608", and sometimes it will be a string that doesn't represent a number, e.g. "bZUPbvaY4d"?
Thats correct.
Other than it being a breaking change, changing thread.id from number to string seems ok.
I don't think the number-ness of the value is important (other than, again, of it being a breaking change).
@tedsuo @jack-berg @jsuereth any thoughts on the defacto stability status of thread.id?
@mjwolf do you know anything about type of thread.id, if we had problems with number type before?
Given the OTel definition of thread.id is "Current “managed” thread ID (as opposed to OS thread ID).", I don't have any objection to this being a string.
The ECS field usually refers to the OS thread ID, which should be a number. The OS thread ID is a number in all major OSs.
With OTel, I think there should be an additional attribute that is specifically for OS thread ID, with number type. It's a bit of a potential problem in ECS that the type of thread ID to use in the field is not well defined, and users could use it for both runtime or OS thread ID. I haven't seen heard of any actual problems related to it in ECS, but by having separate attributes in OTel, any problems could be avoided.
but by having separate attributes in OTel, any problems could be avoided.
@mjwolf But what does that mean for ECS users? That would be breaking for them, no? (if we introduce a thread.os.id) 🤔
Given the OTel definition of thread.id is "Current “managed” thread ID (as opposed to OS thread ID).", I don't have any objection to this being a string.
The ECS field usually refers to the OS thread ID, which should be a number. The OS thread ID is a number in all major OSs.
With OTel, I think there should be an additional attribute that is specifically for OS thread ID, with number type.
this is a great point. it sounds like it makes sense to have separate attributes for "managed" thread id and "os" thread id, since you may want to stamp both things onto the same telemetry.
@open-telemetry/semconv-system-approvers any thoughts?
With OTel, I think there should be an additional attribute that is specifically for OS thread ID, with number type.
this is a great point. it sounds like it makes sense to have separate attributes for "managed" thread id and "os" thread id, since you may want to stamp both things onto the same telemetry.
std::this_thread::get_id() in C++ gives you the OS thread id, so there would be a similar problem with an integer thread.os.id.
If this breakage is acceptable for ECS, I'd advocate for staying with a single thread.id as a string. Although it's a number in all operating systems I know, there's no sense in applying number specific operations and comparisons to it (sums, averages, greater than ...).
If this breakage is acceptable for ECS, I'd advocate for staying with a single thread.id as a string. Although it's a number in all operating systems I know, there's no sense in applying number specific operations and comparisons to it (sums, averages, greater than ...).
How are we going to ensure the "managed" thread id VS "os" thread id then?
The current description clearly states "Current “managed” thread ID (as opposed to OS thread ID)". Will this be changed? And if so how we will be able to know from where the single attribute's value is derived (managed/os)?
How are we going to ensure the "managed" thread id VS "os" thread id then?
If we're confident that we need to make this distinction, then a possible solution would be to stay with thread.id as number for "managed" threads (doesn't apply to C++ and Rust), and thread.os.id as a string (to satisfy C++ and Rust constraints).
A question to ECS folks: in your experience, is there a need to distinguish between managed and OS thread ids?
The ECS field usually refers to the OS thread ID, which should be a number. The OS thread ID is a number in all major OSs.
Agreed, however, the C++ and Rust constraints stem from the fact that those languages (and their standard libraries) have to support exotic operation systems.
If we're confident that we need to make this distinction, then a possible solution would be to stay with
thread.idas number for "managed" threads (doesn't apply to C++ and Rust), andthread.os.idas a string (to satisfy C++ and Rust constraints).
I think this should be reversed? Managed thread ID should be strings, and OS thread ID should be number. But I agree they should have different types.
The C++ thread ID is the managed/higher-level ID, and AFAIK all OSs use numberic thread IDs. Also, as OS thread ID and process ID are very closely, it wouldn't make sense to me to have PID be number, and TID be string. The OTel PID field is already a number, and changing PID to string to match TID would be a larger breaking change. Unless anyone knows of an OS that actually uses non-numberic PID/TID, I think both should be a number type.
A question to ECS folks: in your experience, is there a need to distinguish between managed and OS thread ids?
I think it's important to have managed and OS thread ID as separate fields, as they are separate logical concepts. If it was only one field, an application that wanted to send both managed and OS thread ID in the same log wouldn't be able to include both. Also, if different applications use the different types in the single field, it will cause confusion and data integrity problems later, as it won't be possible to know what is actually stored in the field. With ECS, I haven't heard of these problems happening with thread ID specifically, but I have heard of these problems with other fields, so it would be best to avoid these problems with thread ID.
But what does that mean for ECS users? That would be breaking for them, no? (if we introduce a
thread.os.id)
Yes, this might be a breaking change for process.thread.id. We could also go the other direction. Make thread.id as OS id to not break ECS and add additional field for managed thread id.
The C++ thread ID is the managed/higher-level ID, and AFAIK all OSs use numberic thread IDs.
I see. std::thread directly wraps OS threads in all cases I'm aware of, however, we aren't guaranteed to obtain the OS thread id, so it makes sense to classify those as "managed" threads.
I think this should be reversed? Managed thread ID should be strings, and OS thread ID should be number. But I agree they should have different types.
Makes sense.
@mjwolf @trisch-me Would you agree with the following approach:
- As
thread.idis defined to hold a "managed" thread id, let's make it a string (which resolves this issue). - Discuss the introduction of OS thread ids and compatibility with ECS in a separate issue.
Hey @pyohannes let's do it then. So it seems we need to differentiate between OS and managed thread. In this case let thread.id to be a managed thread as it is now and change type to string
As for OS thread id - let's discuss it here
Hey @pyohannes let's do it then. So it seems we need to differentiate between OS and managed thread. In this case let thread.id to be a managed thread as it is now and change type to string
Great, I opened #836 for this.
Why not having semantic conventions for each runtime/language instead?
E.g. jvm.thread.id, jvm.thread.name for Java, go.goroutine.id for Go, python.asynciotask.name for Python, etc.
I would leave thread.id and thread.name for OS thread ID and name (https://github.com/open-telemetry/semantic-conventions/issues/812). This way we can solve the issue without making breaking "syntactic" changes in the scheme.
https://github.com/open-telemetry/semantic-conventions/pull/836#issuecomment-2023362943
E.g. jvm.thread.id, jvm.thread.name for Java, go.goroutine.id for Go, python.asynciotask.name for Python, etc.
I would leave thread.id and thread.name for OS thread ID and name (https://github.com/open-telemetry/semantic-conventions/issues/812). This way we can solve the issue without making breaking "syntactic" changes in the scheme.
That's definitely a possible solution (not syntactically but semantically breaking).
As I wrote in https://github.com/open-telemetry/semantic-conventions/pull/836#issuecomment-2025446934, as we have breaking changes with either solution approach, we should aim at having them result in a comprehensive and consistent solution (that can ideally be declared stable) and avoid piece-meal breaking changes.
I'll bring it up in the next semconv SIG meeting, maybe it makes sense to pause work on this and make breaking changes when we are serious about stabilizing thread.* attributes.
This was discussed in the semantic conventions SIG.
Given that thread.id is already widely used and that any of the proposed solutions would involve breaking changes, we postpone work on this for one until somebody has the cycles to propose, present, and discuss a coherent and comprehensive solution that encompasses both OS thread id and name and managed thread id and name and thus resolve both this issue and #812.
This should result in a solution that can then be declared stable.