opentelemetry-specification
How to classify browser and mobile telemetry
What are you trying to achieve?
The client-side instrumentation SIG is working on defining telemetry for client-side applications, and we believe we need a way to classify this telemetry. The reasons for this are:
- backends processing client-side telemetry might want to perform specific post-processing or analysis of the data
- vendors might want to present a different UI experience for browser and mobile devices than for backend services
We would like to specify what attributes an SDK MUST include on a resource in order for the telemetry to be interpreted as browser or mobile. We could use guidance (or further discussion) on the approach that makes sense in a wider context of the project.
Possible options for classifying browser telemetry
- presence of browser attributes on resource (proposed here)
This aligns with how semantic conventions have been used so far and is also recommended in this PR
Counter arguments:
- it may not be possible to collect browser attributes in every environment
- browser attributes would primarily collect information about the user agent, which should be optional because it is a supplemental piece of information (and the telemetry is still useful without it)
- instead we would be making it required by saying: if you want this to be treated as browser telemetry, then you MUST capture the user agent
- value of the process.runtime.name attribute
This is already defined in the specs for JavaScript runtimes here
Counter arguments:
- the accompanying process.runtime.version attribute does not make much sense for browsers: the current example in the spec shows the user-agent string, which includes more information than a version
- schema
I am not sure if this could fall under the intended purpose of schemas. The idea is to have a schema that is unique to client-side telemetry, and the classification would be done based on the value of the schema field.
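To make the three browser options concrete, here is a minimal sketch of how a backend might test for each of them. It is an illustration only, not an agreed convention: the resource is modelled as a plain dict, the function names are hypothetical, and the schema URL is a placeholder invented for this example.

```python
from typing import Any, Mapping, Optional

# Hypothetical placeholder: no client-side schema URL has been defined.
CLIENT_SCHEMA_URL = "https://example.com/schemas/client/1.0.0"


def has_browser_attributes(resource: Mapping[str, Any]) -> bool:
    """Option 1: any attribute in the browser.* namespace is present."""
    return any(key.startswith("browser.") for key in resource)


def has_browser_runtime(resource: Mapping[str, Any]) -> bool:
    """Option 2: process.runtime.name is set to "browser"."""
    return resource.get("process.runtime.name") == "browser"


def has_client_schema(schema_url: Optional[str]) -> bool:
    """Option 3: the schema URL identifies a client-side schema."""
    return schema_url == CLIENT_SCHEMA_URL


# A resource that carries no browser.* attributes at all.
resource = {"service.name": "shop-frontend", "process.runtime.name": "browser"}
print(has_browser_attributes(resource))  # False
print(has_browser_runtime(resource))     # True
```

The example resource shows the trade-off: option 2 still classifies the telemetry as browser telemetry when no user-agent information was collected, whereas option 1 does not.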
Possible options for classifying mobile telemetry
- presence of device attributes
It seems that these attributes were originally intended for mobile devices.
Counter arguments:
- the term device is generic enough that it could be used for IoT devices or even infra
- value of the os.name attribute
The examples in the spec include Android and iOS.
Counter arguments:
- consumers of the data will need to know a full list of OS names that apply to mobile devices
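The two mobile options and their counter arguments can be sketched the same way. This is a rough illustration under stated assumptions: the resource is a plain dict, the function names are hypothetical, and the set of mobile OS names is deliberately limited to the two examples mentioned above.

```python
from typing import Any, Mapping

# Illustrative and incomplete: only the two OS names named in the discussion.
MOBILE_OS_NAMES = frozenset({"android", "ios"})


def has_device_attributes(resource: Mapping[str, Any]) -> bool:
    """Option 1: any attribute in the device.* namespace is present."""
    return any(key.startswith("device.") for key in resource)


def has_mobile_os(resource: Mapping[str, Any]) -> bool:
    """Option 2: os.name matches a known mobile OS (case-insensitive)."""
    os_name = resource.get("os.name")
    return isinstance(os_name, str) and os_name.lower() in MOBILE_OS_NAMES


# Counter argument to option 1: an IoT sensor also has device attributes.
iot_sensor = {"device.id": "sensor-42", "os.name": "Linux"}
print(has_device_attributes(iot_sensor))  # True, although it is not a phone

# Counter argument to option 2: an OS missing from the list is not recognized.
new_phone = {"os.name": "SomeNewMobileOS"}  # made-up OS name
print(has_mobile_os(new_phone))  # False
```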
Thanks for summarizing these options, @martinkuba!
Browser
Ad 1.
If the proposed browser.platform was kept despite duplicating os.name, its presence would be sufficient without having to capture a user agent.
Ad 2.
Having the existing process.runtime.name=browser attribute looks like a good approach to me.
If the defined value for process.runtime.version does not make much sense, this could be changed independently of the classification issue discussed here.
Ad 3. I also don't think that introducing a separate schema just for client-side telemetry would make sense; it would open a lot of new questions. Would this be entirely separate? Would this be an extension to the "generic" schema so you would still be able to use the attributes defined here? How would versioning be handled? Also, this would make it necessary to have a "topmost"/application-level tracer that is guaranteed to use this schema, while other libraries with built-in instrumentation or separate instrumentation libraries might still use the generic schema. Furthermore, it would put an additional burden on telemetry consumers to potentially develop and maintain support for two "worlds" of such data.
Mobile
Ad 1.
How about adding an open enum device.kind with a value mobile/handheld or the like? (Distinguishing phone and tablet might be difficult and likely not even necessary or insightful.)
Ad 2.
For this to be feasible, an enum like the one we have for os.type would be necessary.
device.kind makes sense. When I've implemented this previously, we've had enumerated values like:
- mobile
- wearable
- desktop
- streamer
Here "mobile" would include both phones and tablets, due to the difficulty of differentiating form factors on Android, and "desktop" would include both PCs and laptops. Naming subject to debate.
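As a rough sketch of what an open device.kind enum could look like to a consumer, assuming the four values listed above (the class and function names are hypothetical, and the naming is still subject to debate):

```python
from enum import Enum
from typing import Any, Mapping, Union


class DeviceKind(str, Enum):
    """Known device.kind values from the list above (not a final set)."""
    MOBILE = "mobile"      # phones and tablets
    WEARABLE = "wearable"
    DESKTOP = "desktop"    # PCs and laptops
    STREAMER = "streamer"


def parse_device_kind(value: str) -> Union[DeviceKind, str]:
    """Return a known DeviceKind, or the raw string for any other value.

    Passing unknown values through is what makes the enum "open".
    """
    try:
        return DeviceKind(value)
    except ValueError:
        return value


def is_mobile(resource: Mapping[str, Any]) -> bool:
    """Classify telemetry as mobile based on device.kind alone."""
    return resource.get("device.kind") == DeviceKind.MOBILE.value
```

With this shape, a consumer checks a single attribute value instead of maintaining a list of mobile OS names, and a value outside the list (say, a future device category) is passed through rather than rejected.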
@arminru
> If the proposed browser.platform was kept despite duplicating os.name, its presence would be sufficient without having to capture a user agent.
In some browsers (older versions of Chromium, Firefox, Safari), it is not possible to get the platform value alone, only the full user-agent string.
> Having the existing process.runtime.name=browser attribute looks like a good approach to me. If the defined value for process.runtime.version does not make much sense, this could be changed independently of the classification issue discussed here.
I am in favor of this approach, as it makes classification straightforward. I think we still need browser attributes in addition (see https://github.com/open-telemetry/opentelemetry-specification/pull/2353). I think it would make sense to capture the version (or user agent string) there instead.
With that said, I would like to know if others have any objections to using process.runtime.name = "browser".
> I also don't think that introducing a separate schema just for client-side telemetry would make sense; it would open a lot of new questions. Would this be entirely separate? Would this be an extension to the "generic" schema so you would still be able to use the attributes defined here? How would versioning be handled? Also, this would make it necessary to have a "topmost"/application-level tracer that is guaranteed to use this schema, while other libraries with built-in instrumentation or separate instrumentation libraries might still use the generic schema. Furthermore, it would put an additional burden on telemetry consumers to potentially develop and maintain support for two "worlds" of such data.
This was an idea mentioned by @jmacd. I would need more guidance from others on whether this makes sense to pursue.
@martinkuba
> In some browsers (older versions of Chromium, Firefox, Safari), it is not possible to get the platform value alone, only the full user-agent string.
So browser.platform would also be left empty in this case since the sentiment is to not impose the requirement of user-agent parsing on instrumentation, right?
> I am in favor of this approach, as it makes classification straightforward. I think we still need browser attributes in addition (see #2353). I think it would make sense to capture the version (or user agent string) there instead.
For browser.user_agent this certainly makes sense, yes 👍
In ECS there is a top-level user_agent attribute (reference), but I think having a top-level browser namespace makes sense, as we might add more browser-related attributes in the future.
> So browser.platform would also be left empty in this case since the sentiment is to not impose the requirement of user-agent parsing on instrumentation, right?
Yes, I don't think we should put the burden of parsing the user-agent string on the client instrumentation.
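The behavior agreed on here could be sketched as follows. A real browser SDK would be written in JavaScript; Python is used purely for illustration, the function name is hypothetical, and browser.user_agent and browser.platform are the attribute names proposed in this thread rather than finalized conventions.

```python
from typing import Dict, Optional


def browser_resource_attributes(
    user_agent: Optional[str],
    platform: Optional[str],
) -> Dict[str, str]:
    """Build resource attributes without ever parsing the user-agent string."""
    # Classification relies on this attribute alone.
    attrs = {"process.runtime.name": "browser"}
    if user_agent:
        # Recorded verbatim; any parsing is left to the backend.
        attrs["browser.user_agent"] = user_agent
    if platform:
        # Set only when the browser exposes the platform value directly.
        attrs["browser.platform"] = platform
    return attrs


# An older browser that exposes only the full user-agent string.
attrs = browser_resource_attributes("Mozilla/5.0 (X11; Linux x86_64)", None)
print("browser.platform" in attrs)    # False
print(attrs["process.runtime.name"])  # browser
```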
Before we close this, we should document the outcome in semantic conventions.
Other options:
- entity signal
- ongoing conversation about service.name
- telemetry.sdk.language value