semantic-conventions
Attribute names: unicode on OTLP, only `[a-z0-9._]` in OTel semcov
Attribute names can be any unicode sequence
https://github.com/open-telemetry/semantic-conventions/blob/dd277f62f66f3342be33aec2c432f6bd959b379b/docs/general/attribute-naming.md?plain=1#L24
This makes sense for user apps using the OTel API and OTLP, but it is not accepted by our (build-tools) tooling:
ID_RE = re.compile("([a-z](\\.?[a-z0-9_-]+)+)")
"""Identifiers must start with a lowercase ASCII letter and
contain only lowercase, digits 0-9, underscore, dash (not recommended) and dots.
Each dot must be followed by at least one allowed non-dot character."""
We should document and enforce the rules that we have for semantic convention definitions in this repo:
- only a-z, 0-9, `.` and `_` are accepted
- starts with a letter
- ends with a letter or number
- no dashes (there are no existing attributes that use them)
These rules are necessary for code generation. They should also apply to metric names, units, event names, event payload fields, and any other properties that are likely to be represented in code.
We can expand the list of allowed characters if we can find a way to support code generation for them.
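To make the proposed rules concrete, here is a minimal sketch of a check that follows the list above. `PROPOSED_ID_RE` and `is_valid_semconv_name` are hypothetical names for illustration, not what build-tools uses today.

```python
import re

# Hypothetical pattern for the rules proposed above (not the build-tools regex):
# - only a-z, 0-9, '.' and '_' are accepted
# - starts with a letter
# - ends with a letter or number
PROPOSED_ID_RE = re.compile(r"[a-z]([a-z0-9._]*[a-z0-9])?")


def is_valid_semconv_name(name: str) -> bool:
    """Return True if the whole name satisfies the proposed rules."""
    return PROPOSED_ID_RE.fullmatch(name) is not None


for candidate in ["http.request.method", "http-method", "Http.method", "http.method."]:
    print(candidate, is_valid_semconv_name(candidate))
```

Note that this sketch only encodes the rules in the list; it does not carry over the existing "each dot must be followed by a non-dot character" rule, so something like `a..b` would pass unless that rule is added too.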
The use of `[a-z0-9._]` is currently only a recommendation, not a strict restriction. I think it is fine to rely on that recommendation and make it the default behavior for our tools, but the tools may need to be able to deal with exceptions.
I think use cases like this show that strictly prohibiting other characters may create interoperability problems with other standards.
Agreed. Our existing tooling imposes such limitations: CI checks would flag such names and fail. One of the reasons for this limitation is to be able to translate attribute, metric, and other names into constant names in code.
To support other characters, we'd need some mechanism to define a code-friendly name for such identifiers. We might need a similar mechanism for https://github.com/open-telemetry/semantic-conventions/issues/1118#issuecomment-2173803006 (phase 2).
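As a rough illustration of the constant-name translation and of what an override mechanism could look like, here is a sketch. `to_constant_name` and the `overrides` mapping are hypothetical and do not describe how the existing code generation works.

```python
from typing import Dict, Optional


def to_constant_name(name: str, overrides: Optional[Dict[str, str]] = None) -> str:
    """Derive a constant name from an attribute/metric name.

    `overrides` is a hypothetical mechanism: an explicit code-friendly name
    for identifiers that cannot be translated mechanically.
    """
    if overrides and name in overrides:
        return overrides[name]
    # Simple default for names restricted to [a-z0-9._]
    return name.replace(".", "_").upper()


print(to_constant_name("http.request.method"))
# Hypothetical attribute name with characters outside [a-z0-9._]
print(to_constant_name("vendor/some-attr", {"vendor/some-attr": "VENDOR_SOME_ATTR"}))
```

Even with the restricted character set, a naive translation like this maps both `a.b_c` and `a_b.c` to `A_B_C`, so a real generator would still need to detect collisions.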