Auto-generated Semantic Conventions
Fixes #1658.
Changes
As discussed in #1658, this PR introduces a new package OpenTelemetry.SemanticConventions which aims to be the home of the auto-generated semantic convention code files. It uses the Semantic Convention Generator, similar as it's done in other languages.
Few things I noticed while doing this, which probably need some discussion and agreement:
-
The existing
SemanticConventions.csclass. Looking at the other languages, they call it:- Java:
SemanticAttributes - Python:
SpanAttributes - JS:
SemanticAttributes
- Java:
-
The existing
ResourceSemanticConventionsclass. The other languages call it pretty muchResourceAttributes. But this name conflicts withResourceAttributesfromSystem.Reflection, so probably not a good name to use.
What should we use here? I went with TraceSemanticConventions, which is not similar to any of the above, but is similar to what we have with the resource one.
For now, I went with this:
SemanticConventions->TraceSemanticConventions. Inside namespaceOpenTelemetry.SemanticConventions.TraceResourceSemanticConventions-> same name. Inside namespaceOpenTelemetry.SemanticConventions.Resource
The code files are generated via a script, manually for now. Once the naming above is agreed, I can add this new project to all others in the solution and start renaming and see if all works. Since they were internal files, I think it shouldn't be a problem?
The committers are authorized under a signed CLA.
- :white_check_mark: Joao Grassi (4b91bdb4d7d99cf6610409cf150d040ff30a9f7a, ca2dd5ad3a54264620fab9278b642d2e4d649ad6, 2b897ac78ed27d9ee840fb0662283c160d2fd59e)
@cijothomas @reyang for guidance on naming here.
Codecov Report
Merging #2069 (459cf85) into main (6d6222f) will increase coverage by
0.06%. The diff coverage isn/a.
Additional details and impacted files
@@ Coverage Diff @@
## main #2069 +/- ##
==========================================
+ Coverage 87.32% 87.38% +0.06%
==========================================
Files 281 281
Lines 10758 10758
==========================================
+ Hits 9394 9401 +7
+ Misses 1364 1357 -7
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...metryProtocol/Implementation/ActivityExtensions.cs | 96.21% <0.00%> (+1.08%) |
:arrow_up: |
| src/OpenTelemetry/BatchExportProcessor.cs | 84.11% <0.00%> (+1.86%) |
:arrow_up: |
| ...tpListener/Internal/PrometheusCollectionManager.cs | 75.82% <0.00%> (+2.19%) |
:arrow_up: |
| ...lementation/SqlClientInstrumentationEventSource.cs | 75.00% <0.00%> (+4.16%) |
:arrow_up: |
Also, this seems pretty mature already, any particular reason for it to be a draft? You'll get the most reviews from folks once you drop the draft flag
@bruno-garcia I'll go through the comments, thanks!
Also, this seems pretty mature already, any particular reason for it to be a draft? You'll get the most reviews from folks once you drop the draft flag
It's mostly because I have all these naming stuff that I'm really not sure, so there's still quite some stuff to do. Once the namings are defined I'll have to go and change everywhere and see if everything still "works".
Then the idea is to merge this, and think how we will publish this package. Only then we'll use it in all the projects that currently use the internal class. At least that's what @pcwiese and I discussed in the linked issue and in Slack.
Publishing it now, so maybe we can get some feedback. As @bruno-garcia said I think now it's in a good state, just need to define the small details.
For now, I went with this:
SemanticConventions -> TraceSemanticConventions. Inside namespace OpenTelemetry.SemanticConventions.Trace ResourceSemanticConventions -> same name. Inside namespace OpenTelemetry.SemanticConventions.Resource
TraceSemanticConventions inside Otel.SemanticConventions.Trace seems bit redundant. We may name the namespace as OpenTelemetry.SemanticConventions, and classes TraceSemanticConventions, MetricsSemanticConventions and ResourceSemanticConventions
For now, I went with this: SemanticConventions -> TraceSemanticConventions. Inside namespace OpenTelemetry.SemanticConventions.Trace ResourceSemanticConventions -> same name. Inside namespace OpenTelemetry.SemanticConventions.Resource
TraceSemanticConventions inside Otel.SemanticConventions.Trace seems bit redundant. We may name the namespace as
OpenTelemetry.SemanticConventions, and classesTraceSemanticConventions,MetricsSemanticConventionsandResourceSemanticConventions
Alternately, if we re-use existing namespaces OpenTelemetry.Trace, Opentelemetry.Resource etc we can name classes as TraceSemanticConventions, MetricsSemanticConventions and ResourceSemanticConventions. This will avoid requires users to import more namespaces.
@cijothomas thanks for the review! I changed the namespace as your last comment.
What do you think regarding @bruno-garcia suggestion on having an inner class for the attributes?
What about an inner class Attribute and having these generated under it? So we'd access it through:
ResourceSemanticConventions.Attribute.CloudProvider
And how do you think we can release this? Via CI? manual? @pcwiese had some comments in the original issue ticket... also how we will use this package later on the other projects? I presume since Semantic Conventions are not stable yet, we can't add it to the stable packages?
Something else I forgot: Should I add the files for .publicApi for this as well?
I added the .publicApi files. The sanitycheck/encoding CI task is failing due to a non-ascii char in one of the xml docs. It comes from here: https://github.com/open-telemetry/opentelemetry-specification/blob/d271b73d6b274bfd51eae9ed813586f0322342b7/semantic_conventions/trace/database.yaml#L65.
Not sure what to do about it.. any suggestions?
I added the
.publicApifiles. Thesanitycheck/encodingCI task is failing due to a non-ascii char in one of the xml docs. It comes from here: open-telemetry/opentelemetry-specification@d271b73/semantic_conventions/trace/database.yaml#L65.Not sure what to do about it.. any suggestions?
'InterSystems Caché' é is the non-ascii char.
You can check it here.
'InterSystems Caché'
éis the non-ascii char.You can check it here.
@moonheart Yes, that's what I mentioned above. It is coming from the yaml file. None of the alternatives I thought seem to be good.
- Send a PR to remove the non-ascii on the source probably not viable
- Removing the xml comments for the auto-generated code files also not great
Thought of maybe defining some mapping and remove it during the code generation. For this case, substitute é with e, but also sounds like a hack. If tomorrow new ones are introduced, we need to change this again.
To me, the best option would be to disable the CI check for this package, but that is not up to me so waiting on others for directions.
This PR was marked stale due to lack of activity. It will be closed in 7 days.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
Hey. Are there any plans to merge this, so that the whole world stops copy-pasting these constants? 😃 @cijothomas Do you remember if there still were any open questions? I could address them real quick to get this going...
@joaopgrassi do you have time to resurrect this?
Hey @nahk-ivanov @cijothomas I have been working on and off, to try to keep it up with the conventions. I have some problems right now with the semconv tooling (jinja generation) + dotnet format. I will try to work on it in the upcoming days/weeks. I very much want this to get merged 😊
Thanks @joaopgrassi, the World will remember! 😉
I used your generated files and it works for the most part with two things I've changed:
- Remove
GROUP_TYPE_NAMEoutside of XMLDOC tags: here and here in the template - I'm not sure why prefixes and event names are
static, while attribute names areconst-- yeah, public compile-time constants should be avoided, but I think people mostly don't care at this point and in this case it would be beneficial to have everything asconst, as people will be able to use them declaratively (in attributes, etc.).
I'm also not sure why generated files are in source control, we should gitignore them, me thinks. Build should generate them before packaging.
In regards to OpenTelemetry itself - I don't really understand why common attributes are redefined for different types of telemetry. For example, both traces and metrics redefine common HTTP attributes (http.method, http.scheme, etc.) for respective semantic conventions. I think these should be orthogonal to the types of telemetry, but today they are defined explicitly within traces, for example. Once metrics are added, we will have same attributes in two places for no apparent reason.
Let me know if/how I can help you with this!
Hi @nahk-ivanov
Remove GROUP_TYPE_NAME outside of XMLDOC
Good catch, some left over when I was trying stuff out. Removed
I'm not sure why prefixes and event names are static, while attribute names are const
Yeah this is also some overlook on my part as well. I'd be more inclined in changing all to static readonly for the reasons you pointed. I'm not sure if there's even a use case for using the semconv attributes in c# attributes.
I'm also not sure why generated files are in source control, we should gitignore them, me thinks. Build should generate them before packaging.
I think the opposite. The process we are aiming as far as I remember is manual at first. Someone would bump the semconv version and run the script, check things and then commit. Code generation like this is always a bit tricky so I wouldn't do anything automatic like that. Maybe when things are more stable we can move to such approach. Having it in source control also makes it easy to know what changed between versions. Since this PR was open, I had to keep updating it with the new versions and several it broke because things like new lines, special characters were introduced that the template wasn't prepared for.
In regards to OpenTelemetry itself - I don't really understand why common attributes are redefined for different types of
This is a known issue. There was an intention to refactor it, pretty much to achieve what you described but it got stale. It's not a trivial change and involves many things I guess. (https://github.com/open-telemetry/opentelemetry-specification/pull/1977)
As for the status of the PR. I opened a PR on the build-tools, in order to be able to turn off the whitespace behavior in Jinja. Now spaces in the template actually means a space/newline is rendered and it's MUCH easier to reason about. Before I had to use this {%- %+ syntax in the template and I couldn't fix several cases without huge hacks. Like I would end up with newlines, empty spaces.. just a complete mess. I thought of leaving it and relying on dotnet format to fix it but that also added more issues.. with the latest update being it couldn't "merge" things and it just left the code broken.
Now with the changed template relying in https://github.com/open-telemetry/build-tools/pull/101, the .cs files are pretty much impeccable, apart from a small case with multi-line <remarks> but I think that's really minor and negligible.
This is awesome! Merge and publish to nuget.org like right now! 😍
In regards to
I think the opposite. The process we are aiming as far as I remember is manual at first. Someone would bump the semconv version and run the script, check things and then commit. Code generation like this is always a bit tricky so I wouldn't do anything automatic like that.
I don't know... In most of the cases we should be able to automate it anyway, so steps like clone the repo and commit just add to the process that should be one-click. If there is something that is utterly broken - build will fail and we will clone the repo at that point to fix it. If there is some new stuff added to the yaml that we don't support yet in here - then we clone the repo, update the template and rerun the build. It should get new build/revision/patch/(whatever they decide to call it today), which will have that additional stuff.
Regardless, waiting for this for over a year I'd be fine with anything that doesn't require me to copy-paste. :)
I know.. I hope we get this merged and published soon.
Now the deal is it depends on the code generator change and release: open-telemetry/build-tools#101. In the meantime, there were breaking changes in the way the tool works so it also needs a new version of the spec for things to work. In summary, there are attribute requirement levels, and the tool at the current version doesn't work with spec version 1.12. But I assume a new version of the spec will be released soon. If you want, you can help bringing more attention to the build-tools PR so it can get merged/released soon :)
Also, @cijothomas could you please re-open this? I updated with latest main and it should be fine again.
@cijothomas also highly interested in seeing this re-opened and ultimately released. We're documenting these for relevant languages and unfortunately we can't do that for .NET until this is released: https://github.com/open-telemetry/opentelemetry.io/issues/1539#issuecomment-1234281131
Looks like the only thing red is a sanity check that is detecting some extraneous spaces in XML doc comments. Is that required to be ale to push this through? I'd be happy to followup with a PR that just manually removes them for now, then figure out the source of it (i.e., is it in the generator tool, or some older version of the yaml specs this is generated from, etc.)
Hi @cartermp
In general the PR is almost ready. All the spaces, and formatting is really tricky to get it right. I was doing a best effort in the Jinja template and then let dotnet format fix the rest.. but I ran into several problems with dotnet format messing up the file so I aborted that approach.
The thing I mainly struggled was how the whitespace rendering works in Jinja. I since then submitted a PR to allow customizing it in the semconv generator so now the template on our side is "easier to reason about".
~I still need to find out where the extra spaces here are coming I assume they are mostly coming from the Faas resource semconv. There's also < > characters in the conventions brief that breakes the xml formatting in .NET. https://github.com/open-telemetry/opentelemetry-specification/blob/main/semantic_conventions/resource/faas.yaml#L24~
~Could send a PR to change that to use SUBSCIPTION_GUID instead of <SUBSCIPTION_GUID> and use `` to show it as <c></c>. The problem with this approach is that as soon as someone else adds a new conventions using the same thing, our generation is broken, so I don't think that's a good idea. I feel we need to do some magic in the jinja template and escape such chars.~
~I don't have many cycles to work on it right now, but I will see what I can investigate and fix in the upcoming weeks.~
OK so I just quickly tried something and I think I got it all sorted out. I found out there's a escape filter I could pass and that solves the issues with <> in the XML comments. I also formatted the paragraphs with another filter which seem to fixed the extraneous spaces in the comments. The build pass now!
The blocking think now is a new spec release. The last spec release (v1.12.0) doesn't work with the semconv generator because of the attribute requirement levels introduced since then. So we need v1.13.0 to be out. Hopefully it happens this week, then I can update the scripts here to use it and we should be good for a final review of the generated code.
Makes sense! I'll defer to maintainers here on how to best proceed. It's not strictly urgent (just a docs update), but if we get an indication that the time to update w.r.t. the latest spec will be longer for reasons that can't be controlled here, I'd be happy to help with anything needed to get it merged+released as-is (and then documented).
This PR was marked stale due to lack of activity and will be closed in 7 days. Commenting or Pushing will instruct the bot to automatically remove the label. This bot runs once per day.
Closed as inactive. Feel free to reopen if this PR is still being worked on.
@cijothomas I think this is ready for a final review. I think the important thing is if we like the way the constants are defined. Also, only "enums" that have values of string type are supported. I can iterate later and see if we can improve this later as a follow up.
I updated it with the latest spec release (1.13), ironed out a few warnings and re-generated the public api files. The build seems to fail in Windows for net6.0, not sure why. I can't find a way to re-try failed jobs.