opentelemetry-collector icon indicating copy to clipboard operation
opentelemetry-collector copied to clipboard

Improve otel collector configuration w/ JSON schema

Open LBF38 opened this issue 4 months ago • 31 comments

Is your feature request related to a problem? Please describe.

For better defining the otel-collector-config.yml, a JSON schema would be great to validate the configuration before using it in the collector itself.

Describe the solution you'd like

A JSON Schema for validating the configuration when writing it, similarly to Docker compose files or Kubernetes manifest files, ...

Describe alternatives you've considered

Reading the docs multiple times, and checking config by running it on local cluster w/ Docker compose or Kubernetes.

LBF38 avatar Mar 15 '24 11:03 LBF38

@TylerHelmuth If I understand json schema correctly you could write a schema file and use it to validate the config externally (not inside the collector) correct?

TylerHelmuth avatar Mar 15 '24 18:03 TylerHelmuth

Yes, the configuration file in .yml will be statically checked against the JSON Schema, therefore providing an easier way to configure all the possible aspects of the otel collector.

You can try this out with a compose.yml file for defining Docker compose stacks, for example. And you will see that you get automatic intellisense on defining the important parts of the configuration.

And, in your schema, you can define if a property is required or not and his types.

LBF38 avatar Mar 16 '24 10:03 LBF38

I recommend you write and maintain your own file specific to your collector build.

Since the collector is composable from components from anywhere, and those components can use any configuration they want, it isn't possible for us to provide a single, wholistic, schema file for any collector build.

We could maybe provide one with each distro we provide, but even that would be hard to maintain.

TylerHelmuth avatar Mar 16 '24 13:03 TylerHelmuth

related, but closed issue https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/27003

yurishkuro avatar Mar 16 '24 19:03 yurishkuro

@TylerHelmuth I disagree with your DIY position. The advantage of OTEL collector for custom distros is in providing a reusable baseline. From my perspective the discoverability of configuration of different components is very bad today, 2 out of 5. A DIY approach cannot solve for the lack of schema / documentation of existing components, it needs to be solved centrally in the main collector repo. JSON schema is one option, I personally would prefer protobuf schema (either would need a prototype to iron out kinks), as I mentioned in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/27003#issuecomment-1778587538

yurishkuro avatar Mar 16 '24 19:03 yurishkuro

@yurishkuro I agree that we could improve the consistency and quality of component documentation - being able to auto-generate standard docs from component config would be awesome.

What I'm arguing is that the Collector Core repo cannot supply a single schema file that could be used to statically validate any collector build.

Each distribution at https://github.com/open-telemetry/opentelemetry-collector-releases could maybe supply a schema that knows about each component in its manifest. Maybe the collector could add an extra command to generate a schema file based on the supplied config.

TylerHelmuth avatar Mar 16 '24 21:03 TylerHelmuth

What I'm arguing is that the Collector Core repo cannot supply a single schema file that could be used to statically validate any collector build.

agreed, but if there was a standard mechanism defined & used in Core it would be much more likely that components and contrib plugins would start using it to declare their configuration, improving the overall ecosystem.

yurishkuro avatar Mar 16 '24 21:03 yurishkuro

With schema support issue https://github.com/open-telemetry/opentelemetry-collector/issues/9707 would be based upon

cforce avatar Mar 17 '24 10:03 cforce

I think this would improve the experience quite a bit.

As further info, if the JSON Schema were uploaded to SchemaStore.org, various tools and editors would get automatic support in their various plugins. https://www.schemastore.org/json/#editors

As an example, this VSCode YAML extension would be able to consume it: https://marketplace.visualstudio.com/items?itemName=redhat.vscode-yaml

While I understand the concerns for extensibility, a JSON schema can allow arbitrary fields in addition to the official fields. So there is nothing lost if a different tool wants to add custom config.

cmgriffing avatar Mar 17 '24 20:03 cmgriffing

Hi everyone,

After exploring more related issues and work on this topic, I will try to summarize my findings in this comment.

  • https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/27003

    • Looks like this is a first work on automating the generation of YAML schema for the collector config file.
    • Could be a solution to this issue if merged ?
    • A related proposal defines the steps and configuration schema to generate here: https://docs.google.com/document/d/15SMsJAdE1BHAmwf8cLXBvsHDrezXOHniaRZafjO-5e4/edit
  • https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/24189

    • This is also a related issue that has been created based on/related to the latter PR.
    • would this be a suitable solution for adding jsonschema/YAML to the Collector config file ?
  • https://github.com/open-telemetry/opentelemetry-configuration

    • It seems like there is a repo for opentelemetry configuration, but, as I'm unfamiliar with all those components, I don't know if it is related to the collector config or other components.
    • I include it for reference.
  • https://github.com/open-telemetry/community/issues/1610

    • this issue also seems to be on the same topic.
    • I include it for reference too.
  • https://github.com/splunk/collector-config-tools/tree/main/cfg-metadata

    • it seems like splunk has already made some work on this and generate some metadata files for community exporters, extensions, processors and receivers.
    • that's one of their motivation to upstream this work for making it available for the whole community.

Moreover, here are some Go libs for generating jsonschema files based on Go types:

  • https://github.com/omissis/go-jsonschema
  • https://pkg.go.dev/github.com/invopop/jsonschema
  • among others

Thus, what are the next steps to implement this ? I'm willing to contribute if necessary.

LBF38 avatar Apr 05 '24 08:04 LBF38

Hi, A temporary alternative would be to use https://github.com/dash0hq/otelbin

This tool allows to validate the otel collector config against different schemas. And it also provides an easy way to visualise the collector flow.

See the live tool here also : https://www.otelbin.io/

LBF38 avatar Apr 24 '24 07:04 LBF38

this could be useful to the operator as well, perhaps for the auto-upgrade mechanism

cc @yuriolisa

jpkrohling avatar May 08 '24 11:05 jpkrohling

@jpkrohling @TylerHelmuth I am noticing that OTEL is often not participating in LFX mentorship programs, wouldn't this be a good project for an intern?

NB: the deadline for summer term is today 5pm PT.

yurishkuro avatar May 08 '24 18:05 yurishkuro

Hello, I'd also be happy to work on this in the next few weeks/months and I'd be happy to collaborate with anyone else who is interested. I could make it one of my priorities for the next 3-4 months, as long as we have some certainty on what an acceptable solution would look like.

I'm especially interested in autogenerating documentation. The Collector docs are sometimes missing docs for certain config arguments, and sometimes the arguments are not documented in a consistent way.

This automation would also make it easier for maintainers of Collector distributions to keep their code and docs accurate and up to date. For example, when updating the OTel dependencies of Grafana Alloy we have a few manual steps:

I'm not sure what would be good next steps though? Should we discuss this in more detail on the CNCF slack? Or should we come up with a proposal and update this issue? Would @LBF38 be interested in collaborating over the next few weeks/months?

ptodev avatar May 09 '24 13:05 ptodev

@ptodev a proposal would be great. There is a fair amount of design exploration in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/27003, it's worth learning from that and making sure the proposal covers the requirements raised.

yurishkuro avatar May 09 '24 22:05 yurishkuro

Hi @ptodev , I'm actively working around OpenTelemetry during the next few months so I'll be happy to help where I can, depending on the scope and depth of what's needed.

Let me know where I can best contribute. I would love to !

LBF38 avatar May 10 '24 05:05 LBF38

@yurishkuro, absolutely! We had only one recent participation so far (it's still happening!), but we are often lacking mentors. Would you like to be the mentor for this feature? Although we are late for this round, I can see if we can get in there if you can be a mentor.

jpkrohling avatar May 13 '24 11:05 jpkrohling

@jpkrohling I am mentoring 3 projects in Jaeger in each of LFX terms, so I don't have capacity to take more. Since this specific issue will be important to Jaeger-v2 I can consider it for the Fall term, if not solved by then.

TBH, I find it surprising that the project the size of OTEL is lacking mentorship engagements. I rather suspect this is due to lack of focus on this area.

yurishkuro avatar May 13 '24 17:05 yurishkuro

@yurishkuro I'm not sure how we could generate sufficient docs using the protobuf approach. For example, where would the description string of each config attribute go inside the protobuf?

ptodev avatar May 16 '24 15:05 ptodev

Maybe the descriptions of the arguments could go into comments inside the proto file, and then they can be parsed using something like protoc-gen-doc? It seems like protoc-gen-doc can also output json, which could help folks who are bundling their own OTel distributions and want to have more control over how the documentation is displayed.

ptodev avatar May 16 '24 15:05 ptodev

I personally like the protobuf approach:

  • It's easier to read than json.
  • I suppose it'll work well with config arguments which are shared across components (e.g. HTTP server settings). Those arguments can be defined in a single proto file and imported in many places.

However, I don't know how we would indicate what the default value of a config argument is? Where would that go inside the proto file, and how would it be translated to Go code?

ptodev avatar May 16 '24 16:05 ptodev

@ptodev I have some pointers in https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/27003#issuecomment-1778587538

TLDR: protobuf allows adding arbitrary annotations to IDL declarations (e.g. Collector is already using gogoprotobuf annotations to influence code generation). I don't know if having default values for fields is actually a requirement, but as long as there is a generic mechanism to add proto annotations that will inject struct tags (e.g. https://github.com/favadi/protoc-go-inject-tag) the solution can be built using those tags (lots of validation solutions out there already, maybe they do support default values too).

yurishkuro avatar May 16 '24 16:05 yurishkuro

I don't know if having default values for fields is actually a requirement

I believe it is, because the docs need to include it.

as long as there is a generic mechanism to add proto annotations that will inject struct tags the solution can be built using those tags

The problem is that mapstructure doesn't have a way to set a default value through a tag.

lots of validation solutions out there already, maybe they do support default values too

I hope so! I'll have a look to see if I can find any.

ptodev avatar May 16 '24 17:05 ptodev