RFC: OpenAPI Backend
OpenAPI Backend
I've been thinking about adding a backend for OpenAPI definitions to Pants, but before I get too carried away with coding it I wanted to "throw it out there" and gather some feedback and do some bike-shedding regarding target names and so on.
The main beneficiary of this would be people who take a design first approach when building their APIs, people doing code first usually don't have their generated documents in-repo anyway - but if they do they would probably find some use for this as well. The idea was born while I was tinkering with a Spectral plugin at work and realized that most of the code needed wasn't specific for Spectral but for OpenAPI files, and thus could be used to much easier integrate other tools from the rather big OpenAPI ecosystem.
So, without further ado, here are my thoughts and ideas how the OpenAPI backend would look. I've got the basics done already, so expect at least a draft MR in the near future.
Targets
The plan is to introduce two new targets; openapi_definition and openapi_source. openapi_definition will be used to specify the main OpenAPI document and openapi_source for any JSON/YAML file referenced in the openapi_definition (or another openapi_source) using $ref and the main OpenAPI document itself.
The reason why two targets are introduced is because most (if not all) tooling around OpenAPI expect one file as input (the main OpenAPI document) and will do reference resolving themselves and include other files as needed - so having a clear openapi_definition makes it easier for goals to work as expected out of the box instead of having to add skip_foobar_lint to all targets that aren't the main file.
❓ Is openapi_source a good name for "non-spec" files? openapi_library was my first option, since the file is basically a library of schemas you reference in your API definition, but openapi_source felt more in line with the rest of Pants.
❓ openapi_document instead of openapi_definition? Naming things is hard.
Dependency Inference
Dependency inference between openapi_definition and openapi_source targets should be achievable by simply resolving at all $ref keys that have relative file paths.
❓ $ref can also be external URLs or absolute file paths, but I'm not sure if we can/want to resolve these in a good way? It does lead to cases though were these files could change and Pants won't notice. I have a hunch that absolute file paths are quite rare though.
Goals
tailor
Support for tailor should be relatively straight forward by finding openapi.json or openapi.yaml files and add them as a openapi_definition. Once dependency inference is working, the same $ref resolver can also be used to add openapi_source automatically based on the openapi_definition's dependencies (and recursively add even more openapi_source based on dependencies' dependencies).
lint
lint will validate the OpenAPI definition and make sure it's valid, most likely using OpenAPI Spec validator. I also plan to add support for Spectral in a separate backend.
❓ The plan is to ship the basic validator as part of the main OpenAPI backend. Any arguments for having it as its own?
check
Not sure if it goes under check or lint, but currently at @snowfall-travel we use OpenAPI Diff outside of Pants to detect breaking changes in our OpenAPI definitions in CI. This would be a nice addition as well that I might tackle after everything else is done (as a separate backend).
fmt
OpenAPI documents are simple JSON and YAML files, so one way to do formatting would be to load the file with json or pyyaml (these will be required to parse the files for dependency inference anyway) and then dump the contents back using whatever "pretty print" way is available in the libraries.
❓ Will this actually be useful? It would for me!
❓ If it's useful, the same question as the one lint applies, separate backend or not?
package
Not something I've planned though since the idea hit me just now while writing this RFC, but maybe! The tooling in the OpenAPI ecosystem is not always great at handling specifications split over multiple files, which is why there are tools around to "dereference" an OpenAPI document (aka resolving all $ref and inlining them instead). So maybe package could do that at some point? Because while it's usually easier and "DRYer" to split the specification into multiple files, it's also a lot nicer to be able to share one file with people rather than a whole bunch of them.
export-codegen
While I don't plan on adding it personally (at least not in the foreseeable future), something like OpenAPI Generator could be used to generate client and server code based on the OpenAPI document and then hook it into other Pants stuff (similar to the protobuf codegen).
I very much welcome an OpenAPI backend. So much that I myself did a PoC on this topic although I was a bit more focused around the code generation (and in fact using the same tool you mentioned)
Regarding the target types, i kind of find your openapi_source target a bit redundant. Mostly saying this because it would basically be a glorified file (or resource) target. In the end, the OpenAPI spec doesn't really care what kind of schema your supporting files follow as long as the external $refs in the OpenAPI document point to a valid path inside the given file.
So I would consider changing the openapi_source target to just be a file or resource and then renaming the openapi_definition to openapi_sources for consistency with other targets and backends. Not sure if I'm oversimplifying it.
Your tailor implementation based on convention for the filename sounds reasonable to me.
Thanks for the feedback! That's fair. I did think about just using resource and file but decided against it for whatever reason - but since they're just JSON and YAML files we should get away with only having a openapi_source for the main file.
I don’t know what are other maintainers’ opinions on this regard, so don’t take mine as the final word.
My position is more on the side of preventing a proliferation of target types, but there could be a good reason (that I’m missing) to have those two here.
El mié, 20 jul, 2022 a las 18:01, Jonas Stendahl @.***> escribió:
Thanks for the feedback! That's fair. I did think about just using resource and file but decided against it for whatever reason - but since they're just JSON and YAML files we should get away with only having a openapi_source for the main file.
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>
The tricky part with using file and resource is dependency inference. The first level is fine, aka. when an openapi_source depends on a file or resource, but as soon as that file or resource depends on another file or resource we're suddenly dealing with OpenAPI specific dependency inference on a generic target type. Not sure how to handle that.
Not fully sure what is meant by OpenAPI specific in the context of dependencies... wouldn't obtaining the TransitiveTargets be enough to traverse the dependency graph?
It could very well be me not fully understanding dependency inference just yet, but here's how I see it:
Let's say we have three files: a.json, b.json and c.json with the dependency chain a -> b -> c (through $refs).
-
a.jsonis ouropenapi_sourceand we can easily find its dependency tob.jsonthrough its$refwith the dependency inference logic foropenapi_source. -
b.jsonandc.jsonare bothfile, so in order forb -> cto be inferred we need to either:- Add dependency inference for
fileandresourcefor JSON and YAML files (using the same$refresolve asopenapi_source).- This is what I called "OpenAPI specific" (though it would also work for JSONSchema etc.) since that dependency inference won't work on all
fileandresource.
- This is what I called "OpenAPI specific" (though it would also work for JSONSchema etc.) since that dependency inference won't work on all
- Have the dependency inference for
openapi_sourcego deeper soa.jsonends up with direct dependencies on bothb.jsonandc.json.- This will make the dependency graph wrong, causing
./pants dependenciesand./pants dependeesto be unreliable (and, more importantly,--changed-dependees=transitive).
- This will make the dependency graph wrong, causing
- Some method I'm unaware of that allows Pants to build the entire chain from
openapi_sourcealone.
- Add dependency inference for
I think I get it now, using files or resources the end user will always have to explicitly provide the dependencies for the b -> c part of the graph.
I was tempted to mention that wouldn't be such a common case but using those targets either forces the users to glob all potential OpenAPI sources into the same files target or, as said before, to be explicit about them. And none of them sound nice.
So yeah, it seems that to have an implementation of dependency inference that feels complete, it seems that two targets are what is needed. Sorry for making you waste time exploring the possibility of using more generic target types.
This sounds like it would support connexion projects nicely. https://github.com/spec-first/connexion
Lint with openapi spec validator to begin with sounds good.
And if pants could ease the pain of getting OpenAPI generator setup, then I might be more likely to use it (I avoid java as a rule).
I'm not a fan of openapi_definition because "definitions" is a part of an OpenAPI/swagger 2 spec. https://github.com/OAI/OpenAPI-Specification/blob/main/versions/2.0.md#definitionsObject
I think I would use openapi_document or openapi_doc for the primary file. This plays off the how that is defined in the spec:
https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.1.0.md#openapi-document
And the 2.0 spec seems amenable to the term "document" as well.
https://github.com/OAI/OpenAPI-Specification/blob/main/versions/2.0.md#swagger-object
Another possibility would be openapi_spec since they are often referred to as specs.
For the supporting files, openapi_source makes sense.
Sorry for making you waste time exploring the possibility of using more generic target types.
No worries! I was a bit conflicted about it already, so it was worth giving it another thought :)
I'm not a fan of openapi_definition because "definitions" is a part of an OpenAPI/swagger 2 spec.
Thanks for the feedback! Yeah, I agree that openapi_document is probably better than openapi_definition. I'll update the PR.
Thanks a lot @jyggen!