pants icon indicating copy to clipboard operation
pants copied to clipboard

RFC: OpenAPI Backend

Open jyggen opened this issue 3 years ago • 9 comments

OpenAPI Backend

I've been thinking about adding a backend for OpenAPI definitions to Pants, but before I get too carried away with coding it I wanted to "throw it out there" and gather some feedback and do some bike-shedding regarding target names and so on.

The main beneficiary of this would be people who take a design first approach when building their APIs, people doing code first usually don't have their generated documents in-repo anyway - but if they do they would probably find some use for this as well. The idea was born while I was tinkering with a Spectral plugin at work and realized that most of the code needed wasn't specific for Spectral but for OpenAPI files, and thus could be used to much easier integrate other tools from the rather big OpenAPI ecosystem.

So, without further ado, here are my thoughts and ideas how the OpenAPI backend would look. I've got the basics done already, so expect at least a draft MR in the near future.

Targets

The plan is to introduce two new targets; openapi_definition and openapi_source. openapi_definition will be used to specify the main OpenAPI document and openapi_source for any JSON/YAML file referenced in the openapi_definition (or another openapi_source) using $ref and the main OpenAPI document itself.

The reason why two targets are introduced is because most (if not all) tooling around OpenAPI expect one file as input (the main OpenAPI document) and will do reference resolving themselves and include other files as needed - so having a clear openapi_definition makes it easier for goals to work as expected out of the box instead of having to add skip_foobar_lint to all targets that aren't the main file.

❓ Is openapi_source a good name for "non-spec" files? openapi_library was my first option, since the file is basically a library of schemas you reference in your API definition, but openapi_source felt more in line with the rest of Pants. ❓ openapi_document instead of openapi_definition? Naming things is hard.

Dependency Inference

Dependency inference between openapi_definition and openapi_source targets should be achievable by simply resolving at all $ref keys that have relative file paths.

$ref can also be external URLs or absolute file paths, but I'm not sure if we can/want to resolve these in a good way? It does lead to cases though were these files could change and Pants won't notice. I have a hunch that absolute file paths are quite rare though.

Goals

tailor

Support for tailor should be relatively straight forward by finding openapi.json or openapi.yaml files and add them as a openapi_definition. Once dependency inference is working, the same $ref resolver can also be used to add openapi_source automatically based on the openapi_definition's dependencies (and recursively add even more openapi_source based on dependencies' dependencies).

lint

lint will validate the OpenAPI definition and make sure it's valid, most likely using OpenAPI Spec validator. I also plan to add support for Spectral in a separate backend.

❓ The plan is to ship the basic validator as part of the main OpenAPI backend. Any arguments for having it as its own?

check

Not sure if it goes under check or lint, but currently at @snowfall-travel we use OpenAPI Diff outside of Pants to detect breaking changes in our OpenAPI definitions in CI. This would be a nice addition as well that I might tackle after everything else is done (as a separate backend).

fmt

OpenAPI documents are simple JSON and YAML files, so one way to do formatting would be to load the file with json or pyyaml (these will be required to parse the files for dependency inference anyway) and then dump the contents back using whatever "pretty print" way is available in the libraries.

❓ Will this actually be useful? It would for me! ❓ If it's useful, the same question as the one lint applies, separate backend or not?

package

Not something I've planned though since the idea hit me just now while writing this RFC, but maybe! The tooling in the OpenAPI ecosystem is not always great at handling specifications split over multiple files, which is why there are tools around to "dereference" an OpenAPI document (aka resolving all $ref and inlining them instead). So maybe package could do that at some point? Because while it's usually easier and "DRYer" to split the specification into multiple files, it's also a lot nicer to be able to share one file with people rather than a whole bunch of them.

export-codegen

While I don't plan on adding it personally (at least not in the foreseeable future), something like OpenAPI Generator could be used to generate client and server code based on the OpenAPI document and then hook it into other Pants stuff (similar to the protobuf codegen).

jyggen avatar Jul 03 '22 21:07 jyggen

I very much welcome an OpenAPI backend. So much that I myself did a PoC on this topic although I was a bit more focused around the code generation (and in fact using the same tool you mentioned)

Regarding the target types, i kind of find your openapi_source target a bit redundant. Mostly saying this because it would basically be a glorified file (or resource) target. In the end, the OpenAPI spec doesn't really care what kind of schema your supporting files follow as long as the external $refs in the OpenAPI document point to a valid path inside the given file.

So I would consider changing the openapi_source target to just be a file or resource and then renaming the openapi_definition to openapi_sources for consistency with other targets and backends. Not sure if I'm oversimplifying it.

Your tailor implementation based on convention for the filename sounds reasonable to me.

alonsodomin avatar Jul 20 '22 15:07 alonsodomin

Thanks for the feedback! That's fair. I did think about just using resource and file but decided against it for whatever reason - but since they're just JSON and YAML files we should get away with only having a openapi_source for the main file.

jyggen avatar Jul 20 '22 16:07 jyggen

I don’t know what are other maintainers’ opinions on this regard, so don’t take mine as the final word.

My position is more on the side of preventing a proliferation of target types, but there could be a good reason (that I’m missing) to have those two here.

El mié, 20 jul, 2022 a las 18:01, Jonas Stendahl @.***> escribió:

Thanks for the feedback! That's fair. I did think about just using resource and file but decided against it for whatever reason - but since they're just JSON and YAML files we should get away with only having a openapi_source for the main file.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: @.***>

alonsodomin avatar Jul 20 '22 17:07 alonsodomin

The tricky part with using file and resource is dependency inference. The first level is fine, aka. when an openapi_source depends on a file or resource, but as soon as that file or resource depends on another file or resource we're suddenly dealing with OpenAPI specific dependency inference on a generic target type. Not sure how to handle that.

jyggen avatar Jul 26 '22 21:07 jyggen

Not fully sure what is meant by OpenAPI specific in the context of dependencies... wouldn't obtaining the TransitiveTargets be enough to traverse the dependency graph?

alonsodomin avatar Jul 27 '22 08:07 alonsodomin

It could very well be me not fully understanding dependency inference just yet, but here's how I see it:

Let's say we have three files: a.json, b.json and c.json with the dependency chain a -> b -> c (through $refs).

  • a.json is our openapi_source and we can easily find its dependency to b.json through its $ref with the dependency inference logic for openapi_source.
  • b.json and c.json are both file, so in order for b -> c to be inferred we need to either:
    • Add dependency inference for file and resource for JSON and YAML files (using the same $ref resolve as openapi_source).
      • This is what I called "OpenAPI specific" (though it would also work for JSONSchema etc.) since that dependency inference won't work on all file and resource.
    • Have the dependency inference for openapi_source go deeper so a.json ends up with direct dependencies on both b.json and c.json.
      • This will make the dependency graph wrong, causing ./pants dependencies and ./pants dependees to be unreliable (and, more importantly, --changed-dependees=transitive).
    • Some method I'm unaware of that allows Pants to build the entire chain from openapi_source alone.

jyggen avatar Jul 27 '22 12:07 jyggen

I think I get it now, using files or resources the end user will always have to explicitly provide the dependencies for the b -> c part of the graph.

I was tempted to mention that wouldn't be such a common case but using those targets either forces the users to glob all potential OpenAPI sources into the same files target or, as said before, to be explicit about them. And none of them sound nice.

So yeah, it seems that to have an implementation of dependency inference that feels complete, it seems that two targets are what is needed. Sorry for making you waste time exploring the possibility of using more generic target types.

alonsodomin avatar Jul 27 '22 14:07 alonsodomin

This sounds like it would support connexion projects nicely. https://github.com/spec-first/connexion

Lint with openapi spec validator to begin with sounds good.

And if pants could ease the pain of getting OpenAPI generator setup, then I might be more likely to use it (I avoid java as a rule).

I'm not a fan of openapi_definition because "definitions" is a part of an OpenAPI/swagger 2 spec. https://github.com/OAI/OpenAPI-Specification/blob/main/versions/2.0.md#definitionsObject

I think I would use openapi_document or openapi_doc for the primary file. This plays off the how that is defined in the spec: https://github.com/OAI/OpenAPI-Specification/blob/main/versions/3.1.0.md#openapi-document And the 2.0 spec seems amenable to the term "document" as well. https://github.com/OAI/OpenAPI-Specification/blob/main/versions/2.0.md#swagger-object

Another possibility would be openapi_spec since they are often referred to as specs.

For the supporting files, openapi_source makes sense.

cognifloyd avatar Jul 27 '22 18:07 cognifloyd

Sorry for making you waste time exploring the possibility of using more generic target types.

No worries! I was a bit conflicted about it already, so it was worth giving it another thought :)

I'm not a fan of openapi_definition because "definitions" is a part of an OpenAPI/swagger 2 spec.

Thanks for the feedback! Yeah, I agree that openapi_document is probably better than openapi_definition. I'll update the PR.

jyggen avatar Jul 27 '22 21:07 jyggen

Thanks a lot @jyggen!

stuhood avatar Oct 12 '22 19:10 stuhood