datahub icon indicating copy to clipboard operation
datahub copied to clipboard

feat(API): support for additional methods and reading of JSON schema from swagger

Open vojtechneradatos opened this issue 2 years ago • 6 comments

Open API extended support:

  • added support for PUT, POST and PATCH methods
  • new property get_operations_only - disables reading of non-GET methods
  • reading of JSON schema from swagger.

YAML recipe and documentation of Open API are changed accordingly.

Also we changed user in creation of audit stamp from urn:li:corpuser:etl to urn:li:corpuser:datahub, since user etl does not exist by default.

Checklist

  • [ ] The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • [ ] Links to related issues (if applicable)
  • [ ] Tests for the changes have been added/updated (if applicable)
  • [ ] Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • [ ] For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

vojtechneradatos avatar Feb 08 '23 14:02 vojtechneradatos

Will work on this after https://github.com/datahub-project/datahub/pull/7361/ gets in.

shirshanka avatar Feb 18 '23 00:02 shirshanka

@vojtechneradatos hey there! Sorry about the ultra long delay in us taking a look here.

I'm in principle on board with the idea of ingesting endpoints other than GET as part of the OpenAPI ingestion source, and the naming conventions you laid out make sense.

My biggest worry with this PR as it stands is all of the custom json-schema parsing logic. Ideally the OpenAPI source should reuse the json-schema parser and json schema inference capabilities that we already have built out in the rest of the codebase - that's why we had referenced https://github.com/datahub-project/datahub/pull/7361 in an earlier comment. There's also a bit of code duplication across the processing of the different method types, which would need to get cleaned up and tested before this is mergable.

hsheth2 avatar Apr 03 '24 20:04 hsheth2

@hsheth2 Hi, I understand that consolidation of our contribution provided by my colleague @vojtechneradatos last year with #7361 would be useful and desirable. However, it's worth noting that our contribution actually preceded #7361 and we didn't get any specific requirements for modifications during the timeline of our project (unlike the already accepted #7229 which we updated as requested at that time). We provided these contributions after agreement with our customer who funded this development due to their needs. We don't have any means for funding further updates of the provided pull request at the moment. I'll try contacting our customer in order to check whether they'd be interested in funding the additional efforts necessary for getting their changes consolidated with the mainstream sources. However, even if they agree funding this effort, I'd assume that they'd want to get some guarantees that their funding indeed allows getting this and also #7288 reviewed and eventually incorporated into mainstream in reasonable timeframe rather than having to wait for months or years (comments like "We will be taking a pass shortly" in #7288 without any further action for more than 13 months don't contribute to trust in that). Please, let me know your thoughts.

tomas-hajny avatar Apr 04 '24 11:04 tomas-hajny

@tomas-hajny if you're willing to pick this and the backstage PR up, I can commit to reviewing them in a timely manner. I understand what we've burned some trust here, and I'd like to work hard to rebuild that :)

In terms of sequencing, I think it makes sense to revamp this PR first. Hopefully the backstage PR can reuse the openapi spec parsing logic from this PR and the json-schema parsing capabilities of https://github.com/datahub-project/datahub/pull/7361.

Let me know if you're up for it! Also, happy to hop on a call to discuss these two PRs and what needs to be done to get them merged.

hsheth2 avatar Apr 05 '24 21:04 hsheth2

@hsheth2 As discussed, I opened it with our customer and clarified their potential benefits related to this consolidation and inclusion of their local changes to the mainstream. I cannot predict the outcomes of their internal discussion and the final decision, but I'll certainly let you know if I have any news in this area.

BTW, yes, the backstage PR would need to follow this one rather than vice versa, that's for sure.

tomas-hajny avatar Apr 10 '24 07:04 tomas-hajny

@Coderabbitai full review

shirshanka avatar Jun 28 '24 20:06 shirshanka