tiled
tiled copied to clipboard
Add support for patching, merging, replacing metadata
- patch_metadata uses a http patch request with application/json-patch+json content type
- merge_metadata can be triggered with a http patch request and application/merge-patch+json
- update_metadata uses a similar approach to merge_metadata, but constructs a json-patch on the client-side
- add jsonpatch and json-merge-patch to requirements
@danielballan, @Kezzsim this PR is ready for another review. Changes since your last review are minimal, though it took me an obscene amount of time to figure this out.
I pushed a couple of commits addressing suggestions from @danielballan and @genematx.
Also added an option in update_metadata
for retrieving a JSON patch without applying it. May be useful for setting up a batch update via patch_metadata
(fast).
I’ve asked @Kezzsim to test drive this locally a bit before we merge. Merge away when you are satisfied, Kari!
Rebased and force-pushed to resolve merge conflicts with #695
Tests failed due to an incompatibility with pydantic 2.x, introduced by #695. I have pushed a commit with a fix.
This looks good, however I'm noticing a break from convention when it comes to how HTTP is implemented. Pulling up swagger docs on http://127.0.0.1:8000/docs reveals that application/json
is the only mimetype in the fastAPI dropdown.
Technically this is correct, as the body in @hyperrealist 's spec is a json object which contains valid json patches, but the overall request body itself is not a json patch.
Example of current request body [dict]:
{
"patch": [
{
"op": "replace",
"path": "/test",
"value": "testing"
}
],
"specs": null
}
Proposed json patch implementation of body [list]: (conforming to RFC6902)
[
{
"op": "replace",
"path": "metadata/test",
"value": "testing"
}
]
Fixing this to have it be within spec would require one of two potential changes:
- Split the endpoint into
PATCH /api/v1/metadata/{path}
andPATCH /api/v1/specs/{path}
which both take valid json-patch objects that alter their respective properties at the root path. - (As seen in the example above) change the body to be a valid json-patch but make the RFC6902
path
roots bemetadata
andspecs
respectively.
@danielballan mentioned pulling in @dylanmcreynolds since he has a good eye for making similar API design decisions that are both conforming to spec and aesthetically pleasing. I'd like to see what his suggestion is.
Thanks @Kezzsim, this is something that had been bothering me in the background at times. I appreciate your attention to detail!
Split the endpoint into
PATCH /api/v1/metadata/{path}
andPATCH /api/v1/specs/{path}
which both take valid json-patch objects that alter their respective properties at the root path.(As seen in the example above) change the body to be a valid json-patch but make the RFC6902
path
roots bemetadata
andspecs
respectively.
- If it is typical to have to update both
metadata
andspecs
together this would necessitate extra http requests. If that is not the case I think this is a very much viable solution. - This is a creative way to conform to the http spec but I don't like how arguments are mixed together, especially since how they are mixed together would change depending on the mimetype (we support both
application/json-patch+json
andapplication/merge-patch+json
. See here). - A third solution might be to let the payload be of
application/json
mimetype and split the endpoint into something likePATCH /api/v1/patch_metadata/{path}
andPATCH /api/v1/merge_metadata/{path}
, each requiring a valid json with an embedded json patch or a merge patch respectively. But I think this is achieving little at the expense of polluting the API and adding to the complexity. - A fourth solution is to enforce a sub-mimetype at the content level:
{
"type": "application/json-patch+json", # could default to this. may also support aliases "json-patch"
"metadata": [
{
"op": "replace",
"path": "/test",
"value": "testing"
}
],
"specs": [],
}
and
{
"type": "application/merge-patch+json", # or "merge-patch" / "merge"
"metadata": {"test": "testing"},
"specs": [],
}
All these solutions would conform to http spec, but I am leaning toward solutions 1 or 4 for their conceptual simplicity and ease of implementation. Anyway, I am completely open to other ideas and design considerations.
Yes, good catch @Kezzsim! And thanks for presenting some additional options @hyperrealist.
Some additional design considerations:
Option (1) would break symmetry between GET and PATCH, as GET /metadata/{path}
currently returns all "metadata" about a node, broadly defined, including structure family, structure, specs, data sources, and metadata. Making separate routes for PATCH seems asymmetrical in this sense, in addition to introducing overhead from separate requests for some use cases.
I can foresee adding support for patching data sources, alongside metadata and specs, which would make the downsides with (2) and (3) even more salient.
I can see no problems with (4). There was something appealing about specifying the patch format in the content-type
header, but there is also something clean about making the request plain JSON (everybody knows JSON!) with the patch format specified in the body. It seems to keep our options open.
If we go that way, it is worth considering options for what to name the key. Is it content-type
, format
, minetype
, type
?
I very much intentionally wanted to avoid calling it content-type
. I wouldn't envy whoever is documenting content-type
the http header and content-type
the subtype for this endpoint. mimetype
sounds too restrictive to me as I would hesitate to support aliases under that name. Between format
and type
I like type
because it hints at this being a mimetype without being overly restrictive. If adding a bit to the boilerplate is not an issue I would go with patch-type
, metadata-patch
, and specs-patch
.
Seeing :+1:s from @Kezzsim and myself, I think going ahead with (4) with name type
makes sense.
I very much intentionally wanted to avoid calling it content-type. I wouldn't envy whoever is documenting content-type the http header and content-type the subtype for this endpoint. mimetype sounds too restrictive to me as I would hesitate to support aliases under that name.
I'm not sure I understand the hesitance about calling it content-type
. There's precedence here. For example, it normally appears in both the header and the body of multi-part form posts: https://swagger.io/docs/specification/describing-request-body/multipart-requests/
CI errors for Python 3.11 are coming from dask's incompatibility with 3.11.9 (fixed in 2024.4.1). Github CI uses 3.11.9, so it is broken at the moment.
I pinned python<3.11.9 for now [on second thought, let me actually remove the pin so that you can see the errors], but I am not sure if I should attempt to resolve this issue in this PR. We should probably update to dask 2024.4.1.
With that note, I think this ready for another review, @danielballan @Kezzsim @genematx. I did not implement patch for specs, but just for metadata. If you think patching specs makes sense I can add that.
FYI, @hyperrealist, you may have seen this go by already but in case not, the CI failures on Python 3.11 are not the fault of this PR and can be ignored. See #715.
Rebased to pull in latest CHANGELOG
and place this change entry on "Next" release.
Revisiting @hyperrealist's comment proposing this API:
{
"type": "application/json-patch+json", # could default to this. may also support aliases "json-patch"
"metadata": [
{
"op": "replace",
"path": "/test",
"value": "testing"
}
],
"specs": [],
}
and
{
"type": "application/merge-patch+json", # or "merge-patch" / "merge"
"metadata": {"test": "testing"},
"specs": [],
}
I now realize that this was a little ambiguous about what's happening with specs
. As currently implemented in this PR, specs
is handled the same as in PUT
. The content of specs
in the request replaces the current specs
.
class PatchMetadataRequest(HyphenizedBaseModel):
content_type: str
# These fields are optional because None means "no changes; do not update".
patch: Optional[Union[List[JSONPatchSpec], Dict]] # Dict for merge-patch
specs: Optional[Specs]
I think it would be more useful (and more consistent) to make specs
also a patch.
class PatchMetadataRequest(HyphenizedBaseModel):
content_type: str
# These fields are optional because None means "no changes; do not update".
metadata: Optional[Union[List[JSONPatchSpec], Dict]] # Dict for merge-patch
specs: Optional[Union[List[JSONPatchSpec], Dict]] # Dict for merge-patch
Notes:
- The user has to pick one patch
content-type
for the whole request; they cannot be mixed. This seems like a good simplifying constraint. - For specs, the performance advantage of "patch" vs "replace" is low. The list of
specs
is not expected to become long. - But in a future PR I foresee adding
data_sources
here, and at that point I think we'll be glad for a consistent way to amend themetadata
,specs
, anddata_sources
, each with their own patch.
Existing unit tests cover update_metadata
, but I think it's worthwhile to add a couple tests covering the new methods.
I played with this locally. It works as advertised. I really like the ease of use of update_metadata
...no more manually making deep copies in user code!
I took the liberty of rebasing on main
and then renaming md_patch
to metadata_patch
. (I appreciate the case for brevity, but we don't abbreviate it anywhere else in the codebase, I suspect it will be more often read that typed, so I am privileging clarity.)
Will merge when CI passes.