specification icon indicating copy to clipboard operation
specification copied to clipboard

Introduce constraint on PATCH to only patch target resource

Open kjetilk opened this issue 4 years ago • 12 comments

Even if SPARQL isn't specified in 0.9, we should add some of the constraints that we agreed on in that context. The constraint that PATCH should only modify the target resource is a sound constraint that I think should go in. This text is mostly derived from earlier PRs, but generalized to any media type (which I guess is a good idea anyway).

kjetilk avatar Nov 12 '21 23:11 kjetilk

I'd like to have such things in because it is a dangerous thing to do, if people were writing new patch formats, they shouldn't do it.

There is a specific case of WITH in SPARQL, which could reasonable have an interpretation that the graph is a resource. This is to prevent such an interpretation.

kjetilk avatar Nov 23 '21 01:11 kjetilk

if people were writing new patch formats, they shouldn't do it.

Does that mean we can never have atomic updates to multiple resources at any point in the future? Or rather that we cannot have them via PATCH? (It seems to be the latter.)

I could get behind that principle.


However, then there is another obstacle:

MUST NOT allow a request with a PATCH method to change resources other than the target resource

What does change mean? To give a trivial example, the mtime listed in the container might change.

Or a patch to one resource could trigger something in another resource. (If I switch my main controller off, then all of my lights go off too.)

So language-wise, perhaps something to the extent of: servers MUST NOT allow clients to explicitly request multi-resource changes via PATCH, but the server may let effects propagate.

RubenVerborgh avatar Nov 23 '21 12:11 RubenVerborgh

if people were writing new patch formats, they shouldn't do it.

Does that mean we can never have atomic updates to multiple resources at any point in the future? Or rather that we cannot have them via PATCH? (It seems to be the latter.)

I could get behind that principle.

Yeah, definitely! Or, we could remove the constraint if we are sufficiently confident it would not cause harm. What I'm trying to do is to harden the servers so that a malicious actor can't exploit a client to confuse a user to modify a resource they shouldn't. If we can be confident through other means that can't happen, then OK. But until then, I think such measures are very important.

It can be done throught SPARQL Update on a SPARQL Endpoint, for example, that would lie outside of the current protocol, but it should be doable.

However, then there is another obstacle:

MUST NOT allow a request with a PATCH method to change resources other than the target resource

What does change mean?

Oh, yes, that's a real issue, because we do rely on side-effects for much of the functionality, and...

To give a trivial example, the mtime listed in the container might change.

Yup.

Or a patch to one resource could trigger something in another resource. (If I switch my main controller off, then all of my lights go off too.)

Yeah, and there could be cascading writes, like in #227...

So language-wise, perhaps something to the extent of: servers MUST NOT allow clients to explicitly request multi-resource changes via PATCH, but the server may let effects propagate.

OK, great, I'll adopt that!

kjetilk avatar Nov 23 '21 14:11 kjetilk

The latest language is a big improvement. SPARQL Update on a SPARQL Endpoint is not only outside the purview of Solid per se, but also well beyond "expert mode", and ramifications of any change a user makes vis such "expert mode" should also be considered outside the purview of Solid, and the user assumes the responsibility of such ramifications. Much like tinkering inside the cover of any appliance voids the warranty... The tinkering is not impossible, just reasonably guarded and warned against.

TallTed avatar Nov 23 '21 16:11 TallTed

SPARQL Update on a SPARQL Endpoint is not only outside the purview of Solid per se

Well no; just not with PATCH.

RubenVerborgh avatar Nov 23 '21 18:11 RubenVerborgh

The update should only change the description of the target resource. Don't mention propagations or side-effects until the request semantics can be further specialised.

csarven avatar Nov 24 '21 00:11 csarven

The update should only change the description of the target resource. Don't describe propagations or side-effects.

I'm fine with that, since that is what I originally did, but then @RubenVerborgh wanted some specific language to address that. I'm fine either way. Perhaps we haven't got enough on side-effects in the spec, perhaps we should have a section about that that we can refer to in such cases. Meanwhile, I think we could merge it as it is now, and do that for 1.0.

kjetilk avatar Nov 24 '21 00:11 kjetilk

I'll remove this from the current milestone so that we can wordsmith this into shape at some later point. We need to finish 0.9 urgently.

kjetilk avatar Nov 29 '21 08:11 kjetilk

The updateMany function in rdflib https://github.com/linkeddata/rdflib.js/blob/main/src/update-manager.ts#L685 is what we suggest developers call when the change updates more than one resource. It currently iterates over the resources needed to be changed, doing a patch to each one. There is no chance of atomicity.

It would be faster if all the requests could be combined in a more complex PATCH which patches more than one resource. And great is atomicity were an option too. So this issue is not in the spirit of going into th multiple resource patch would be an enhancement.

timbl avatar Jan 12 '22 15:01 timbl

See Road map item "Atomic multi-file updates" https://solidproject.solidcommunity.net/Roadmap/state.ttl#Iss1595438497668 The "To Be Done" state means we had decided it was a good thing to do though it is not scheduled.

timbl avatar Jan 12 '22 15:01 timbl

I absolutely see multi-resource patches as desirable, and atomic ones important too.

The question is whether that should be going through this kind of interface.

My concern is that users may be duped by apps, possibly because the app has a vulnerability, to modify a resource they didn't intend to modify. I'd like to plug that possible hole at the protocol level.

To support multi-resource updates in a single request, I don't think a resource-centric interface makes sense, it is not a single resource anyway. In that case, thinking in terms of endpoints (e.g. a SPARQL endpoint) makes much sense, and I suppose you could do this already using the SPARQL Protocol.

kjetilk avatar Jan 12 '22 22:01 kjetilk

Another concern is that if authorization always depends on the body of the message, then parsing the body and checking for other possible targets is a cost that must be paid by every PATCH request, unless it can be discerned from metadata (with SPARQL, there is nothing that can do that at present).

Currently, we don't have a cheap method for append (though see #305 for a proposal to fix that), so even the simplest append operation will have this cost.

It will also complicate caching on intermediaries as it implies that any resource might change based on any PATCH request (in contrast with just parent container and aux resources belonging to the resource), and so, implementations would need a mechanism to invalidate the cache for any resource.

This suggests to me that if we are going to have multi-resource operations, it ought to be on a different interface (an endpoint-oriented one), and we should only do it once we have adverse side effects under control.

kjetilk avatar Feb 22 '22 11:02 kjetilk