rdflib.js icon indicating copy to clipboard operation
rdflib.js copied to clipboard

Use If-Match header with latest known ETag in update manager

Open michielbdejong opened this issue 4 years ago • 4 comments

Whenever doing a write operation, it's important to know the resource wasn't changed by another client since we last saw it. This can be assured by sending an If-Match header with the latest known ETag for existing resources, or an If-None-Match: * request header if we want to create a resource that previously didn't exist.

Until now, rdflib.js relied on node-solid-server's sparql-update semaphore but not all servers implement that so from now on let's use ETags instead.

michielbdejong avatar Mar 17 '21 13:03 michielbdejong

Thinking about what is needed for update at least:

Seems like fetcher would already store the ETag from the headers in the store, connecting request to response to each header with the link and httph vocab. https://github.com/linkeddata/rdflib.js/blob/8b405a1cef8a576cad0b1d4233201bd40e45b959/reference/fetcher-classes.js#L1417

In the update manager, it seems the ETag should be added as an option to fetcher.webOperation in fire. The extraction of the ETag would be similar to how editable already works with other headers. https://github.com/linkeddata/rdflib.js/blob/8b405a1cef8a576cad0b1d4233201bd40e45b959/src/update-manager.ts#L377-L383 https://github.com/linkeddata/rdflib.js/blob/8b405a1cef8a576cad0b1d4233201bd40e45b959/src/update-manager.ts#L134

If the request is missing, current behaviour is to fetch the document to determine the update protocol. This assumes that the user doesn't care what is already in the document, which seems to make sense in this context, i.e. we can still use the ETag even though the client has never seen that data. https://github.com/linkeddata/rdflib.js/blob/8b405a1cef8a576cad0b1d4233201bd40e45b959/src/update-manager.ts#L782-L796

If-None-Match: * only gets sent in updates when we got a 404, which we can get from the store. This guards against a stale status code.

  • Are there cases we'd want to allow overriding whether the ETag is sent? Does the new behaviour need to expose new function arguments?
  • If the server rejects the ETag, do we just throw the error, or should rdflib.js provide functionality to handle it? At the moment we update the local store iff the web is changed successfully. Is it within scope to think about ETag handling in a way that allows offline keep/revert semantics, as in remotestorage.js?

josephguillaume avatar Sep 04 '21 01:09 josephguillaume

Maybe this is affected by the existence in the spec of N3Patch as a patch format which does not have the problem of missing conflicts which SPARQL/Update has. It generates 409 if the delete fails.

This means that two instances of an app can update different parts of the same graph at high speed without having to stop and retrace their steps. So different clients could in fact apply patches in a different order (but when they are commutative) that's OK.

Maybe we need to some tests using simulated users and see which philosophy behaves best.

We also separately need to move to the server sending the patch over the web sockets,and that will affect the throughput of patches/second and reduce the bandwidth needed.

It would then be good to run these analysis/tests of optimistic vs eTag approaches with that in place.

timbl avatar Dec 17 '21 09:12 timbl

Yes, adding the check for eTag in webOperation means it is easy to use it for both PUT and PATCH. For both of those, should it check whether theer is a received eTag and assume that it should be respected by default, without the developer having to do ask for it explicitly? Should eTag checking be the default?

timbl avatar Dec 18 '21 00:12 timbl

Should eTag checking be the default?

I suppose the issue is that clients and servers with broken Etag handling would suddenly stop working, so it would probably need to be a change for a major release.

Both CSS and SolidOS have cases of broken Etag handling at the moment.

The client/developer needs to be setup to check for and handle fetch failures due to conflicts, whereas at the moment when there is a conflict the client's version just wins.

josephguillaume avatar Dec 21 '21 21:12 josephguillaume