
Append operations on RDF bearing resources

Open kjetilk opened this issue 2 years ago • 10 comments

Append operations on non-container resources are currently not defined in the spec, and it is clear that what it means to append to a resource will depend on the media type of the target resource. There are some media types that have a straightforward interpretation, among them notably RDF sources. I therefore suggest that we have generic language in the spec and that we define append operations for certain media types, in the continuation of #118.

Moreover, that an append operation can be done using POST is very clearly stated in RFC7231:

Appending data to a resource's existing representation(s).

It has actually been in since the very first description of HTTP:

Extending a document during authorship.

So, given that simply adding some data is a common requirement, that we have an append operation in Solid, a clear description in HTTP, and a straightforward interpretation in RDF, it seems very clear that we should do it.

I therefore suggest that we add something like this to the Protocol:

Append operations to a non-container resource

When a POST method request targets a non-container resource with an existing representation, the server MUST interpret that as an Append operation. The behaviour of an append operation depends on the media type. Servers MAY implement append algorithms for any media type. If the request contains a media type that the Server has no append algorithm for, it MUST respond with the 415 status code. Servers MUST implement the following two append algorithms:

Media types for RDF serializations

If the target resource and the request payload have an RDF bearing representation, then the server MUST use the RDF Merge algorithm to append the request payload into the target resource. This is equivalent to, but simpler than, using PATCH on the target resource with INSERT DATA from SPARQL Update.

text/plain media type.

If the target resource and the request payload have the text/plain media type, then the server MUST append the request payload to the end of the target resource.
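
For illustration only, the media type dispatch proposed above could be sketched roughly as follows. The registry, the function names and the 204 success status are assumptions made for this sketch (the proposed text does not say which success status to return), and treating a mismatch between the stored and the requested media type as 415 is likewise an assumption.

```python
from typing import Callable, Dict, Tuple

# Hypothetical signature: (existing representation, request payload) -> new representation.
AppendAlgorithm = Callable[[bytes, bytes], bytes]


def append_text_plain(existing: bytes, payload: bytes) -> bytes:
    """Append the payload to the end of the existing text/plain representation."""
    return existing + payload


# Hypothetical registry of the append algorithms this server implements.
APPEND_ALGORITHMS: Dict[str, AppendAlgorithm] = {
    "text/plain": append_text_plain,
    # An RDF Merge handler for the RDF media types would be registered here too.
}


def handle_post_to_resource(stored_type: str, stored_body: bytes,
                            request_type: str, payload: bytes) -> Tuple[int, bytes]:
    """Return (status, resulting body) for a POST to an existing non-container resource."""
    # Drop media type parameters such as "; charset=utf-8" before the lookup.
    media_type = request_type.split(";", 1)[0].strip().lower()
    algorithm = APPEND_ALGORITHMS.get(media_type)
    # Assumption: a media type mismatch is handled like a missing algorithm.
    if algorithm is None or media_type != stored_type:
        return 415, stored_body
    # Assumption: 204 on success; the proposal leaves the success status open.
    return 204, algorithm(stored_body, payload)
```

A server that wants to support more media types would only add entries to the registry; everything else falls through to 415.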

kjetilk avatar Aug 30 '21 21:08 kjetilk

This sounds very useful. How does this relate to the property of POST that the server can always assign a name? If I append to foo.ttl, I would expect that foo.ttl is where I can retrieve the merged data.

jeff-zucker avatar Aug 30 '21 22:08 jeff-zucker

Yes, this is POST to an existing non-container resource (POST-to-append, we might call it); it is different from POST-to-create. With POST-to-create, you POST to a container, and the server assigns a name, or takes a suggestion that you supply with the Slug header.

With POST-to-append, you POST to an existing resource, so nothing is created and the server does not assign any name; it is strictly a modification of an existing resource.
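
To make the distinction concrete, here is a rough client-side sketch that only builds the two requests and does not send them. The pod URLs, the slug and the Turtle payload are made up for illustration.

```python
import urllib.request

# Hypothetical payload: one triple in Turtle.
TURTLE = b'<#me> <http://xmlns.com/foaf/0.1/nick> "kk" .\n'


def post_to_create(container_url: str, slug: str) -> urllib.request.Request:
    """POST to a container: the server creates a resource and assigns its name."""
    return urllib.request.Request(
        container_url, data=TURTLE, method="POST",
        headers={"Content-Type": "text/turtle", "Slug": slug})


def post_to_append(resource_url: str) -> urllib.request.Request:
    """POST to an existing non-container resource: nothing is created or named."""
    return urllib.request.Request(
        resource_url, data=TURTLE, method="POST",
        headers={"Content-Type": "text/turtle"})


# Hypothetical URLs; the container URL ends with "/", the resource URL does not.
create = post_to_create("https://pod.example/notes/", slug="foo")
append = post_to_append("https://pod.example/notes/foo.ttl")
```

The only differences on the wire are the target URL and the Slug header, which has no role in POST-to-append because no name is assigned.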

kjetilk avatar Aug 30 '21 23:08 kjetilk

I have no particular issue with or objection to adding this. As mentioned before, I wrote the Protocol text in a way that would make POST w/ RDF Merge possible one day. Some comments:

This is equivalent to, but simpler than, using PATCH on the target resource with INSERT DATA from SPARQL Update.

What is exactly gained by introducing POST to append?

Supporting append for text/plain is clearer. It can also be done with PATCH in a richer way with URI Fragment Identifiers for the text/plain Media Type: https://datatracker.ietf.org/doc/html/rfc5147 but I do understand the appeal of only appending at the end of the object with POST. Just saying.

When POST to append support is constrained, the response should include constrainedBy ( https://github.com/solid/specification/pull/185 ) and/or problem details ( https://github.com/solid/specification/issues/28 ).

csarven avatar Aug 31 '21 07:08 csarven

What is exactly gained by introducing POST to append?

An absolutely crucial enhancement of onboarding experience as well as a lower load on the server side.

Say that you come to a pod with some data on it, like something that has been set up with your profile, your first question is likely to be "how do I add something to my profile". Now, that can be partially addressed by libraries, but many developers like to understand technology. There's a learning curve to Solid, and part of that learning curve is RDF, and to think in graphs. Many do not manage to cross that threshold.

Without POST-to-append, we raise that bar even higher without any good reason whatsoever. We ask people to learn another language, a query language. Just to add a couple of statements! If printed, once you've been through SPARQL Update, you've read 150 pages, and there's more... You gotta be mad! :-) This is no way to treat a newcomer. That's why this is so important to me: I'm truly concerned that Solid will fail to get traction because of this really simple thing. It can't be just "possible one day"; it has to be there now that we're asking people to have a look at Solid. We already have a bad reputation for making simple things hard, and we MUST NOT confirm that reputation yet again.

Now, I truly enjoyed helping write those 150 pages, so it should be noted that the same thing can be done with INSERT DATA. INSERT DATA takes some triples, not triple patterns, and in SPARQL the operation that the store must then perform is an RDF Merge; that's how it is defined. So, it is exactly the same operation. However, it involves an extra parsing step, through a much more complex grammar than RDF, and after parsing, you will have to engage a relatively complex authorization framework to check that none of the parsed elements require more privileges than append. Then, you'd probably fire up a query planner too, if only to notice that there are just triples that can be merged. So, even though optimization can make the impact small, a backend would certainly want to have most users use POST-to-append rather than PATCH-with-SPARQL. Server-side, POST-to-append amounts to looking for the /, checking for Append permission and then doing the RDF merge. It's like 5 lines of code ;-)
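
As a rough sketch of those three server-side steps, under stated assumptions: graphs are modelled here as plain sets of (subject, predicate, object) string triples with blank nodes written as `_:label`, the payload is assumed to be parsed already, `may_append` stands in for a real authorization check, and the 204 and 403 statuses are illustrative choices that this thread does not specify. The merge renames incoming blank nodes so that they never collide with blank nodes already in the target, which is what distinguishes an RDF merge from a plain set union.

```python
import itertools
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # toy model: terms are plain strings


def rdf_merge(target: Set[Triple], incoming: Set[Triple]) -> Set[Triple]:
    """Union of two graphs after renaming the incoming graph's blank nodes."""
    used = {term for triple in target for term in triple if term.startswith("_:")}
    renamed: Dict[str, str] = {}
    counter = itertools.count()

    def fresh(term: str) -> str:
        if not term.startswith("_:"):
            return term  # IRIs and literals are kept as they are
        if term not in renamed:
            label = f"_:m{next(counter)}"
            while label in used:  # skip labels already present in the target
                label = f"_:m{next(counter)}"
            renamed[term] = label
        return renamed[term]

    return target | {(fresh(s), p, fresh(o)) for s, p, o in incoming}


def post_to_append(url: str, may_append: bool,
                   target: Set[Triple], incoming: Set[Triple]) -> Tuple[int, Set[Triple]]:
    """Hypothetical handler: slash check, permission check, then RDF merge."""
    if url.endswith("/"):
        raise ValueError("containers are handled by POST-to-create")
    if not may_append:
        return 403, target  # assumption: status chosen for this sketch
    return 204, rdf_merge(target, incoming)  # assumption: 204 on success
```

There is no query grammar and no query planner in this path; the only work beyond parsing the payload is the blank node renaming and the set union.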

It is therefore a win both on the front-end and on the back-end. We should not only look for the richer ways, we should also make sure simple things are simple.

kjetilk avatar Aug 31 '21 09:08 kjetilk

POST to append to RDF resources may indeed be a very valuable simplification for developers.

However, I'm not too sure about defining the semantics of POST for text/plain, as it may open the door to things outside of the scope of Solid. IMO, it may be safer if Solid does not specify the operational semantics of non-RDF media types too strictly (this is probably part of a broader discussion).

rubensworks avatar Aug 31 '21 09:08 rubensworks

Alright, we can drop the definition of text/plain and let that be defined elsewhere. There are many media types where defining append is non-trivial, but it would be nice to have.

kjetilk avatar Aug 31 '21 09:08 kjetilk

@kjetilk I think your response to the question is adequate. The point was to have it on record - which wasn't mentioned in the original comment.

It can't be just "possible one day"

Contextomy. What I said was about writing the spec in such a way that the desired feature can be introduced with ease when there is rough consensus in the future. Here we are discussing that possibility.

csarven avatar Sep 01 '21 05:09 csarven

If the target resource and the request payload have the text/plain media type, then the server MUST append the request payload to the end of the target resource.

Note that, in addition to @rubensworks' comment, this does not work because resources can have different representations. So to what representation would we be appending?

In other words, "appending the request payload" is only defined for representations, not for resources.

RubenVerborgh avatar Sep 10 '21 15:09 RubenVerborgh

That's true. Which brings another thing to mind: I don't think either we or LDP have ever said explicitly that all representations of an RDF Source MUST describe the same RDF graph, yet that is clearly the case; it would get really messy if it weren't.

For RDF Sources, it is then quite simple, since the append operation is defined in terms of RDF Semantics, not in terms of a particular serialization and so any representation will apply to the RDF graph.

For non-RDF sources it is harder, since the strong equivalence requirement does not apply to them. What can be done is that you append to the representation with the same media type as in the Content-Type of the request.
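
A small sketch of that fallback, assuming a resource is modelled as a dictionary from media type to representation bytes (a simplification made up for this example):

```python
from typing import Dict


def append_to_matching_representation(representations: Dict[str, bytes],
                                      content_type: str, payload: bytes) -> bool:
    """Append to the representation whose media type matches the request's Content-Type.

    Returns False when the resource has no representation of that media type.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in representations:
        return False
    representations[media_type] += payload
    return True


# Hypothetical non-RDF resource with two representations.
resource = {"text/plain": b"hello", "text/html": b"<p>hello</p>"}
append_to_matching_representation(resource, "text/plain", b" world")
# Only the text/plain representation changed; the text/html one now disagrees with it.
```

The sketch also shows why this is harder than the RDF case: nothing keeps the other representations in sync after the append.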

It wasn't intended to take on the full complexity here, I really only wanted append for RDF Sources, the rest isn't very important. To avoid tackling that, I'll restrict the scope of this issue.

kjetilk avatar Sep 14 '21 09:09 kjetilk

So, something more like:

Append operations to an RDF bearing non-container resource

When a POST method request targets an RDF bearing non-container resource with an existing representation, the server MUST interpret that as an Append operation. If the request targets a different resource, the Server must respond with the 415 status code.

If the target resource and the request payload have an RDF bearing representation, then the server MUST use the RDF Merge algorithm to append the request payload into the target resource. This is equivalent to, but simpler than, using PATCH on the target resource with INSERT DATA from SPARQL Update.

kjetilk avatar Sep 14 '21 09:09 kjetilk

The Protocol uses RDF document https://www.w3.org/TR/rdf11-concepts/#dfn-rdf-document (in a concrete RDF syntax).

csarven avatar Sep 14 '21 09:09 csarven