Add server to support Last-Modified header field

Open csarven opened this issue 1 year ago • 6 comments

This PR makes a correction class 4 change:

  • Add server requirement https://solidproject.org/ED/protocol#server-last-modified-includes to include the Last-Modified header field.

In a nutshell, the Last-Modified header field is useful for clients in a number of ways. RFC 9110 only goes as far as setting a SHOULD requirement level for servers, e.g., https://www.rfc-editor.org/rfc/rfc9110#name-last-modified , https://www.rfc-editor.org/rfc/rfc9110#section-15.3.1-5 . This PR makes it a MUST as there is already precedent of its use, and ensures that clients can rely on it where applicable:


I'm generally considering this PR to be a class 4 change because the Last-Modified header wasn't explicitly required in the Solid Protocol; hence, it can be considered a new feature in and of itself. This PR can alternatively be considered a class 3 change on the grounds that it:

clears up an ambiguity or under-specified part of the specification in such a way that data, a processor, or an agent whose conformance was once unclear becomes clearly either conforming or non-conforming.

and given that the Solid Protocol already describes a number of behaviours and advisements involving the Last-Modified header field, e.g.:

  • https://solidproject.org/ED/protocol#server-container-last-modified describes behaviour involving responses that include Last-Modified.
  • https://solidproject.org/ED/protocol#dcterms-modified-corresponds-last-modified describes behaviour where dcterms:modified value under https://solidproject.org/ED/protocol#contained-resource-metadata corresponds with HTTP responses including the Last-Modified header field and its field value.
  • https://solidproject.org/ED/protocol#container-last-modified-comparison describes how to interpret the field value in the context of containers.

Resolves https://github.com/solid/specification/issues/154 .


csarven avatar Aug 24 '24 12:08 csarven

I think this is pretty strong. If we make it a MUST, then all existing servers that don't have it will be non-compliant. Why is a SHOULD not enough?

michielbdejong avatar Aug 28 '24 14:08 michielbdejong

I recall @CxRes explaining some pushback from IETF when Solid is trying to monkey patch HTTP. I don't know if this PR falls under that, but it might still be worth double-checking.

Why is a SHOULD not enough?

We should be able to reference a requirement derived from use cases for every spec requirement. We should also document the consequences of not having this requirement to clarify what adding it tries to remedy.

elf-pavlik avatar Aug 28 '24 18:08 elf-pavlik

@elf-pavlik HTTP Semantics compliance is not an issue with this PR. From that point of view, this PR is OK! "Last-Modified" is problematic in the context of Containment Triples, but that is another issue.

I am ambivalent about the MUST only because I do not understand why it is a SHOULD in HTTP Semantics. Providing a date to clients for caching seems a natural thing to support. I would need to investigate that!

CxRes avatar Aug 28 '24 20:08 CxRes

Solid Protocol does not currently require Last-Modified or ETag header fields. If clients can't consistently rely on servers to support ETag or Last-Modified, applications are going to be mediocre and incompatible at best.

These header fields are essential for caching and efficiency, conflict detection, synchronisation, archiving... and actually delivering applications with good UX. As it stands, inconsistent server support leaves the ecosystem with varying levels of application quality, and so developers are ultimately forced to come up with their own (often incompatible) solutions.

Solid is aiming to build a decentralised web that's "harder, better, faster, stronger" - or something - right? Servers should bear the minimal burden to support these features so that the ecosystem can support smart, responsive, and reliable applications.

Since information pertaining to Last-Modified is already included in the Solid Protocol, it is logical to enforce its use, and also helps the specification to be more internally consistent. We should expect servers to consistently support ETag as well (but that's for another PR.)

Focusing on this PR (Last-Modified), while the primary use cases are for improving application functionality, the last updated timestamp (with different formatting / localisation) can also be displayed to users in various contexts.
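
To illustrate the caching use case mentioned above, here is a minimal sketch of the comparison behind an If-Modified-Since conditional request as described in RFC 9110. The function name `not_modified` and the example timestamps are hypothetical and only serve the illustration; a real server would also take ETag-based preconditions into account.

```python
from email.utils import parsedate_to_datetime


def not_modified(last_modified: str, if_modified_since: str) -> bool:
    """Return True when a 304 (Not Modified) response would be appropriate.

    Both arguments are HTTP-date strings, e.g. "Wed, 28 Aug 2024 20:08:00 GMT".
    The representation counts as unchanged when its last modification date is
    earlier than or equal to the date the client sent in If-Modified-Since.
    """
    return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since)


# Hypothetical value a client stored from an earlier Last-Modified response header.
cached = "Wed, 28 Aug 2024 20:08:00 GMT"

print(not_modified("Wed, 28 Aug 2024 20:08:00 GMT", cached))  # resource unchanged
print(not_modified("Tue, 10 Sep 2024 11:09:00 GMT", cached))  # resource changed since
```

A client can only make this kind of conditional request if the server sent Last-Modified in the first place, which is the dependency the proposed requirement addresses.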

csarven avatar Aug 28 '24 20:08 csarven

Thanks @csarven! This sounds reasonable to me.

I admit it is unfair to raise the bar for this PR to have a clear trail to the appropriate documented use case and requirement. Conversely, it would be beneficial to start taking a more disciplined approach. If we don't do it with this PR we should try to get there eventually.

Based on my implementation experience with SAI, we add updatedAt statements to various records related to authorization and discovery. Having the server guarantee it would remove the need to track it in the data itself. This might fit with the broad topic of server-managed statements; creator is one example discussed often.

What we should do before merging is check with known active implementations how it impacts them, for example, CSS with a SPARQL backend. I just asked on the CSS Matrix channel, but we should try to reach out to other known active implementers.

elf-pavlik avatar Aug 28 '24 21:08 elf-pavlik

While it is ok to have "Last-Modified" header as a MUST for consistency and historicity, my 2 cents is that the 1 second resolution makes it pretty useless for "harder, better, faster, stronger" - or something. Servers really ought to support ETags and perhaps Versions (https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-versions-01) for real caching.
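
A minimal sketch of the one-second resolution concern, assuming two hypothetical writes that land within the same second: once formatted as an HTTP-date, both produce the same Last-Modified value, so a client cannot tell them apart from this header alone. The helper name `http_date` and the timestamps are illustrative only.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def http_date(moment: datetime) -> str:
    """Format a datetime as an HTTP-date (IMF-fixdate), which has no sub-second part."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


# Two hypothetical writes, 500 milliseconds apart, within the same second.
first_write = datetime(2024, 8, 28, 21, 8, 0, 100_000, tzinfo=timezone.utc)
second_write = first_write + timedelta(milliseconds=500)

print(http_date(first_write))
print(http_date(first_write) == http_date(second_write))  # the two writes are indistinguishable
```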

CxRes avatar Aug 28 '24 21:08 CxRes

Implementation feedback for a feature where an application (dokieli) makes use of the available Last-Modified header field and its field value (as per the server requirement https://solidproject.org/ED/protocol#server-last-modified-includes ) in the UI. When the user triggers Edit (authoring mode) from the menu, the application incorporates 1) a human-readable modified date, and 2) a machine-readable modified date (schema:dateModified) into the document. Simple implementation.

dokieli-last-modified.webm

Source: https://dokie.li/media/video/dokieli-last-modified.webm

Aside: Not shown in the screencast but the modified information in the document is updated when the user saves the document.

csarven avatar Sep 10 '24 11:09 csarven

Aside: Not shown in the screencast but the modified information in the document is updated when the user saves the document.

Is the modified date managed by the client or the server?

elf-pavlik avatar Sep 10 '24 12:09 elf-pavlik

The Server generates the Last-Modified of the selected representation.

The Client manages the information in the representation, whatever the statements may be, e.g., schema:dateModified or equivalent properties such as dcterms:modified, or human-readable information.

Conceptually those two things (server's and client's use of modified) are not one and the same, and there is no expectation that they have to be synchronised.

The purpose of the Last-Modified header in this context is to demonstrate that the available modified date is used as the initial value for both the human-readable and graph representations. When the user continues to edit and save the document, the application sends an updated representation to the server. There is no requirement for the server to process the information in the payload. The server only needs to generate the header based on its own clock and whatever internal processes it takes into account to generate the header for a selected representation when the resource is requested.
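
As a rough sketch of the client side described above (not dokieli's actual code, which is not shown in this thread), an application could derive its initial values from the Last-Modified field value like this. The function name, dictionary keys, and display format are assumptions made for illustration.

```python
from email.utils import parsedate_to_datetime


def initial_modified_values(last_modified: str) -> dict[str, str]:
    """Derive initial modified values from a Last-Modified field value.

    Returns an ISO 8601 string suitable for a property such as
    schema:dateModified, plus a simple display string for the UI.
    """
    moment = parsedate_to_datetime(last_modified)
    return {
        "machine_readable": moment.isoformat(),
        # HTTP-dates are always expressed in GMT, so labelling the value UTC is safe.
        "human_readable": moment.strftime("%Y-%m-%d %H:%M") + " UTC",
    }


# Hypothetical header value received when the resource was requested.
values = initial_modified_values("Tue, 10 Sep 2024 11:09:00 GMT")
print(values["machine_readable"])
print(values["human_readable"])
```

After the first save, the client maintains these values itself; they are not expected to stay in sync with the header the server generates.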

csarven avatar Sep 10 '24 12:09 csarven

I'm trying to find a scenario where both modified dates would be useful. This could be in cases where the app works offline and later synchronizes data with the server. In this case, the date when the user modified something, e.g., a geolocation log entry, will be legitimately earlier than the modified date when the data was published to the server once the device comes back online.

There is no requirement for the server to process the information in the payload.

The opposite might be a useful requirement, so the server always uses its own modified date and ignores any statements made by the client. This could also be tested.

Besides generic shape validation, where the shape may require a timestamp, the server shouldn't treat statements in any particular way unless required by the spec. Having a hard separation between client and server-managed statements would once again help avoid any possible issues caused by conflating the two.

elf-pavlik avatar Sep 10 '24 14:09 elf-pavlik