Implement ETag using xmin

Open daurnimator opened this issue 5 years ago • 13 comments

An idea from @tysonclugg is to use PostgreSQL's xmin column to compute a HTTP ETag. This would allow clients to do cache validation. It could also help with preventing data races (See #1069).

For each row a query obtains data from, we should probably take the minimum xmin. This could be used directly as the ETag (possibly prefixed with the current txid epoch).
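
A minimal sketch of that idea, assuming a hypothetical table items (xmin only exists on real tables, and xid has no direct numeric cast, hence the detour through text; txid_current() is epoch-extended, so an epoch prefix could be derived from it):

-- Sketch only: derive a candidate ETag value from the row versions.
SELECT min(xmin::text::bigint) AS etag_source,
       txid_current() >> 32    AS txid_epoch  -- epoch part of the current txid
FROM items;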

daurnimator avatar Aug 30 '18 05:08 daurnimator

One limitation of xmin is that the ETag could only be computed for TABLEs and not VIEWs, since views don't expose that system column.

steve-chavez avatar Jan 30 '20 17:01 steve-chavez

There was another idea for ETag here https://github.com/PostgREST/postgrest/issues/893#issuecomment-310276369.

Still, having automatic ETag generation with xmin seems a really useful idea (mobile clients have to generate their own cache keys otherwise). So I'm thinking we could mention in the docs that xmin is required and the user would have to expose it in a VIEW or function result (as sketched below). Or maybe just warn that it only works for TABLEs for now.
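
A hedged sketch of what exposing xmin through a VIEW could look like (the api/private schemas and the column names are made up for illustration):

-- Expose the system column under a regular name so clients can see it.
CREATE VIEW api.items AS
SELECT id, name, xmin::text AS row_version
FROM private.items;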

First, we'll need to see what our generated query with xmin would look like.
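
One speculative shape for such a query, reusing the hypothetical items table (PostgREST's real query builder differs; this only illustrates carrying xmin alongside the JSON body):

-- Sketch: select xmin with the data, fold it into one validator,
-- and strip it from the JSON body again.
WITH pgrst_source AS (
  SELECT items.*, items.xmin AS pgrst_xmin
  FROM items
)
SELECT min(pgrst_xmin::text::bigint)                   AS pgrst_etag,
       jsonb_agg(to_jsonb(pgrst_source) - 'pgrst_xmin') AS pgrst_body
FROM pgrst_source;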

steve-chavez avatar Feb 03 '20 20:02 steve-chavez

I don't think using xmin is going to work that well:

  • From the Mozilla ETag description:

    Entity tag uniquely representing the requested resource.

    I understand "uniquely" as "two different resources should not have the same ETag". I just tested and inserted two rows into a table - they both got the same xmin value.

  • "A resource" is a difficult concept in PostgREST. We do have some requests that only return one entity / row. But we also have "resources" / links that return more entities. For those, we would need to merge the xmin values of every row together somehow. But such an algorithm is highly likely to be prone to errors, when some rows are removed and others are added. E.g., when you sum up the xmin values of 3 rows, then delete 2 of them and add another one. This could very well end up as the same sum as before, with completely different results.

  • Views with joins would need to somehow expose multiple xmins. That can become difficult quite fast. Subqueries? Might need a major rewrite of the whole view.

  • RPC results will face a similar challenge.

  • What is caching on the xmin going to give us anyway? We will still have to execute the whole query with all the joins etc. to get the xmins. Now we could just send a hash of the response as well. Ah, ok that might slow us down, yes... :(

wolfgangwalther avatar Dec 11 '20 22:12 wolfgangwalther

I don't think using xmin is going to work that well:

  • From the Mozilla ETag description:

    Entity tag uniquely representing the requested resource.

    I understand "uniquely" as "two different resources should not have the same ETag". I just tested and inserted two rows into a table - they both got the same xmin value.

The statement you quoted needs to be taken in context of the opening statement from that same page:

The ETag HTTP response header is an identifier for a specific version of a resource.

You shouldn't be "interpreting" what is meant by any particular ETag value (ie: two distinct resources sharing the same ETag is irrelevant); ETag values are meant to be opaque.

  • "A resource" is a difficult concept in PostgREST. We do have some requests that only return one entity / row. But we also have "resources" / links that return more entities. For those, we would need to merge the xmin values of every row together somehow. But such an algorithm is highly likely to be prone to errors, when some rows are removed and others are added. E.g., when you sum up the xmin values of 3 rows, then delete 2 of them and add another one. This could very well end up as the same sum as before, with completely different results.

So don't do that. Instead, take a hash (eg: SHA512) of the sorted list of all entity xmin values (concatenated with a joiner) - then you have an ETag value that is practically robust against hash collisions.
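
In SQL, that suggestion could look roughly like this (assumes the pgcrypto extension for digest(), and the hypothetical items table):

-- Sketch: hash the sorted, joined list of row versions into one validator.
SELECT encode(
         digest(string_agg(xmin::text, ':' ORDER BY xmin::text), 'sha512'),
         'hex') AS etag
FROM items;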

  • Views with joins would need to somehow expose multiple xmins. That can become difficult quite fast. Subqueries? Might need a major rewrite of the whole view.
  • RPC results will face a similar challenge.

As per above, take a hash of all xmin values.

  • What is caching on the xmin going to give us anyway? We will still have to execute the whole query with all the joins etc. to get the xmins. Now we could just send a hash of the response as well. Ah, ok that might slow us down, yes... :(

The If-None-Match precondition gives you the ability to reply with 304 Not Modified and skip sending the HTTP response body. Even if database performance is unaffected, the network traffic will be dramatically reduced.

The other thing that we are given from having ETag support is the ability to prevent lost updates when updating existing resources.

tysonclugg avatar Feb 26 '21 01:02 tysonclugg

I understand "uniquely" as "two different resources should not have the same ETag". I just tested and inserted two rows into a table - they both got the same xmin value.

The statement you quoted needs to be taken in context of the opening statement from that same page:

The ETag HTTP response header is an identifier for a specific version of a resource.

You shouldn't be "interpreting" what is meant by any particular ETag value (ie: two distinct resources sharing the same ETag is irrelevant); ETag values are meant to be opaque.

You're right - two different resources could share the same ETag. Interestingly, the RFC you quote also says:

An entity-tag is an opaque validator for differentiating between multiple representations of the same resource, regardless of whether those multiple representations are due to resource state changes over time, content negotiation resulting in multiple representations being valid at the same time, or both.

This means we need to return a different ETag for the same URL, when requested with either Accept: application/json or Accept: application/vnd.pgrst.object+json. This makes xmin alone unsuitable, because it'd be the same.

  • "A resource" is a difficult concept in PostgREST. We do have some requests that only return one entity / row. But we also have "resources" / links that return more entities. For those, we would need to merge the xmin values of every row together somehow. But such an algorithm is highly likely to be prone to errors, when some rows are removed and others are added. E.g., when you sum up the xmin values of 3 rows, then delete 2 of them and add another one. This could very well end up as the same sum as before, with completely different results.

So don't do that. Instead, take a hash (eg: SHA512) of the sorted list of all entity xmin values (concatenated with a joiner) - then you have an ETag value that is practically robust against hash collisions.

Unfortunately not in the general case. Simple example:

INSERT INTO items (id) VALUES (1), (2), (3);

Those will all have the same xmin.

Now request GET /items?order=id&limit=1.

Then delete the first row and do the request again. The xmin-based ETag for this resource stays the same, although the actual resource changed.
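
The problem can be seen directly in psql, continuing that example (all three rows were inserted in one transaction, so they share one xmin):

SELECT xmin, id FROM items ORDER BY id LIMIT 1;  -- e.g. xmin = 12345, id = 1
DELETE FROM items WHERE id = 1;
SELECT xmin, id FROM items ORDER BY id LIMIT 1;  -- still xmin = 12345, now id = 2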

  • Views with joins would need to somehow expose multiple xmins. That can become difficult quite fast. Subqueries? Might need a major rewrite of the whole view.
  • RPC results will face a similar challenge.

As per above, take a hash of all xmin values.

The complexity here is not in doing something useful with the xmin values, but in collecting those in the first place.

  • What is caching on the xmin going to give us anyway? We will still have to execute the whole query with all the joins etc. to get the xmins. Now we could just send a hash of the response as well. Ah, ok that might slow us down, yes... :(

The If-None-Match precondition gives you the ability to reply with 304 Not Modified and skip sending the HTTP response body. Even if database performance is unaffected, the network traffic will be dramatically reduced.

Absolutely. I'm not arguing against implementing some form of ETags. I'm just saying that xmin is not a solution to the general case of that problem - and involves a lot of complexity, too.

Hashing the database result seems a lot easier to do and also works in all cases...

The other thing that we are given from having ETag support is the ability to prevent lost updates when updating existing resources.

... except here, because we can't easily compute the hash of the would-be-response before the change is made. But again, xmin is going to solve that in only a few cases as well.

wolfgangwalther avatar Feb 26 '21 08:02 wolfgangwalther

An entity-tag is an opaque validator for differentiating between multiple representations of the same resource, regardless of whether those multiple representations are due to resource state changes over time, content negotiation resulting in multiple representations being valid at the same time, or both.

This means we need to return a different ETag for the same URL, when requested with either Accept: application/json or Accept: application/vnd.pgrst.object+json. This makes xmin alone unsuitable, because it'd be the same.

What if we add all content negotiation request headers (ie: ones that should be mentioned in the Vary response header) to the hash that forms our ETag?

Unfortunately not in the general case. Simple example:

INSERT INTO items (id) VALUES (1), (2), (3);

Those will all have the same xmin.

Now request GET /items?order=id&limit=1.

Then delete the first row and do the request again. The xmin-based ETag for this resource stays the same, although the actual resource changed.

That's why I suggested that we take all xmin values concatenated by a joiner: sha512("foo:foo:foo") != sha512("foo:foo").

  • Views with joins would need to somehow expose multiple xmins. That can become difficult quite fast. Subqueries? Might need a major rewrite of the whole view.
  • RPC results will face a similar challenge.

As per above, take a hash of all xmin values.

The complexity here is not in doing something useful with the xmin values, but in collecting those in the first place.

I agree that getting the query right is complex, but I see it as worthwhile.

Absolutely. I'm not arguing against implementing some form of ETags. I'm just saying that xmin is not a solution to the general case of that problem - and involves a lot of complexity, too.

Hashing the database result seems a lot easier to do and also works in all cases...

Yes, but only if the hashed result can be sent from the DB to PostgREST avoiding any network overheads. Hashing content becomes especially burdensome if large amounts of data need to be read from disk and processed.

The other thing that we are given from having ETag support is the ability to prevent lost updates when updating existing resources.

... except here, because we can't easily compute the hash of the would-be-response before the change is made. But again, xmin is going to solve that in only a few cases as well.

The If-None-Match precondition check should be against the state of the data before the change is made; it has nothing to do with the "would-be-response" state after the change is made.

tysonclugg avatar Apr 26 '21 04:04 tysonclugg

Those will all have the same xmin. Now request GET /items?order=id&limit=1. Then delete the first row and do the request again. The xmin-based ETag for this resource stays the same, although the actual resource changed.

That's why I suggested that we take all xmin values concatenated by a joiner: sha512("foo:foo:foo") != sha512("foo:foo").

I think you missed the point here. The query has a LIMIT 1 - even the concatenated xmin values would be the same.

Hashing the database result seems a lot easier to do and also works in all cases...

Yes, but only if the hashed result can be sent from the DB to PostgREST avoiding any network overheads. Hashing content becomes especially burdensome if large amounts of data need to be read from disk and processed.

Well, we need to run the database query in any case. And the overhead would be much bigger if we ran it twice. So we would need to return the response body and its hash at the same time. Although we could match the hash on the server and return an empty body of course. But yes, reading the data from disk would be necessary.
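
A sketch of "return the response body and its hash at the same time", assuming pgcrypto and the hypothetical items table:

-- Sketch: compute the body and its hash in a single round trip.
WITH body AS (
  SELECT coalesce(json_agg(t), '[]')::text AS txt
  FROM (SELECT * FROM items) t
)
SELECT txt                                  AS response_body,
       encode(digest(txt, 'sha256'), 'hex') AS etag
FROM body;

PostgREST could then compare etag against If-None-Match itself and drop the body when they match.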

... except here, because we can't easily compute the hash of the would-be-response before the change is made. But again, xmin is going to solve that in only a few cases as well.

The If-None-Match precondition check should be against the state of the data before the change is made; it has nothing to do with the "would-be-response" state after the change is made.

Exactly. That's why I wrote "the would-be-response before the change". We could easily compute the hash of the would-be-response after the change is made - and then roll back the transaction. But that's not useful at all, as you say. But to get the ETag of the current version before the change - we'd have to create another full read query. Just adding a WHERE xmin=... will not do it, because of embedding and views.

wolfgangwalther avatar Apr 26 '21 06:04 wolfgangwalther

Hi all! I'm really interested in this one, is there any news about it? I need it for cache validation, to avoid the lost update problem, and for bandwidth savings. I already implemented this in a custom-built API manager for Oracle Database in the past, so I might be able to provide some ideas for this:

  • There are two types of ETags, weak and strong. Weak ETags are computationally cheap to generate, e.g. checking the max last_update_date and last_updated_by of the rows involved in a given request. This is a good indicator of whether data has been modified, as long as those columns are properly maintained, but it can fail if the data is modified without updating the audit columns, and it doesn't work for functions or binary data.
  • Strong ones validate the full response by hashing the response body with, for example, SHA-256 or any other fast and secure hashing algorithm, and this should work in all cases. For security and simplicity it should be the way to go if there isn't any other native mechanism in the database for achieving this. (See the sketch after this list.)
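
A weak-validator sketch along those lines, assuming pgcrypto and hypothetical audit columns last_update_date / last_updated_by on the items table:

-- Sketch: a weak ETag from audit columns plus the row count
-- (count(*) guards against pure deletions).
SELECT 'W/"' || encode(
         digest(max(last_update_date)::text || ':' || count(*)::text, 'sha256'),
         'hex') || '"' AS weak_etag
FROM items;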

For GET requests:

  1. Client sends the previously received ETag to the server in an If-None-Match header.
  2. The server executes the query, function or whatever as normal.
  3. The server generates the ETag by hashing the entire response, including the request endpoint and query parameters.
  4. The server compares the generated ETag with the received ETag; if they match, the server responds with a standard 304 Not Modified response and an empty body, so just a few bytes are transferred to the client. If it doesn't match, or there is no received ETag, the full body content is downloaded again with the new ETag response header.

This serves the caching purpose and saves bandwidth.

For POST/PATCH requests:

  1. Client sends an update request to the server, using exactly the same URL as the prior read request on this resource, plus the previously received ETag in an If-Match header.
  2. Server executes the GET request as normal with the received URL.
  3. Server generates the ETag as normal.
  4. Server compares the generated ETag and the received ETag; if they match, the request can proceed normally. If they don't match, it means the resource has changed since it was last read, and a standard HTTP 412 Precondition Failed error is returned to the client.
  5. In case of a 412 error, the client might decide to overwrite any changes performed by anybody else by sending the same request, but with an If-None-Match header instead of an If-Match header. Or it might properly discard local changes by reading a fresh copy of the resource and sending the update again.

This avoids the lost update problem.

For PUT requests:

  1. Works as normal, inserting the data and returning the generated ETag to the client.

For DELETE requests:

  1. ETags don't apply here.

yevon avatar Apr 04 '22 20:04 yevon

@yevon Yeah, I guess it's quite clear that ETags would be very useful. But as discussed above, we haven't found a reliable way to calculate them in all cases. Yet?

wolfgangwalther avatar Aug 11 '22 19:08 wolfgangwalther

Server executes the GET request as normal with received url.

I see you proposed to run the full GET request for the POST/PATCH case, too.

I think this is the only way to do it.

So basically:

  • Create a hash of the full response within each request and set it as a strong ETag.
  • When the hash matches on a GET request, answer with 304.
  • When an If-Match is sent on a mutation request, we need to run the full read query before any mutation in a CTE and compute the "before" hash (sketched below).
  • After we receive the result, we can check whether the hash matches. If it doesn't, we roll back the transaction and respond with 412.
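
A rough sketch of that read-then-mutate shape, assuming pgcrypto and the hypothetical items table. All CTEs of one statement run against the same snapshot, so "before" sees the pre-update state:

WITH before AS (
  -- hash of the current representation, before the mutation
  SELECT encode(digest(coalesce(json_agg(t), '[]')::text, 'sha256'), 'hex') AS etag
  FROM (SELECT * FROM items WHERE id = 1) t
),
mutation AS (
  UPDATE items SET price = price * 1.1 WHERE id = 1
  RETURNING *
)
SELECT before.etag                     AS before_etag,
       (SELECT count(*) FROM mutation) AS rows_affected
FROM before;

PostgREST would compare before_etag against the If-Match value and issue a ROLLBACK plus 412 on mismatch.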

For GET requests we could get a performance improvement, because of reduced network traffic.

For mutation requests, we don't care about the performance overhead of running the read request again - we care about the lost update problem, which is solved by that.

wolfgangwalther avatar Aug 11 '22 19:08 wolfgangwalther

Yes! That's a good summary. It would be nice to have a setting to limit the maximum body size for which the hash is calculated. Just imagine a 100 GB file download; calculating its hash could take considerable time.

yevon avatar Aug 11 '22 20:08 yevon

A 100 GB file download itself would take a long time - the additional overhead of calculating a hash wouldn't matter much. Especially compared to the savings in network traffic. The bigger the resource, the more useful ETags are, no?

wolfgangwalther avatar Aug 12 '22 06:08 wolfgangwalther

It depends on the client and its cache. A typical web browser won't cache any single file bigger than about 5 MB, because the bigger the file, the less likely you are to download it again, and you would end up wasting tons of space. But it could be interesting for an enterprise caching proxy, for example. It depends on the needs of each application.

yevon avatar Aug 12 '22 08:08 yevon

I tried to implement the ETag header in an nginx ingress proxy in a Kubernetes cluster, but it seems that it is not possible: the proxy streams the response, so the ETag must be calculated in the backend. Is there any news on this one?

yevon avatar Dec 12 '22 19:12 yevon

Hm, ETags would only work on public data, right? So only for tables where the db-anon-role has the SELECT privilege.

steve-chavez avatar Dec 25 '22 03:12 steve-chavez

  • For those, we would need to merge the xmin values of every row together somehow.

You take the maximum. The value of xmin is the ID of the transaction that last wrote the row: if you take the maximum over multiple xmins, you get the most recent transaction that touched any of those rows, i.e. the point at which the current result set came into existence.
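
For a single table that could be as simple as (hypothetical table items again):

-- Newest writer among the selected rows, as a single validator.
SELECT max(xmin::text::bigint) AS last_write_txid FROM items;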

Then delete the first row and do the request again. The xmin-based ETag for this resource stays the same, although the actual resource changed.

Ah, that is indeed a problem. I don't think Postgres has any way to discover whether rows have been deleted from a table since a given transaction ID. I'd love to be proven wrong on that, but until then I think this idea is dead in the water.

daurnimator avatar Dec 27 '22 00:12 daurnimator

Hm, ETags would only work on public data, right? So only for tables where the db-anon-role has the SELECT privilege.

It can also work for authenticated requests if you just hash the whole response. If, on the same machine, a previous authenticated user requested the same endpoint with the same parameters, and that user has the same privileges as you, the response would be the same and the ETag would match. If the privileges differ, the response would differ, so the ETag would not match.

The only problem to consider is that this mechanism should be optional: in some very restricted environments it might be considered a security issue to let the web browser keep locally cached data shared between different users, even though not even developers can access the web browser cache.

yevon avatar Jan 02 '23 16:01 yevon

and that user has the same privileges as you

Hmm, would that also work with row level security policies that use our http context?

steve-chavez avatar Jan 03 '23 02:01 steve-chavez

It should work in any case, I think. Just imagine this:

(Consider three users sharing the same machine at different times)

1 - User 1 executes the /departments endpoint. He is allowed to see all of them, so he receives a list of 300 departments and an ETag: 12345. There is no previous ETag, so the full response is downloaded. HTTP 200 OK.

2 - User 2 executes /departments, but he is only allowed to see his own department. The web browser will send If-None-Match: 12345. The response returns just 1 department; you calculate the hash of the whole response, so the ETag would be, for example, ETag: 321. It doesn't match, so the full response is downloaded and the ETag updated. HTTP 200 OK.

3 - User 3 is from the same department as User 2, so they will see the same response. The web browser sends If-None-Match: 321. You calculate the hash of the response, and it will be the same as the previous one, 321. It matches, so the server will return a 304 Not Modified response. The web browser can use its cached value.

This behaviour is the same as if there was no cache at all, but you don't really need to transfer the content, just the confirmation of whether the cached value is still valid.

Some considerations:

  • It is important to use strong ETags like a hash of the HTTP body content, and not to use any identifier that should remain secret, like DB identifiers, file IDs, etc. Using xmin as suggested in this issue would not respect this.
  • If the security context or user differs, the response will differ, so the ETags won't match and security is respected.
  • I don't see a reason for this, but you could hash the "http context" + "http body response" when calculating the ETag. That way users would never share the local cache. But it makes no sense to me: if the response is exactly the same between users, they should be able to benefit from each other's cache entries as well.
  • The Cache-Control header should be "no-cache, no-transform, private" by default. It means: always check with the server whether the cache is valid for every request, do not transform the output (like compressing images), and store the cache only in the web browser (no public caches like a proxy).
  • You should be able to control the cache headers.
  • The ideal solution would be being able to control cache headers for every endpoint individually; for example, you might decide that it is not really necessary to check every time whether a public image stored in the DB is still valid, so the web browser can show it directly from the cache without making any request to the server. Or you might decide to completely disable caching for security reasons on some endpoints and not others.

yevon avatar Jan 03 '23 15:01 yevon