
HTTP cache improvements

Open bedis opened this issue 5 years ago • 36 comments

Hi, I created one single issue to discuss some missing features of the HTTP cache in HAProxy. Later, we can split this list into individual issues to track each development.

  • [ ] 1. Access-Control-Max-Age: the idea here is to allow caching for the period defined in the server's max-age, found in Cache-Control or in Access-Control-Max-Age. That said, I think the longest period could still be capped by the HAProxy bucket's max-age. This would give more flexibility to the devs while still giving the ops the ability to decide where the flexibility should stop. See #251.
  • 2. Support ETag on both the client and server sides. ETag can be used to revalidate an object from client to HAProxy and from HAProxy to server:
  • [x] 2.1 client to HAProxy: HAProxy would return a 304 if the client's ETag matches the current object's ETag (#821)

  • [ ] 2.2 HAProxy to server: when an ETag is available in a cached object, HAProxy should reuse it when refreshing this object from the server. If the server returns a 304, HAProxy should apply a fresh cache period to this object

  • [x] 3. Support 304 answers (#821). When clients send conditional requests (containing If-Modified-Since and the like), HAProxy should be able to return a 304 instead of the full body (see the sketch after this list).

  • [x] 4. Support the Vary header. Vary is sent by the server to tell the cache to keep a different version of the object for each variation of the header(s) named in Vary.

  • [x] 5. Sample fetches (#900). A sample fetch which returns whether the response is a cache hit or not. A sample fetch returning the cache name could be useful too.
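For illustration, the conditional exchanges behind items 2.x and 3 would look roughly like this on the wire (a sketch with a hypothetical URL and validator value):

```
# Initial fetch: the server returns a validator along with the object
GET /logo.png HTTP/1.1
Host: www.example.com

HTTP/1.1 200 OK
Cache-Control: max-age=60
ETag: "abc123"

# Items 2.1/3 (client side): a conditional request whose validator still
# matches can be answered with a 304 and no body
GET /logo.png HTTP/1.1
Host: www.example.com
If-None-Match: "abc123"

HTTP/1.1 304 Not Modified

# Item 2.2 (server side): when refreshing an expired object, HAProxy would
# itself send If-None-Match: "abc123" to the server; a 304 renews the cache
# period without re-transferring the body.
```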

bedis avatar Aug 08 '19 12:08 bedis

Hi Baptiste,

I haven't tested the cache much yet, but do we already vary on the Host header? I'm thinking about 2 vhosts on the same backend with the cache enabled, both serving the same URI (www.toto.com/test.jpg and www.tata.com/test.jpg, when the two files are different).

aderumier avatar Aug 12 '19 05:08 aderumier

The host is always part of the hashing key, it doesn't need to be announced in Vary.

wtarreau avatar Aug 12 '19 06:08 wtarreau

Baptiste, please cut this one into one feature per entry, otherwise it's impossible to proceed or to follow progress on them, and we're back to the e-mail-based todo list.

Also, the cache-control:max-age is already handled by the cache so you can delete this entry.

wtarreau avatar Aug 12 '19 06:08 wtarreau

Also, the cache-control:max-age is already handled by the cache so you can delete this entry.

Hi Willy, the documentation is not clear about max-age. Is haproxy's max-age an upper limit? (E.g. if the application returns max-age 10000 and haproxy's max-age is 200, will the effective max-age be 200?)

An interesting feature could be to force max-age on the haproxy side (if the application doesn't set max-age, or if max-age is really low and we want to force it to a higher value in haproxy).

aderumier avatar Aug 12 '19 06:08 aderumier

My understanding has always been that haproxy's max-age is the boundary. But it's possible that the doc is unclear about this; improvement suggestions or patches are welcome (better to share this on the mailing list so that we don't go back and forth multiple times).
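Concretely, a minimal setup along those lines looks like this (a sketch; the sizes, names and addresses are illustrative):

```
cache smallcache
    total-max-size 4        # total cache size, in megabytes
    max-object-size 10000   # largest cacheable object, in bytes
    max-age 60              # upper bound in seconds: even if the server
                            # announces a larger Cache-Control max-age,
                            # entries are not served beyond this limit

backend app
    http-request cache-use smallcache
    http-response cache-store smallcache
    server s1 192.0.2.10:80
```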

Regarding forcing a higher max-age, I disagree. Not about the fact that it could be "interesting", as it could certainly create interesting problems, but in that it's a hack only to work around application deficiencies or misconfigurations, and each time you force a cache to keep an object longer than announced you create real problems. It starts with DNS records that don't propagate fast enough, it continues with pages on various sites which are not properly reloaded or updated, etc.

HAProxy's main function is to be a load balancer. It does a bit of caching because we found that for a lot of small objects, the cost of contacting the server is higher than delivering from local memory (hence the codename "favicon cache" in the past). However, caching is a tough job which must not be taken lightly. Doing it better or in a more advanced way requires some deep application knowledge and is far out of haproxy's scope.

Just to get a sense of how complex it can become once you put your finger there, have a look at Varnish. It had to create its own configuration language to describe the expected application behaviour. Thus for me the limit is simple: if you place one haproxy instance in front of 1000 applications, it's not at all an option that a change required for only one of these applications has even the slightest impact on any of the other ones.

This also means I'm not necessarily against having an "http-response" action to force the response's max-age (if technically feasible), because this would be based on explicit actions from the admin and would not risk breaking other applications as a side effect. But this is clearly the limit I'm willing to accept here.
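For what it's worth, something in that spirit can already be approximated today with a plain header rewrite scoped to a single backend (a sketch, not a dedicated cache action; it assumes the rewrite is declared before the cache-store rule so the cache stores the modified header):

```
backend app1
    # Override the announced freshness for this one application only;
    # other backends sharing the same haproxy instance are unaffected.
    http-response set-header Cache-Control "max-age=600" if { status 200 }
    http-request cache-use mycache
    http-response cache-store mycache
```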

wtarreau avatar Aug 12 '19 06:08 wtarreau

The max-age point was about the case where the Cache-Control value is lower than the one configured in HAProxy.

bedis avatar Aug 21 '19 07:08 bedis

I'll split up these issues into individual ones.

bedis avatar Aug 21 '19 07:08 bedis

Hello. The primitive cache in HAProxy blocked migrating our services from Nginx to HAProxy.

List of critical cache functionality:

  1. HTTP cache revalidation (ETag) between HAProxy and the backends - Nginx documentation
  2. Use stale cache while revalidating - Nginx documentation
  3. Cache lock: only one request is sent to the backend to update the cache - Nginx documentation

These points are part of the HTTP/1.1 specification; they are not a custom directive language as in Varnish.

This functionality has huge practical benefits:

  1. Revalidation makes it possible to reuse the HAProxy cache as much as possible without specifying a long cache lifetime; this is important for a dynamic backend
  2. If the backend is not available, HAProxy can respond to clients from the cache, because the backend explicitly allowed this in its response headers: Cache-Control: max-age=10, stale-while-revalidate=60, stale-if-error=1200. Even modern browsers understand this header, so HAProxy should be able to understand it too
  3. Cache lock: prevents a race condition on the backends. Only one request is sent to the backend; until the backend responds, all clients receive replies from the HAProxy cache, if the backend allows this in its response headers: Cache-Control: stale-while-revalidate=1200 (see the timeline sketch after this list)
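To make points 2 and 3 concrete, this is the timeline such headers describe (a sketch of the RFC 5861 semantics the backend opts into, not a description of current HAProxy behavior):

```
# Cache-Control: max-age=10, stale-while-revalidate=60, stale-if-error=1200
# t = 0s        response stored in the cache
# t = 0..10s    fresh: served directly from the cache
# t = 10..70s   stale: still served from the cache immediately, while a
#               single background request revalidates with the backend
#               (the cache lock ensures only one such request is in flight)
# backend down  stale responses may keep being served for up to 1200s
#               past expiration (stale-if-error)
```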

If you are interested and you plan to improve the cache, I can join your mailing list. Thanks.

uasan avatar Aug 26 '19 12:08 uasan

Thanks for your useful update. Regarding the cache lock we have actually already identified something bad that we intend to address, which is that while multiple requests may go to the server when the object is not in the cache, all of them will compete to place their response in the cache and this can make all of them fail for a while. So I'd add as first step to make sure that only one of those sent to the backend is used to feed the cache. For the other points, well I'd say I just don't know and will let William judge :-)

With this said, please keep in mind that our primary goal is not to be a full-fledged cache but a load balancer, and that the caching feature was added as a demand for minimalist caching to avoid bothering the servers for trivial stuff (hence why we used to call it a favicon cache). Cache contents are not kept across reloads for example. But your points above do not seem to contradict this at all and could possibly constitute nice improvements.

wtarreau avatar Aug 26 '19 13:08 wtarreau

So I'd add as first step to make sure that only one of those sent to the backend is used to feed the cache.

Yes, this is absolutely true. Maybe I didn't word it correctly, but Nginx has the same behavior.

uasan avatar Aug 26 '19 13:08 uasan

OK, thanks for confirming!

wtarreau avatar Aug 27 '19 02:08 wtarreau

I filed feature request #251 to detail what should be done for the CORS-specific stuff (caching of responses to preflight requests). This automatically removes request 1 (follow Access-Control-Max-Age).

Regarding Vary, I think we could do it without too much effort if we implement a single Vary combination (which is often true for most responses). The idea is that we can store a status bit in the response to an object lookup indicating whether it's an actual object or whether it's based on a Vary header. If it's based on a Vary header, then the object contains only the (normalized) Vary header, so that the lookup is performed again by appending all the values of the headers mentioned in this field. This gives an alternate caching key that will be used to retrieve a cached object or to fill it on a miss.

In case of a miss, the response comes from the server and we have to parse the Vary header. If it's present and the cache key is already an alternate key, then simply store the object. Otherwise just store the contents of the Vary header and set the bit indicating it's only a variant. This means that it will be possible to cache a first response in two steps without having to preliminarily duplicate the request: the first response is used to discover the contents of the Vary header, the second one to store the contents at the right location.
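A sketch of that two-step flow on a first miss (hypothetical object and header values):

```
# Request 1: primary key = hash(host + uri) -> miss, forwarded to the server
GET /page HTTP/1.1
Host: www.example.com
Accept-Encoding: gzip

HTTP/1.1 200 OK
Vary: Accept-Encoding
Content-Encoding: gzip

# Under the primary key, only a variant marker is stored:
#   { variant bit = 1, vary fields = "accept-encoding" }
# Request 2: the primary lookup returns the marker, so the key is recomputed
# as hash(host + uri + normalized Accept-Encoding value), and this second
# response is stored under that alternate key.
```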

wtarreau avatar Sep 03 '19 07:09 wtarreau

In my experience, using Vary greatly reduces cache hits and greatly increases cache size. I am wondering if anyone has real, useful experience of using Vary?

uasan avatar Sep 03 '19 10:09 uasan

I wouldn't be surprised that it could often be the case with a constrained size. I.e. if you vary on the user-agent (as some used to do for compression), you end up with thousands of copies of your home page in the cache. It could possibly be the same when caching CORS responses if there are many origins. For other use cases, like caching on a normalized accept-encoding, it can significantly improve the situation at the expense of using quite a bit more storage. Ideally a production cost would be assigned to each object and multiplied by its hit rate to decide which one to evict. This would for example make sure that large compressed objects are kept longer than small raw ones.

wtarreau avatar Sep 03 '19 11:09 wtarreau

The reason Vary is used ineffectively is non-canonical client header values. If the backend responds with Vary: Accept-Encoding, clients send request headers like:

Chrome: Accept-Encoding: gzip, deflate, br

Safari: Accept-Encoding: br, gzip, deflate

uasan avatar Sep 03 '19 12:09 uasan

This is exactly why I spoke about a normalized accept-encoding. You must decide how to convert such a list before using it as a lookup key. You can order known tokens the way you want, after trimming spaces etc., so that in the end both are the same.
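For example, with a hypothetical normalization that trims spaces and sorts the known tokens, both of the browser values above would map to the same alternate key fragment:

```
Chrome:  Accept-Encoding: gzip, deflate, br   ->  br,deflate,gzip
Safari:  Accept-Encoding: br, gzip, deflate   ->  br,deflate,gzip
```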

wtarreau avatar Sep 03 '19 13:09 wtarreau

is there any love for POST caching?

packeteer avatar Feb 12 '20 06:02 packeteer

is there any love for POST caching?

@packeteer Why do you even need that?

saubi1993 avatar Apr 30 '20 15:04 saubi1993

POST is not allowed to be cached, according to the RFC

chipitsine avatar Apr 30 '20 15:04 chipitsine

POST is not allowed to be cached, according to the RFC

That's not true. It's just not cached by default, but you can perfectly well use Cache-Control for this if you want.

@packeteer indeed there's no love for POST caching. The primary reason is simple: haproxy is first and foremost a load balancer. See it as a layer-7 router if you want. In order to optimize network usage and scale better on small objects, it supports some caching, but the primary goal is that it remains maintenance-free. This means that if, for example, you accidentally placed a wrong object in the cache, just wait a few seconds and it will vanish. This is critically important because you NEVER EVER want anyone to start to fiddle with your load balancer.

And that's the point where it starts to draw a line between regular caching and advanced caching. There are already excellent caches like Varnish, which for similar reasons also does a bit of load balancing, but that's not its primary goal. I'd say that if you need a cache in front of your application, you have to use a real cache. And you can place haproxy in front of varnish; the two complement each other excellently. In addition, enabling short-lived small-object caching in haproxy will further increase performance by avoiding forwarding such requests on the wire.
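A sketch of such a chain (hypothetical names and addresses; the small haproxy-side cache is optional):

```
# haproxy load-balances and keeps a short-lived small-object cache;
# varnish behind it does the real caching.
cache favcache
    total-max-size 4
    max-age 10

backend varnish_tier
    http-request cache-use favcache
    http-response cache-store favcache
    server v1 192.0.2.20:6081
```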

However if you mainly need load balancing and think "let's see what we can gain by enabling caching" then it's fine to enable haproxy's cache and it will show you some appreciable savings. That's what we do on haproxy.org for example.

Now, it might be possible that for some API gateways we'd find that some POST requests could benefit from specific caching and that we'd work in that direction. But in this case I suspect we'd make it so that users have to explicitly enumerate the requests that need it. Caching POSTs in general is extremely dangerous, as it means you don't perform the expected action on the server. Think about authentication requests... Think about logout pages which are supposed to destroy your context on the application server... Think about object deletion requests that could be repeated after multiple upload issues, while only the first one would work and the next ones would be cached and return "done" without doing anything... That's extremely dangerous, and usually clients should not use a POST to request data when the request is not supposed to act on the server.

wtarreau avatar Apr 30 '20 16:04 wtarreau

@packeteer Why do you even need that

because we have POSTs sent to our API which does calculations and returns a result. Now, the important part here is that the number of possible permutations has more than 30 zeroes, and complex calculations can take minutes. The data is "static" and only changes every 2 weeks, so we try to cache as much as possible to get responses to customers in a timely fashion.

FWIW, we are now using the Nuster project for our needs.

packeteer avatar May 04 '20 07:05 packeteer

complex calculations can take minutes. the data is "static" and only changes every 2 weeks

So this is a perfect example of something which requires a real cache and not a caching load-balancer. Just think that by trying to cache this on haproxy you will be terrified by the idea of reloading it, thinking that suddenly you may lose hours or days of computations and immediately kill your service. You must have persistent storage for this type of activity.

wtarreau avatar May 04 '20 08:05 wtarreau

Just think that by trying to cache this on haproxy you will be terrified by the idea of reloading it, thinking that suddenly you may lose hours or days of computations and immediately kill your service.

In order not to lose data during an HAProxy reload, the cache can be stored in the shared memory of the main process; then the cache will be stable. And/or store the cache in files, as Nginx does.

uasan avatar May 18 '20 14:05 uasan

Unfortunately it's not that simple, because the main process is exec()'d again during a reload to allow updates of the binary, so the anonymous mmap which is currently used won't work for this. It needs to be replaced by something portable which will survive an exec, without leaking once the process exits.

Regarding files, there is unfortunately no portable, standard way of doing async filesystem accesses without blocking, so it's against the project policy to do that; it could block a complete thread, which could be a serious problem.

wlallemand avatar May 18 '20 15:05 wlallemand

Ok, before reloading, the process could dump the cache into a file; when the process starts, it would read the dump from the file and load it into shared memory. I am sure your team can find effective solutions to these problems.

For my part, I want to say that developing the cache in HAProxy is very important, because Nginx solves this problem without sacrificing its balancing functions.

uasan avatar May 18 '20 15:05 uasan

As already explained by Willy, this is not what the haproxy cache will do. It's a simple cache for things like the favicon.

Haproxy is not nginx or varnish. You need to use the right tool for the job, and in this case, that is not haproxy.

lukastribus avatar May 18 '20 15:05 lukastribus

On Mon, May 18, 2020 at 08:18:49AM -0700, S.A.N wrote:

because Nginx solve this problem without sacrificing their balancing functions.

No, that's the opposite: Nginx implemented some load balancing without affecting its file-serving functions.

You have access to three awesome open-source components which work marvellously together (haproxy, varnish, nginx). Each of them excels at its function and does its best to cover the most basic parts of what the others do, in order to ease adoption and deployment of basic cases. As soon as you need something advanced, robust or performant, you must use the right product for each function, because you can't expect them to excel at everything, and you can't ask all their users to accept a severe degradation in what they do well for the sole purpose of simplifying your deployment.

Willy

haproxy-mirror avatar May 18 '20 16:05 haproxy-mirror

HAProxy -- http request --> Nginx(cache) -- http request --> Node.js

Explain why Nginx is in this chain. The cache should always be as close to the client as possible.

I understand when you talk about the Unix way, but real practice proves that products (like systemd) that solve a whole complex of related problems are very successful.

Thanks.

uasan avatar May 18 '20 16:05 uasan

We're trying to explain to you that a network-based product making file system accesses incurs latencies that are multiple orders of magnitude above what is acceptable for network processing, and that literally kills performance. Plus, the "solutions" that you propose, like "I'm sure you guys could develop this or that", clearly prove you don't know what you are talking about, since they bring absolutely zero value to the design proposal. If you want to mix haproxy with a file-based cache, have a look at the Nuster project. But then complain about your issues there, not here.

wtarreau avatar May 18 '20 16:05 wtarreau

I proposed an option with a file system by analogy with Redis; there, too, the file system is used only as a persistent store.

But by the way, Nginx responds to clients from cache files (230k files) in 5ms when a request hits the cache.

uasan avatar May 18 '20 17:05 uasan