[FEATURE REQUEST]: Add HTTP Cache Middleware [IMPLEMENTATION TRACKING TICKET]
Is your feature request related to a problem? Please describe.
RoadRunner is already handling SSL, as a replacement for a load balancer, nginx, or a similar service. It would be nice to support a kind of HTTP cache, so things like Varnish (which doesn't support SSL) could be handled as well.
Describe the solution you'd like
HTTP cache handling could be added as a kind of middleware, I think. It could be implemented in different steps, so not all types of caching need to be added at once.
- [x] Level 1: HTTP cache over the `Cache-Control` and `Vary` headers, built on top of the `public` and `s-maxage` attributes for `GET` and `HEAD` requests (HTTP specification). Added: the `Age` response header should be set correctly when a response is loaded from the cache. Issue: https://github.com/spiral/roadrunner-plugins/issues/187 (see the sketch after this list for the `s-maxage`/TTL decision)
- [x] Level 2: PURGE request to invalidate the cache, with a configuration of which IPs are allowed to purge (Varnish Purge).
- [x] Level 3: Custom TTL header for the cache lifetime instead of the widely used `s-maxage`: `X-Reverse-Proxy-TTL` (Symfony FOSHttpCache CustomTtlListener). When present, this header value is used instead of `s-maxage` to determine how long the response is cached (FOS Custom TTL Docs).
- [x] Level 4: BAN request with regex (Varnish BAN).
- [x] Level 5: Tag-based caching. Responses are tagged via the `X-Cache` header, with invalidation also possible by tags (Varnish Key).
- [ ] Level 6: ESI / Edge Side Includes (https://en.wikipedia.org/wiki/Edge_Side_Includes) allows not caching, or caching differently, parts of a page. A Surrogate ESI header needs to be sent so that ESI tags are returned.
- [x] Level 7: Grace/Keep mode (a request to a cached page which has expired still returns the cached page, but the cache is refreshed in the background, which means always-fast responses to the client), with an X/Y-Key feature to just invalidate the cache but not remove it (Varnish Grace Mode Reference / Varnish YKEY).
- [ ] Level 8: User context caching (an endpoint is first requested to get the user context, which is then used as cache context), so that, for example, different security groups can be cached (FOSHttpCache User Context Caching).
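A minimal sketch of the Level 1 decision in Go (the helper name and its placement are illustrative, not the actual plugin API): only `GET`/`HEAD` responses marked `public` with an `s-maxage` attribute would be stored, and the TTL is taken from `s-maxage`. The `Vary` header would additionally widen the cache key with the listed request headers.

```go
package cache

import (
	"net/http"
	"strconv"
	"strings"
	"time"
)

// ttlFromResponse reports whether a response is cacheable under the Level 1
// rules (GET/HEAD, Cache-Control: public, s-maxage) and, if so, returns the
// TTL derived from the s-maxage attribute.
func ttlFromResponse(req *http.Request, resp *http.Response) (time.Duration, bool) {
	// Only GET and HEAD requests are considered cacheable.
	if req.Method != http.MethodGet && req.Method != http.MethodHead {
		return 0, false
	}

	cc := resp.Header.Get("Cache-Control")
	if !strings.Contains(cc, "public") {
		return 0, false
	}

	// Look for the s-maxage=<seconds> attribute.
	for _, d := range strings.Split(cc, ",") {
		d = strings.TrimSpace(d)
		if strings.HasPrefix(d, "s-maxage=") {
			secs, err := strconv.Atoi(strings.TrimPrefix(d, "s-maxage="))
			if err != nil || secs <= 0 {
				return 0, false
			}
			return time.Duration(secs) * time.Second, true
		}
	}
	return 0, false
}
```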
Describe alternatives you've considered
Using Varnish. But then, sadly, RoadRunner can no longer be used as the SSL provider, and an additional nginx in front of Varnish is required to handle SSL.
Additional Context
Multi-domain support should be kept in mind: example.org/test is different from example.com/test.
Things which should be discussed: file-based or in-memory cache? A hybrid solution would be great, to be as fast as possible with low memory usage.
Static file response content probably doesn't need to be stored in the cache additionally?
It's a great proposal, thanks @alexander-schranz. I'll plan it after v2.5.0 is released (~18th of October).
@rustatian I updated the issue with some hopefully helpful links.
@alexander-schranz Great, thank you!
Hey @alexander-schranz, sorry for the delay, got sick with covid in November.
I am going to support at least Redis, boltdb, and in-memory (LRU, with configurable capacity) drivers for the cache in the first phase. The basic idea is to have a wrapper on the RR side that would work with any driver (Varnish, Redis, Postgresql (haha)) which you can easily configure via .rr.yaml.
@rustatian No problem, I hope you are fine and healthy again. Take the time you need!
For Varnish I think there is nothing required to do, as RoadRunner would just need to forward the headers from the application correctly, like it currently does, and purge requests would go directly to Varnish and not hit RoadRunner in that scenario.
For the built-in cache directly in RoadRunner, which is what I wanted to target in the proposal, Redis sounds great as storage; with boltdb I'm not familiar, but it seems to be something similar. A file-based cache would also be great, to work out of the box without an additional service besides RoadRunner. In the end I think the storage should just be abstracted so we can add additional ones in the future; we just need to make sure that the abstraction correctly supports the scenarios above, based on the HTTP standard headers and common practices listed. The storage is required to cache the content plus the response headers. We should also make sure to set/calculate the correct Age response header when a response is loaded from the cache, so the response tells how old the entry is. I forgot that one, added it above for completeness. So we need the response content, the response headers, and the time when the entry was cached.
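As a rough illustration of that storage shape (the type and method names are made up here, not the plugin's actual types), an entry could keep the body, the headers, and the moment it was stored, and the `Age` header is then just the elapsed time since that moment:

```go
package cache

import (
	"net/http"
	"strconv"
	"time"
)

// Entry is an illustrative cached response: body, headers, and the
// moment it was written to the cache.
type Entry struct {
	Body     []byte
	Header   http.Header
	StoredAt time.Time
	TTL      time.Duration
}

// Expired reports whether the entry has outlived its TTL.
func (e *Entry) Expired(now time.Time) bool {
	return now.Sub(e.StoredAt) > e.TTL
}

// WriteTo replays the cached response and sets the Age header to the
// number of seconds the entry has spent in the cache.
func (e *Entry) WriteTo(w http.ResponseWriter, now time.Time) {
	for k, vv := range e.Header {
		for _, v := range vv {
			w.Header().Add(k, v)
		}
	}
	age := int(now.Sub(e.StoredAt).Seconds())
	w.Header().Set("Age", strconv.Itoa(age))
	w.WriteHeader(http.StatusOK)
	_, _ = w.Write(e.Body)
}
```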
In the end I just thought it would be a great addition to have a built-in HTTP cache directly in RoadRunner, to make use of the SSL feature in our applications which require such a cache. Still, make sure you are doing fine, no stress on this one; work on it when you like and don't forget to enjoy other things! For any questions, or if I can test or provide you something, just ping me here, happy to help if I can.
> I hope you are fine and healthy again.

Yeah, everything is good now, thanks!
> A file-based cache would also be great, to work out of the box without an additional service besides RoadRunner.
BoltDB is a file-based cache. It is a sqlite3 analog. So, no configuration from the user's end, just specify the driver and run. Much like the memory driver, but it persists between RR restarts :)
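For illustration only, a tiny sketch of how a BoltDB-backed driver could persist serialized entries between restarts, using the go.etcd.io/bbolt package (the bucket name and helper functions here are hypothetical):

```go
package boltcache

import (
	"time"

	bolt "go.etcd.io/bbolt"
)

// open creates (or reopens) the cache file; no external service is needed.
func open(path string) (*bolt.DB, error) {
	return bolt.Open(path, 0o600, &bolt.Options{Timeout: time.Second})
}

// put stores a serialized cache entry under the request key.
func put(db *bolt.DB, key, value []byte) error {
	return db.Update(func(tx *bolt.Tx) error {
		b, err := tx.CreateBucketIfNotExists([]byte("http-cache"))
		if err != nil {
			return err
		}
		return b.Put(key, value)
	})
}

// get reads a previously stored entry; a nil result means a cache miss.
func get(db *bolt.DB, key []byte) ([]byte, error) {
	var out []byte
	err := db.View(func(tx *bolt.Tx) error {
		b := tx.Bucket([]byte("http-cache"))
		if b == nil {
			return nil
		}
		if v := b.Get(key); v != nil {
			out = append([]byte(nil), v...) // copy: the value is only valid inside the tx
		}
		return nil
	})
	return out, err
}
```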
> For any questions, or if I can test or provide you something, just ping me here, happy to help if I can.
Great, thank you very much.
If you don't mind, I'll mark this ticket as an implementation tracking ticket (thank you one more time for the good and detailed request) and it'll stay open till the last level is implemented. For RR 2.7 (middle of Jan) I'll implement Level 1, and will add 1-2 levels per RR version, just to make sure that everything goes smoothly and w/o bugs, so you and the community will have the ability to seamlessly integrate the cache into the existing ecosystem.
CI closed the issue by mistake.
@rustatian Really nice to see that you had some time to implement a first version.
You are very welcome!
It'll take some time to implement the whole RFC-7234 and add new drivers (Varnish, BoltDB, etc.), but we won't stop, I promise!
Hello everyone!
I was thinking about another cache system implementation. I already wrote an HTTP cache system (called Souin) used by the Caddy cache-handler module and compatible with Træfik, Tyk and many other reverse proxies/API gateways. It supports RFC-7234, can partially cache GraphQL requests, and can invalidate using xkeys/ykeys like Varnish. It also implements the Fastly purge using the Surrogate-Key header. It can store to and invalidate a CDN (Cloudflare, Fastly, Akamai), and it supports the Cache-Status RFC directive. It implements two in-memory/fs storages (Badger & NutsDB) and two distributed storages (Olric & etcd) that are fully configurable. The keys can be tweaked (e.g. serve the same cached CSS for multiple domains) and the Cache-Status name can be changed through the configuration.
What do you think about implementing it in the cache repository and making something like the cache-handler: the Souin repo could be the stable development repository and roadrunner-server/cache could be the ultra-stable, production-ready one.
Or we can reimplement each feature directly in the roadrunner-server/cache repository.
Let me know your preference about that.
Hey @darkweak, nice to meet you!
> What do you think about implementing it in the cache repository and making something like the cache-handler: the Souin repo could be the stable development repository and roadrunner-server/cache could be the ultra-stable, production-ready one. Or we can reimplement each feature directly in the roadrunner-server/cache repository.
We may have a Souin handler in the roadrunner-server/cache repository (to not repeat the code). You may delete everything from the Middleware (https://github.com/roadrunner-server/cache/blob/master/plugin.go#L94) and put your code in it.
You may propose a storage interface since we need to store the requests. Previously I used a straightforward one: https://github.com/roadrunner-server/api/blob/master/plugins/cache/interface.go#L7. It's not final; feel free to change it.
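Something in that spirit might look roughly like the following; this is an illustrative approximation, not the definition in roadrunner-server/api:

```go
package cache

// Storage is an illustrative approximation of the kind of driver interface
// discussed above: the middleware only needs to read, write, and invalidate
// serialized responses by key, so any backend (in-memory, BoltDB, Redis, ...)
// can implement it.
type Storage interface {
	// Get returns the stored payload for the key, or a miss.
	Get(key string) (value []byte, ok bool, err error)
	// Set stores the payload together with its time-to-live.
	Set(key string, value []byte, ttlSeconds int) error
	// Delete removes a single entry (used by PURGE).
	Delete(key string) error
}
```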
@darkweak Souin sounds very promising. If I understand it correctly, Souin itself can also be used without a reverse proxy? And so Souin is doing the caching? What is supported by Souin from the things listed above? If I understand it correctly, Souin would then just be compiled into RoadRunner and I wouldn't need an additional application running? Because what I'm targeting is out-of-the-box support for caching without an additional reverse proxy.
@alexander-schranz Yes it can be used as a middleware (http.Handler). The more complex part will be the configuration parsing I think.
@darkweak We use the same config type: yaml. So, you may add any configuration you need under the http.cache key.
EDIT: Here is the RR's cache configuration: https://github.com/roadrunner-server/roadrunner/blob/master/.rr.yaml#L557
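Purely as an illustration of where such options could land (the field names below are hypothetical; the authoritative shape is the linked .rr.yaml), a plugin could unmarshal the `http.cache` key into a small struct:

```go
package cache

// Config is a hypothetical shape for options read from the http.cache key
// of .rr.yaml; the field names are illustrative, not the real configuration.
type Config struct {
	// Driver selects the storage backend, e.g. "memory" or "boltdb".
	Driver string `mapstructure:"driver"`
	// TTL is a default lifetime (in seconds) for entries without s-maxage.
	TTL int `mapstructure:"ttl"`
	// CacheMethods lists the HTTP methods that may be served from the cache.
	CacheMethods []string `mapstructure:"cache_methods"`
}
```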
@darkweak If you need any support from my side, I'd be happy to help. You may also join our Discord server and ping me directly.
@darkweak @alexander-schranz Starting from RR 2.11.0, the Souin cache will be the default cache plugin for RR.
@darkweak Could you please tell me what features from this feature request are supported by Souin?
According to the docs, I guess Souin supports all described features except Edge Side Includes, am I right?
@rustatian ATM it doesn't support ESI or the Level 8 user context caching section.
- ESI is not a validated RFC, can cause cache pollution, and adds latency to the cache system.
- The user context caching won't be included in the cache system because it depends on the application context.
I plan to work on ESI support, but it takes time to build a robust and efficient system.
Got it, thanks!
@darkweak nice to hear that we could achieve most of the things via Souin. In the end, the remaining open items, even if not officially supported by Souin, could still be achieved if hooks are provided.
**Edge Side Includes**
This is some kind of a "Response" Hook directly before content is send back to the user (after response is already saved to the cache). So if Souin provides a response hook which allows to manipulate the response content it can achieve this. As additional "plugin", ... could hook into it parse the response and and replaces the esi-includes with the content.
**User Context Based Caching**
This is some kind of a "Request" Hook, the more common example is a cache based on "User-Agent" header. So I want to have "mobile" and "desktop" browser different content. So when the "Request" is coming in I'm parsing "User-Agent" and normalize it into a custom X-User-Agent: mobile" / X-User-Agent: desktop header and tell Souin that the cache key is not only is the Url of the page but also the X-User-Agent. So Souin would here need to support then that we can create a "Request Hook" and that "Caching Context" can only be the Url but also a request header. The plugin can then via response hook at Vary: User-Agent and return the content.
For a real "user context based" caching the plugin can also get the "user context" for role based caching from the application, but that is something Souin would not need to take into account as that logic would then not live there.
The hooks also don't need to live in Souin; they could also live in RoadRunner, maybe already possible via its middleware architecture.
So ESI could be its own middleware which Souin doesn't need to support if you don't want it there.
For user context based caching, Souin would need to support that caching works not only based on the URL but also on an additional sent header, which can be configured.
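As a concrete illustration of that request-hook idea, here is a plain net/http middleware sketch (hypothetical names, not Souin's or RoadRunner's actual API): it normalizes User-Agent into a small set of values, exposes that as `X-User-Agent` so the cache can key on URL + header, and advertises it via `Vary`.

```go
package usercontext

import (
	"net/http"
	"strings"
)

// deviceFromUserAgent normalizes the User-Agent into a tiny cache context.
func deviceFromUserAgent(ua string) string {
	if strings.Contains(strings.ToLower(ua), "mobile") {
		return "mobile"
	}
	return "desktop"
}

// Normalize is a hypothetical request hook: it writes the normalized context
// into X-User-Agent so the cache can key on URL + X-User-Agent, and sets Vary
// so downstream caches know responses differ by it.
func Normalize(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Header.Set("X-User-Agent", deviceFromUserAgent(r.Header.Get("User-Agent")))
		w.Header().Add("Vary", "X-User-Agent")
		next.ServeHTTP(w, r)
	})
}
```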
Some context on "ESI": I currently see two different implementations of ESI around.
**The simple solution: parse the whole response content and replace**
This is mostly implemented in userland currently, like the Symfony framework in PHP. Before it begins to send the content to the browser, it parses the whole content and replaces all ESI includes, then sends all the content together.
**The performance solution: streamed sending and replacing**
If you use ESI via Varnish, you will see that Varnish sends the cached content directly, but detects the ESI while sending it. So when an ESI tag appears, everything that was in the response content before it has already been sent to the browser. Varnish then makes the request, waits for its response, sends that response to the browser as well, and continues with the rest of the cached item. This is very performant, as the whole content doesn't need to be kept in memory, but it is also more complex to implement.
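A toy version of the simple, non-streaming approach (a sketch only; a real implementation handles more ESI tag forms, error fallbacks, and nested includes): scan the cached body for `<esi:include src="..."/>` tags and splice in the sub-request responses before anything is sent to the client.

```go
package esi

import (
	"io"
	"net/http"
	"regexp"
)

// esiInclude matches the minimal <esi:include src="..."/> form only;
// a real implementation must cover more of the ESI language.
var esiInclude = regexp.MustCompile(`<esi:include\s+src="([^"]+)"\s*/?>`)

// Expand replaces every include tag in the (already cached) body with the
// content fetched from its src URL. Errors simply drop the fragment here.
func Expand(body []byte, client *http.Client) []byte {
	return esiInclude.ReplaceAllFunc(body, func(tag []byte) []byte {
		src := esiInclude.FindSubmatch(tag)[1]
		resp, err := client.Get(string(src))
		if err != nil {
			return nil // a production version would keep a fallback fragment
		}
		defer resp.Body.Close()
		fragment, err := io.ReadAll(resp.Body)
		if err != nil {
			return nil
		}
		return fragment
	})
}
```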
> For user context based caching, Souin would need to support that caching works not only based on the URL but also on an additional sent header, which can be configured.
There are the default_cache headers and urls headers directives in the configuration to add more properties to the key generation. But I'm not sure how well it works.
The ESI tags are now handled in the latest version. I plan to implement the response streaming later.
Can you update the tasks and check the ESI support please? :)
> The ESI tags are now handled in the latest version. I plan to implement the response streaming later. Can you update the tasks and check the ESI support please? :)

Sure, thank you very much for your work!
@darkweak nice to see this is still moving forward. I want to mention a new RFC which is an alternative to the Level 3 custom TTL header part. While the custom TTL header is still supported and used in Symfony applications, a new RFC targeting a common issue is now around. I already opened a discussion in Symfony about supporting it: https://github.com/symfony/symfony/issues/47288
The source of the RFC is https://datatracker.ietf.org/doc/rfc9213/ and it is called Targeted HTTP Cache Control. As an example, the application can define which specific cache-control headers it takes into account. E.g. an application could set:
`Cache-Control`, `Roadrunner-Cache-Control`, `Fastly-Cache-Control`
RoadRunner would in this case only look at `Roadrunner-Cache-Control` and ignore the others. The difference to `X-Reverse-Proxy-TTL` is that it does not only carry a number; it carries all the common cache directives: max-age, must-revalidate, etc. The difference to the `Cache-Control` header is that max-age is the relevant part, not s-maxage (which does not exist in targeted cache-control headers).
Reading about it, it is common that reverse proxies take `CDN-Cache-Control` into account plus a specific one; in the case of Fastly it is `Fastly-Cache-Control`. In our case it could be `CDN-Cache-Control`, `Roadrunner-Cache-Control`, `Souin-Cache-Control`. When none of these headers is present, the `Cache-Control` header's s-maxage should in my opinion still be taken into account, but I think that is also described in the RFC.
So the target of the RFC is that an application can have multiple reverse proxy caches, and with specific headers the different reverse proxy caches can be controlled. `CDN-Cache-Control` is the one currently supported by all, but every provider has its own one specifically targeting it (`Cloudflare-Cache-Control` / `Fastly-Cache-Control` / `Akamai-Cache-Control`), so it would be nice to have a `Roadrunner-Cache-Control` or `Souin-Cache-Control`.
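A sketch of the precedence a proxy following RFC 9213 could apply (the helper is illustrative, not an existing RoadRunner or Souin API): the proxy-specific targeted header wins, then `CDN-Cache-Control`, and only then the plain `Cache-Control` with its s-maxage.

```go
package cache

import "net/http"

// targetedCacheControl returns the cache-control value this proxy should
// honour, following the RFC 9213 idea: a proxy-specific targeted header
// takes precedence over CDN-Cache-Control, which takes precedence over
// the generic Cache-Control header.
func targetedCacheControl(h http.Header) (value string, targeted bool) {
	for _, name := range []string{"Roadrunner-Cache-Control", "CDN-Cache-Control"} {
		if v := h.Get(name); v != "" {
			// In targeted headers max-age applies to this proxy directly;
			// there is no s-maxage variant.
			return v, true
		}
	}
	// Fall back to the shared rules: Cache-Control with s-maxage.
	return h.Get("Cache-Control"), false
}
```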
To keep you updated:
I'm working hard on the streaming response, but it's hard to make ESI streamable because there are a lot of calculations needed to handle and process the ESI tags asynchronously.
The {cache_name}-Cache-Control support will be easy to implement in the Souin codebase because we already support a dynamic cache name depending on the configuration.
@darkweak nice to hear that you are working on it. Basic ESI support already sounds great; streaming is just a performance improvement which would be nice but isn't required to fulfill the support.
The custom cache-control headers sound great. Do things like the stale-while-revalidate and stale-if-error cache directives also work there?
@alexander-schranz Yep, it supports the stale- directives :)
The {name}-Cache-Control is now merged in master and I tagged the new version to include it!
Hey guys!
As far as I understand (correct me if I am wrong), level 8 is implemented via the default_cache headers and urls headers configuration options. And since this is the maximum we can do for caching (keeping in mind that RR is not a web server), may I close this ticket as done? @alexander-schranz
Closing this as done. Thank you very much @alexander-schranz @darkweak!