-srxbuf and -sTransient configuration
This ticket is to ask for a decision on a storage configuration interface change, triggered by the special stores rxbuf and Transient:
The current storage configuration format is -s<name>=<stevedore>[,arguments]. This implies that, unless stevedore implements some special magic, each name will refer to a separate instance of the stevedore.
The only way I am aware of to use a single instance for "normal objects", rxbuf and Transient is to configure the Transient storage and use it for all objects.
IMHO, this is far from POLA.
I am open for alternative suggestions, but the obvious solution to me seems to be to also support -s<special>=<name> to "alias" the special stores to some storage defined earlier.
This would also replace the special h2_rxbuf_storage parameter.
Example:
-smystore=engine,pa=ra,me=ter -sfast=engine,diff=erent,conf=ig -sTransient=mystore -srxbuf=fast
I would rather keep the h2_rxbuf_storage parameter and even consider extending the logic further, with for example shortlived_storage defaulting to Transient to maintain current behavior.
If we keep the current expectation that there is always a storage.Transient symbol, we can rely on two other precedents:
- no explicit storage selection results in round-robin selection
- none backend
My suggestion: varnishd -sTransient=none
@Dridi if we go with parameters, shouldn't those constitute aliases in the storage.* VCL object namespace? And shouldn't they be changeable at runtime?
e.g.
varnishadm param.set transient_storage storage.mystore -> VCL (storage.Transient == storage.mystore) == true
I think you misunderstood what I meant.
Right now we either define a storage backend called Transient or we get an implicit -s Transient=default. The transient storage is used under the hood for several purposes:
- uncacheable/private objects
- short-lived objects
It would have also been used for the h2 rxbuf if @bsdphk had not explicitly complained against it (https://github.com/varnishcache/varnish-cache/pull/3661#issuecomment-894109149).
Between "magic" storage conventions and parameters, the consensus was a parameter.
I am arguing that short-lived objects fall under the same category (the edit in https://github.com/varnishcache/varnish-cache/pull/3661#issuecomment-906237426 is from back then, way before this ticket was opened) and that we should get a shortlived_storage parameter that defaults to "Transient". What if I have a stevedore implementation that is optimized for short TTLs but not so much for uncacheable contents?
And finally, we introduced none backends, which sets another precedent.
I'm suggesting that we add -s Transient=none as a means to have no transient storage at all, but a symbol that would still exist. This way the existing parameter h2_rxbuf_storage and the hypothetical shortlived_storage default values would still work because Transient would be resolved, but core code would fall back to round-robin for actual storage selection. As a result you could get your setup like this:
--smystore=engine,pa=ra,me=ter -sfast=engine,diff=erent,conf=ig -sTransient=mystore -srxbuf=fast
+-smystore=engine,pa=ra,me=ter -sfast=engine,diff=erent,conf=ig -sTransient=none -p h2_rxbuf_storage=fast
What wouldn't work is the redirection from Transient to mystore, so we could imagine another parameter:
-smystore=engine,pa=ra,me=ter -sfast=engine,diff=erent,conf=ig -sTransient=none -p transit_storage=mystore -p h2_rxbuf_storage=fast
The name transit_storage is of course referring to #3572.
@Dridi so we have now three parameters on our list
-
h2_rxbuf_storage -
shortlived_storage -
transit_storage
I am not sure where I possibly misunderstood you, I can see your point regarding the parameters.
What I tried to say with my previous comment was, that there should also exist VCL objects reflecting the parameter settings such that VCL can explicitly query (and control) them.
To illustrate: If I change my transit_storage via a parameter change, I would also want have a storage.transit, which changes with it. By the above, we should also have storage.shortlived. I have no strong opinion whether or not want to keep storage.Transient, it can probably go.
To allow for a switft transition, we could set the tree parameters to Transient IFF storage.Transient is explicitly configured via the command line.
Oh, something I have in my backlog that I was keeping as a first exercise for a future team member (coming soon) was to add a subset of param.* symbols in VCL. I didn't have storage backends in mind, mostly timeouts for starters and object lifetime defaults.
Would that work for you?
-(storage.Transient == storage.mystore) == true
+(param.transit_storage == storage.mystore) == true
@Dridi it absolutely would work for me and I had the same idea yesterday, but thought it would be reaching too far. But now that there's two of us...
I've had this on the back of my mind for a few months already but had to restrain myself to keep it as a mentoring device...
Updated suggestion after bugwash:
-
-sarguments only define storages, not purposes - parameters define default storages for purposes, their default value is the first
-sargument. For a transition period, if-sTransientis defined, it becomes the default for all butdefault_storage-
h2_rxbuf_storage -
shortlived_storage -
transit_storage -
default_storage
-
- VCL gains read-only access to some parameters, the ones above in particular.
- the
*.storageVCL variables change semantics. They are initialized asNULLinternally.- if read,
NULLresolves toshortlived_storage,transit_storageordefault_storagebased on some built-in routing rules (basicallyvbf_allocobj()to start with) - if set, that value is used unconditionally
- if read,
More bugwash discussions yeterday.
A parameter like default_storage storage should default to "none" and fall back under the hood to round robin selection to match current behavior. However this opens a new can of worms, what should round robin extend to? Everything except the one named Transient? Or everything except the storage referenced by the transit_storage parameter? Or all the other "special" storage parameters?
What if with -s Transient=none -s uniq=something there's exactly one storage backend?
Another point raised is that the current behavior, implicitly using Transient for uncacheable and shortlived objects, is really unVCL. Maybe we should have more variables:
-
beresp.storage -
beresp.h2_rxbuf_storage -
beresp.shortlived_storage -
beresp.transit_storage
This way VCL could override global parameters (see https://github.com/varnishcache/varnish-cache/issues/3854#issuecomment-1254851646, we could also have the same but read-only variables under param instead of beresp).
Objections were raised (by me) for the following proposal:
the
*.storageVCL variables change semantics. They are initialized asNULLinternally.
- if read,
NULLresolves toshortlived_storage,transit_storageordefault_storagebased on some built-in routing rules (basicallyvbf_allocobj()to start with)
We should avoid magic implicit resolution, and if we really want to probe which storage to expect at a given time, a VMOD would probably be a better fit.
The question of multi-tenant systems and the ability to restrict a given VCL label to a subset of available storage backends was raised. The conclusion was that this problem space probably deserves to be on a VDD agenda.
IRC followup discussion:
10:49 < dridi> i had an idea but didn't give it much thought: using tags 10:49 < dridi> we create storage -s foo=stuff arguments 10:49 < dridi> we then tag storage backends 10:50 < dridi> for example the h2_rxbuf tag means it can be used for that 10:50 < slink> that's basically the same as multiple values for parameters 10:50 < dridi> then you can have arbitrary tags, like tenant1 10:51 < dridi> and you tag a label with tenant1 10:51 < slink> ah i see where you are heading 10:51 < dridi> then any storage backend with both "tenant1, h2_rxbuf" may be used in that scenario 10:51 < dridi> Transient has an implicit transit tag 10:52 < dridi> if a tenant doesn't have a storage with both "tenantX, transit" tags, it may fall back to just transit 10:54 < dridi> if there is no transit tag (or -s Transient=none) => round robin on all tenantX tags 10:54 < dridi> same for any other storage selection 10:54 < dridi> (shortlived, normal, h2_rxbuf) 10:55 < dridi> so we'd have new cli commands like storage.tag
and vcl.tag