Followup for #3014 (502 status)
Now that #3014 is merged, I have two patches to apply on top:
- Fail fetches with 502 status when relevant
- Show that directors can now specify an error status (test coverage only)
We could go further in core code, see commit messages for more information.
Split in two (possibly 3) subjects:
-
Does being more informative, ie: 502 vs. 503) responses require VCL bump to 4.2 ? A: YES.
-
Exactly what should be returned and when.
-
What can custom directors return and how/why.
Dridi to make attempt on first point.
I extracted the director question to its own pull request: #3017.
Before I make progress on this topic here is a summary of my digging:
- 502 (bad gateway): the backend response was empty (zero bytes EOF) or junk
- 503 (service unavailable): we couldn't even resolve or connect to a backend
- 504 (gateway timeout): the backend response headers fetch timed out
- 500 (internal server error): none of the above (eg. out of storage or workspace)
In hindsight, return (fail) should have led to a 500 response from day one, we should consider changing this too as part of vcl 4.2 to get back on the standards track (pun intended).
Custom directors (#3017) can be more specific but to be honest I don't see any other status they would use besides the list above. Besides "cluster" directors akin to vmod-director we can have "backend" directors like VBE, a well-behaved custom backend should err with the correct status.
#3017 was extracted to its own pull request because this is doable since #3014. The question now remains for our VBE backend implementation.
Ok, here is my decision, relevant for #3017 as well:
We will not change the current 503 behavior in VCL4.1
We will introduce this new behavior in VCL4.2
We can add a facility like pythons "from future import MAGICWORD" to enable the new behavior in branches which do not support VCL4.2. The exact syntax we use is TBD.
see also #3017 : It appears to me that this PR should be dropped in favor of the new error handling agreed at VDD19Q3
FYI, this pull request changes behavior, #3017 merely shows that a new ability was granted to directors as a side effect and adds coverage.
Reopening for a future VCL bump.
I pushed a new set of patches because they were lagging 500 commits behind.
This could in theory be merged before even introducing a VCL bump, it does take it into account but cannot add coverage for the new stuff until the chicken-egg dependency is solved.
Brief discussion with @bsdphk: for the VBE director implementation, we can add a new property to enable non-503 status codes depending on the error, disabled by default. We could toggle the default value in a future VCL bump, but we aren't making such plans until we have something concrete.
backend example {
.host = "example.org";
.gateway_errors = true;
}