faas-netes
faas-netes copied to clipboard
Simplify faas-netes timeout configuration for the openfaas chart
Description
Simplify faas-netes timeouts in the openfaas chart
Use the geteway readTimeout and writeTimeout values as default if faasnetes.readTimeout or faasnetes.writeTimeout is not set.
Motivation and Context
OpenFaaS has a many different timeouts to keep track of. In most cases timeouts for the gateway and faas-netes are set to the same value. To simplify configuration the values set for the gateway can be used as defaults for faas-netes.
How Has This Been Tested?
- Deployed OpenFaaS Pro with this updated chart and set some small timeouts for testing.
helm upgrade openfaas --install chart/openfaas \
--namespace openfaas \
--set functionNamespace=openfaas-fn \
--set generateBasicAuth=true \
--set openfaasPro=true \
--set autoscaler.enabled=true \
--set gateway.readTimeout=5s \
--set gateway.writeTimeout=5s \
--set gateway.upstreamTimeout=4s
Verified the correct env values are set in the deployment by running
kubectl describe deploy/gateway -n openfaas
- Deployed the sleep function
faas-cli store deploy sleep
- Invoked the function using hey
hey -n 200 -c 4 -H "X-Sleep: 3900ms" http://127.0.0.1:8080/function/sleep
Summary:
Total: 195.1964 secs
Slowest: 3.9062 secs
Fastest: 3.9034 secs
Average: 3.9039 secs
Requests/sec: 1.0246
Total data: 22800 bytes
Size/request: 114 bytes
Response time histogram:
3.903 [1] |
3.904 [46] |■■■■■■■■■■■■■■■■■■■
3.904 [97] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
3.904 [42] |■■■■■■■■■■■■■■■■■
3.905 [10] |■■■■
3.905 [0] |
3.905 [0] |
3.905 [0] |
3.906 [0] |
3.906 [0] |
3.906 [4] |■■
Latency distribution:
10% in 3.9036 secs
25% in 3.9037 secs
50% in 3.9038 secs
75% in 3.9040 secs
90% in 3.9042 secs
95% in 3.9044 secs
99% in 3.9062 secs
Details (average, fastest, slowest):
DNS+dialup: 0.0000 secs, 0.0000 secs, 0.0002 secs
DNS-lookup: 0.0000 secs, 0.0000 secs, 0.0000 secs
req write: 0.0000 secs, 0.0000 secs, 0.0001 secs
resp wait: 3.9038 secs, 3.9033 secs, 3.9060 secs
resp read: 0.0000 secs, 0.0000 secs, 0.0001 secs
Status code distribution:
[200] 200 responses
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Checklist:
- [x] My code follows the code style of this project.
- [x] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
- [x] I've read the CONTRIBUTION guide
- [x] I have signed-off my commits with
git commit -s
- [ ] I have added tests to cover my changes.
- [ ] All new and existing tests passed.
@kevin-lindsay-1 WDYT?
Sorry, been swamped.
I'm not in love with this approach, because I don't really like the gateway acting as global config in the first place. Aren't basically all timeouts "recommended" to be based off a base timeout that a user specifies? For example, most users probably leave read and write the same as their main timeout, and then simply take timeout - 5s
(or a percentage) for exec_timeout.
Seems to me that you could condense all of this with 1 timeout and do a little math to figure out what to set it to under the hood, just like how the pod's graceTerminationPeriodSeconds
is something like exec_timeout - 3s
.
Specifically for the idea of a global config, I have almost no functions that I want a timeout inferred, so I'd use the gateway's "default" config exactly 0% of the time. I'd rather the gateway/k8s scream that I haven't specified a timeout, because I can't imagine a scenario where a default timeout assists me as the developer in knowing what's going on. I don't really like "environmental" config like this.
That said, I think it would be valuable to override these calculated settings for functions. I think if a specific timeout is not specified, it should be inferred from a "main" timeout value. Then, users can focus on setting only 1 timeout value, and if they need to make a specific timeout a particular value, they can override. I think this approach would also work fine on the gateway, as it would likely result in the same thing that already exists, but with less config/boilerplate.
If you care, I use https://github.com/creasty/defaults for this, and it works very well for my use cases, including more complex things like overrides.
Hi Kevin,
It doesn't sound like we've explained ourselves well here.
There is already a global timeout which is set at the gateway and has to be copied exactly to faas-netes.
They should not and can not be different and this PR simply reflects that, and removes the amount of places where it needs to be updated to work. It's purely a user-experience change, and not a functional change.
If you like to do extra typing, the change means that you can carry on copying the timeout from the gateway's values.yaml section to the faasnetes section.
Alex