vHive
vHive copied to clipboard
Invocation bugs creeping up at high request rates
Describe the bug I am running a multi-node cluster (Master + 2 Worker nodes), with vHive running on this cluster. When I deploy a short running function (execution time ~ 0.9 ms), using just the compute resource, at very high request rates (~250 ips), I see a lot of errors coming up. They look like: "Failed to invoke linpack-0.default.192.168.1.240.sslip.io:80, err=rpc error: code = Unavailable desc = no healthy upstream". This results in failed invocations for these cases. Note that there are some successful invocations here and there, too.
To Reproduce Use the vHive start-up guide to set up a 2-worker-node cluster. Deploy a quick running function (e.g. helloworld). Invoke it at a very high ips rate (~300), for a period of 5 minutes. Make sure the debug flag is set to true in the Invoker's client code. You should see errors like the ones above creep in, in some time. Also, increase the timeout parameter in the invoker code to a large enough value (~300000 sec).
Expected behavior Ideally, the invocations should have run normally, with successful returns.
Notes I am running this cluster on google compute instances. (Intel Skylake)