spicedb
Intermittent 502s when running SpiceDB behind AWS ALB
What platforms are affected?
linux
What architectures are affected?
amd64
What SpiceDB version are you using?
v1.34.0-amd64
Steps to Reproduce
When we make basic check permissions requests in a loop through an AWS ALB to SpiceDB, the following error comes up after some amount of time (usually within ~20 minutes):
Err terminated with errors error="rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway)"
The error resolves on subsequent requests without any changes on our side, but in the meantime a request has failed, and the error keeps coming up at seemingly arbitrary points if we let the loop continue. The SpiceDB target group protocol is set to gRPC.
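A minimal sketch of the kind of loop described above, useful for recording when the transient failures occur. The check argument is a hypothetical stand-in for whatever function issues a single check permissions request through the ALB; it is not part of any SpiceDB client API, and the harness only assumes that a failed request raises an exception.

```python
import time
from typing import Callable, List, Tuple


def run_check_loop(
    check: Callable[[], None],
    iterations: int,
    delay_s: float = 0.0,
) -> List[Tuple[int, float, str]]:
    """Call `check` repeatedly and record every failure.

    `check` is a hypothetical callable that performs one request and raises
    on error. Returns a list of (iteration index, seconds since the loop
    started, error message) tuples, one per failed call.
    """
    failures = []
    start = time.monotonic()
    for i in range(iterations):
        try:
            check()
        except Exception as exc:  # keep looping: later requests may succeed
            failures.append((i, time.monotonic() - start, str(exc)))
        if delay_s:
            time.sleep(delay_s)
    return failures
```

Comparing the recorded offsets against the ALB access log timestamps may help show whether the failures line up with a fixed interval or are spread arbitrarily.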
Expected Result
We expect not to see these transient errors running SpiceDB behind an ALB.
Looking into the access logs for our ALB, we see the 502. In the log entry, request_processing_time and target_processing_time are both set, meaning the request reached SpiceDB. We're not sure how to interpret the combination: target_processing_time being set suggests the load balancer may have received headers from SpiceDB, but response_processing_time is -1, meaning the load balancer didn't receive a response from the target. AWS suggests that this might happen when either the target closed the connection while the load balancer had an outstanding request, or the target response is malformed or contains invalid HTTP headers.
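The pattern described above can be picked out of ALB access logs with a small script. This is a sketch, assuming the standard space-delimited ALB access log layout with quoted request and user agent fields; only the leading fields are named here, and SAMPLE is a made-up entry for illustration, not a line from our logs.

```python
import shlex
from typing import Dict

# Leading fields of an ALB access log entry, in order. Later fields
# (TLS details, target group ARN, trace ID, ...) are ignored in this sketch.
ALB_FIELDS = [
    "type", "time", "elb", "client_port", "target_port",
    "request_processing_time", "target_processing_time",
    "response_processing_time", "elb_status_code", "target_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
]

# Hypothetical entry: every value here is invented for illustration.
SAMPLE = (
    'grpcs 2024-07-01T12:00:00.000000Z app/example-alb/0123456789abcdef '
    '10.0.0.5:51234 10.0.1.7:50051 0.001 0.004 -1 502 - 120 0 '
    '"POST https://spicedb.example.internal:443/authzed.api.v1.'
    'PermissionsService/CheckPermission HTTP/2.0" "grpc-go/1.64.0"'
)


def parse_alb_entry(line: str) -> Dict[str, str]:
    """Split one log line, honoring double-quoted fields, and name the leading fields."""
    return dict(zip(ALB_FIELDS, shlex.split(line)))


def reached_target_without_response(entry: Dict[str, str]) -> bool:
    """True for a 502 where the request was dispatched to the target
    (both processing times set) but no response came back (-1)."""
    return (
        entry["elb_status_code"] == "502"
        and float(entry["request_processing_time"]) >= 0
        and float(entry["target_processing_time"]) >= 0
        and float(entry["response_processing_time"]) < 0
    )


if __name__ == "__main__":
    print(reached_target_without_response(parse_alb_entry(SAMPLE)))
```

Filtering real log lines through reached_target_without_response would separate this case from 502s where the request never reached SpiceDB at all.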
We tried setting the --grpc-max-conn-age flag to a large value to rule out SpiceDB's keep-alive being shorter than the load balancer's timeout, but still saw the same errors.
Actual Result
Error: