gRPC-haskell
Timeouts Far Earlier Than Expected
I wrote a quick-and-dirty script to test gRPC vs Thrift Haskell libraries when moving large, structured data over a channel. As can be seen by running the main application, even though all client timeouts are set to 10000000 seconds, I consistently get a time-out after only a second or so:
```
ClientIOError GRPCIOTimeout
CallStack (from HasCallStack):
error, called at src/GRPCvsThrift/GRPCClient.hs:43:43 in gRPC-vs-Thrift-0.1.0.0-94CrnjUzsegHPgjx1PI6me:GRPCvsThrift.GRPCClient
```
I cannot figure out why this is. Is it possible that a max size limit violation is being interpreted as a timeout? Can the max size limit be changed?
This only occurs when the data structure being sent is large enough. To give an idea of the approximate size: when `show` is called on a "too large" data structure, the result is about 10,000,000 characters long, while a known "not too large" data structure `show`s to about 6,000,000 characters.
Help?
P.S. gRPC is way, way faster than Thrift when sending big messages (~ 20-30x). Thanks so much!
Hi @isheff,
We've run into a small handful of scenarios where certain error conditions (e.g. an unreachable host under certain network conditions) seem to be reported as a timeout from the C core, regardless of the actual timeout value supplied when making the client call. So it wouldn't surprise me if something similar is happening w.r.t. a max size violation.
You might try playing with the `GRPC_TRACE` and `GRPC_VERBOSITY` environment variables and see if there's evidence in the debug spew of a channel size violation.
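For reference, those are environment variables read by the gRPC C core, so you can set them in the shell before launching the client. Something like the following, where the client binary name is just a placeholder for whatever your executable is called:

```shell
# Turn up logging in the gRPC C core; diagnostics go to stderr.
export GRPC_VERBOSITY=DEBUG               # log level: DEBUG, INFO, or ERROR
export GRPC_TRACE=api,channel,call_error  # comma-separated list of tracers ("all" works too)
echo "tracing: $GRPC_TRACE at $GRPC_VERBOSITY"
# ./your-client-binary 2> grpc-debug.log  # placeholder name; capture stderr to a file
```

A size-limit violation would typically show up in that output as a message about the received/sent message exceeding the configured maximum.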
Max size limits on channels can be modified via channel args, cf. this test and https://github.com/awakesecurity/gRPC-haskell/pull/35.
However, it looks like I only bound `MaxReceiveMessageLength` and not the corresponding send limit, which seems to be what you want to tweak. Also, note that the header indicates that a size limit of `-1` connotes "no limit", but we've used a `Natural` -- IIRC, that was intentional, and the comment in the header was deemed stale, but apparently I didn't feel like that was important enough to mention on #35. :tada:
So it looks like there's certainly some attention needed there, specifically for the max send size, but the test I linked above seems to indicate a different status code for the failure case (`StatusInvalidArgument`), at least when it's the receive limit that's been exceeded.
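For concreteness, here's a rough sketch of what bumping the receive limit via channel args looks like, assuming the `MaxReceiveMessageLength` constructor of `Arg` from PR #35 and a `ClientConfig` record shaped roughly as in `Network.GRPC.LowLevel` -- treat the field names as approximate, and note again that no send-side analogue exists yet:

```haskell
{-# LANGUAGE OverloadedStrings #-}
-- Sketch only: raising the receive-message cap via a channel arg.
-- Field names and the Arg constructor are taken from my reading of the
-- current gRPC-haskell API and may not match your version exactly.
import Network.GRPC.LowLevel

config :: ClientConfig
config = ClientConfig
  { clientServerHost = "localhost"
  , clientServerPort = 50051
  , clientArgs       = [MaxReceiveMessageLength (32 * 1024 * 1024)]  -- 32 MiB
  , clientSSLConfig  = Nothing
  }
```

That only helps on the receiving side, of course; until a send-side arg is bound, the send limit stays at whatever default the C core applies.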
The fix is likely to add support for the max send size channel arg, and set it appropriately; it would also be useful to determine the source of the error reporting bug. PRs welcome =). BTW, I'm more than happy to dig into this, but won't have cycles in the super short term to do so.
Thanks for the feedback!