grpc-kotlin
Streaming client requests are lazily requested
I was tracking down a seemingly slow call and found that the call.request(1) for the initial stream happens lazily in grpc-kotlin compared to where it happens in the observer version. The PR is here:
https://github.com/grpc/grpc-kotlin/pull/282
In short, this moves the call.request(1) into startCall instead of into the flow. This probably has negligible impact for small messages, as I'm unsure how much buffering the gRPC server does. It seemed to make a slight difference in our prod code path.
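To make the change concrete, here's a rough sketch in terms of grpc-java's ServerCallHandler API (illustrative only, not the actual grpc-kotlin code; the wrapper name is made up): the patch effectively moves the first call.request(1) up into startCall.

```kotlin
import io.grpc.Metadata
import io.grpc.ServerCall
import io.grpc.ServerCallHandler

// Illustrative wrapper, not the real grpc-kotlin implementation.
class EagerFirstRequestHandler<ReqT, RespT>(
    private val delegate: ServerCallHandler<ReqT, RespT>
) : ServerCallHandler<ReqT, RespT> {
    override fun startCall(
        call: ServerCall<ReqT, RespT>,
        headers: Metadata
    ): ServerCall.Listener<ReqT> {
        // Ask the transport for the first inbound message as soon as the call
        // starts, instead of waiting until the request flow is collected.
        call.request(1)
        return delegate.startCall(call, headers)
    }
}
```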
The main reason is that it was a curveball for us: there was a discrepancy between the unary calls and the streaming calls in how long the method was running, since the streaming RPC calls suspend on the requests.collect() call.
It would be even nicer to pre-hydrate the first request on the flow if possible, so that the RPC implementation method isn't called until at least one object in the flow is ready, but I believe that would be much more complex.
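Roughly what I have in mind, as a minimal sketch that assumes nothing about grpc-kotlin internals (prehydrateFirstRequest and its signature are made up for illustration; error handling for an empty or failed stream is omitted):

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.channels.ReceiveChannel
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.emitAll
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.produceIn

// Illustrative only: suspend until the first request has arrived, then hand
// back a flow that replays it followed by the rest of the stream.
suspend fun <T> prehydrateFirstRequest(
    scope: CoroutineScope,
    requests: Flow<T>
): Flow<T> {
    // Collecting into a channel is what would drive call.request(1) upstream.
    val channel: ReceiveChannel<T> = requests.produceIn(scope)
    // Suspend here until at least one request is actually in hand.
    val first = channel.receive()
    // Only now would the RPC implementation be invoked with this flow.
    return flow {
        emit(first)
        emitAll(channel)
    }
}
```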
Sorry, I'm trying to figure out the implications of this for the general cold-flows principle. If I'm understanding correctly, the situation is that the server starts giving responses before any requests arrive, that it's not actually requesting messages from the client until the requests start getting processed, and that the current implementation assumes you're not expecting responses until requests arrive?
Mostly. What I've noticed is that, with the example in the PR,
```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow

fun exampleStream(requests: Flow<Int>): Flow<Int> {
    val start = System.nanoTime()
    return flow {
        requests.collect { value ->
            val collected = System.nanoTime()
            // Elapsed time since the method was invoked; the interesting
            // measurement is the one taken for the first request.
            val thisTime = collected - start
            println("request collected after ${thisTime / 1_000_000} ms")
            emit(value)
        }
    }
}
```
one would expect `thisTime` to be very small, almost 0. But in reality we're seeing that number be well above 5ms and fairly variable. With this patch, it's now a fairly constant 4ms.
I'm not 100% sure there isn't a buffer that mostly makes this moot, but the intention of the server handler is to process requests ASAP. So it makes sense to trigger whatever machinery gets requests to us sooner, rather than starting the implementation method, possibly spending a bit of time on headers, and then finding the request isn't there yet when we ask for it. I could see this being especially useful with large request bodies that would fill whatever buffers may or may not be there.
WRT cold flows, these flows are closer to the hot variety: we don't get to consume them more than once, and we don't have much control over how they're produced.
Also, this brings the streaming and unary paths into closer alignment. The unary calls effectively already do this: they pre-consume the flow, which causes call.request(1) to execute, and the implementation method isn't even called until one request appears.
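Roughly what I mean, as a sketch with made-up names rather than the actual grpc-kotlin internals:

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.single

// Illustrative sketch of the unary behaviour described above.
suspend fun <ReqT, RespT> handleUnary(
    requests: Flow<ReqT>,
    implementation: suspend (ReqT) -> RespT
): RespT {
    // Collecting the request flow is what drives call.request(1); the user's
    // method body only executes once the single message has arrived.
    val request = requests.single()
    return implementation(request)
}
```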
I would expect `thisTime` to be near-0 if and only if the result flow of the RPC got collected almost immediately. Is that the case here?
It's much larger than 0. Even with this patch it's non-zero, but it seems slightly more consistent and slightly lower in our prod load testing.
We are collecting immediately.
I'm doing a bit more testing with this, and today I'm not getting much of an improvement.
First off, the medians are very low, around 0.3ms. For p99, the time without the patch is between 13ms and 30ms, and even with my patch I'm not seeing much difference anymore. I'm going to keep experimenting with this to see if I can find anything new.
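For reference, this is roughly how I'm summarizing the timings (an illustrative helper, not our actual load-test harness):

```kotlin
// Illustrative only: report median and p99 of observed first-request
// latencies, given samples in nanoseconds.
fun reportLatencies(samplesNanos: List<Long>) {
    val sorted = samplesNanos.sorted()
    fun percentileMs(p: Double): Double {
        val idx = ((sorted.size - 1) * p).toInt()
        return sorted[idx] / 1_000_000.0 // ns -> ms
    }
    println("p50 = %.2f ms, p99 = %.2f ms".format(percentileMs(0.50), percentileMs(0.99)))
}
```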