
streaming flow control

Open callmehiphop opened this issue 7 years ago • 15 comments

Forgive me if I've overlooked documentation on this subject, but I'm trying to understand how to properly configure flow control on streaming interfaces. Usually with Node streams I believe you would set the highWaterMark option; however, it appears that gRPC does not accept that option.

I thought maybe there was a gRPC-specific configuration that could be set, but after looking through the available options, it wasn't clear to me whether one exists or how to use it correctly.

Essentially the issue I have is that I want to temporarily stop buffering results. Typically in Node you would use the pause() and resume() methods to do this; however, even when a stream is paused its internal buffer continues to fill up, and since I'm unable to configure the size of that buffer, I start to see increased memory usage.
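A minimal sketch of the pattern I'm describing; `client`, `streamResults`, and `handleResult` are illustrative names, not a real API:

```js
const stream = require('stream');

// Plain Node object-mode stream: the buffer can be capped at construction time.
const bounded = new stream.Readable({ objectMode: true, highWaterMark: 5, read() {} });

// gRPC server-streaming call: no equivalent option is accepted today.
const call = client.streamResults({ query: 'example' });

call.on('data', (result) => {
  call.pause();              // stop emitting 'data' events while we work...
  handleResult(result, () => {
    call.resume();
  });                        // ...but messages still accumulate in the call's
});                          // internal buffer while it is paused
```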

callmehiphop avatar Dec 11 '17 19:12 callmehiphop

Ping @murgatroid99

stephenplusplus avatar Feb 09 '18 21:02 stephenplusplus

We do not currently expose an API to change the highWaterMark option, and we don't change it ourselves; I agree that we should. I didn't realize this until just now, but the default value is actually very high for object mode streams, so it would probably be good for us to choose a different default, probably something very small, like 1 or 5.

murgatroid99 avatar Feb 09 '18 22:02 murgatroid99

Actually, I'm looking at the documentation again and I see that the default high water mark for object mode is only 16, which seems pretty reasonable to me. There might be a different explanation for the high memory usage.
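For reference, the Node defaults can be checked directly (assuming a Node version that exposes `readableHighWaterMark`): 16 objects in object mode, 16 KiB otherwise.

```js
const { Readable } = require('stream');

const objStream = new Readable({ objectMode: true, read() {} });
console.log(objStream.readableHighWaterMark); // 16

const byteStream = new Readable({ read() {} });
console.log(byteStream.readableHighWaterMark); // 16384
```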

murgatroid99 avatar Feb 10 '18 01:02 murgatroid99

I think regardless of whether or not that's the case, flow control is definitely something I would like to see implemented in gRPC. It's a pretty basic feature of Node streams in general, so I would think it's something users would expect to have.

callmehiphop avatar Feb 10 '18 01:02 callmehiphop

@murgatroid99 after doing some testing, I'm at a bit of a loss on how to proceed or even describe my issue. I can replicate what users are seeing with the increased memory usage; however, for me it only occurs when I have a large volume of data waiting to be sent to a worker. I thought applying the highWaterMark option would resolve the issue, but testing locally has me thinking otherwise.

It seems like there needs to be some kind of mechanism that tells gRPC to stop making requests altogether, but I don't believe that exists (not in the Node version at least?) and I'm not really sure how something like that fits in with a Node streaming interface. Part of me wonders if this is even the correct place to try and resolve such an issue.

callmehiphop avatar Feb 23 '18 20:02 callmehiphop

I'm not sure what you mean by a "mechanism that tells gRPC to stop making requests." You have to explicitly initiate each request, so that's a decision that would be made in a layer above gRPC.
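For the explicitly initiated (unary) case, throttling in "a layer above gRPC" could look something like this sketch; `client.getItem` and the concurrency limit are assumptions for illustration:

```js
const MAX_IN_FLIGHT = 10;
let inFlight = 0;
const pending = [];

// Queue a request; only MAX_IN_FLIGHT calls are ever outstanding at once.
function requestItem(id) {
  return new Promise((resolve, reject) => {
    pending.push({ id, resolve, reject });
    drain();
  });
}

function drain() {
  while (inFlight < MAX_IN_FLIGHT && pending.length > 0) {
    const { id, resolve, reject } = pending.shift();
    inFlight++;
    client.getItem({ id }, (err, res) => {   // illustrative unary call
      inFlight--;
      err ? reject(err) : resolve(res);
      drain();                               // start the next queued request
    });
  }
}
```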

You mention having "a large volume of data waiting to be sent to the worker." Does that mean that you have a lot of objects loaded into memory, and if so, is that not enough to account for the high memory usage?

murgatroid99 avatar Feb 23 '18 20:02 murgatroid99

> I'm not sure what you mean by a "mechanism that tells gRPC to stop making requests." You have to explicitly initiate each request, so that's a decision that would be made in a layer above gRPC.

Sorry, I'm talking about a bidi stream. In the case of PubSub, it's possible that they will send a continuous stream of large amounts of data to us, but as far as I can tell there is no way to throttle that.

> You mention having "a large volume of data waiting to be sent to the worker." Does that mean that you have a lot of objects loaded into memory, and if so, is that not enough to account for the high memory usage?

Yeah, that's definitely what's happening, but as I said before, I think my inability to throttle the data is the real issue.

callmehiphop avatar Feb 23 '18 20:02 callmehiphop

Oh, you're talking about being on the receiving side of a stream. The stream has a pause() method, which should stop it from emitting 'data' events, and if everything works properly, once the buffer fills up it should stop receiving messages. Then you can resume it with the resume() method.
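A sketch of that pause()/resume() approach on the receiving side; `client.pull()` and `handleMessage()` are illustrative names:

```js
const call = client.pull();                // bidi or server-streaming call

call.on('data', (message) => {
  call.pause();                            // stop emitting 'data' while we work
  handleMessage(message)
    .then(() => call.resume())             // let buffered messages drain again
    .catch(() => call.cancel());
});
```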

murgatroid99 avatar Feb 23 '18 20:02 murgatroid99

Sure, but that doesn't help me in regards to memory usage, because the stream will continue to receive data from the server.

callmehiphop avatar Feb 23 '18 20:02 callmehiphop

What I'm saying is that the stream will stop receiving data from the server. Once the buffer fills up, it stops requesting messages from the gRPC C core, which will in turn stop refilling the HTTP2 flow control window, which will eventually result in the server not sending any more messages until the receiving side calls resume() and the buffers start draining. The memory usage there is limited to the total size of the messages that can fit in the read buffer plus the size of the flow control window (I don't remember the default for that, but I think it is configurable).
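The same backpressure chain can be expressed with async iteration instead of explicit pause()/resume(): each `await` keeps the stream paused, so roughly a highWaterMark's worth of messages plus the HTTP/2 flow-control window is all that gets buffered. This assumes Node 10+ (async-iterable readable streams) and the same illustrative `client.pull()` / `handleMessage()` names as above:

```js
async function consume() {
  const call = client.pull();
  for await (const message of call) {
    // A slow consumer here eventually causes the server to stop sending.
    await handleMessage(message);
  }
}
```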

murgatroid99 avatar Feb 23 '18 20:02 murgatroid99

I see. Are there any plans to allow the internal buffer size to be configurable? Also, is there any documentation on adjusting the flow control window?

callmehiphop avatar Feb 23 '18 21:02 callmehiphop

The flow control window might actually not be configurable; I'm not sure how to do it. But the buffer I'm talking about is just the stream buffer that is configured by the highWaterMark.

murgatroid99 avatar Feb 23 '18 21:02 murgatroid99

Has that been made configurable? When you mentioned it 2 weeks ago, it was not.

callmehiphop avatar Feb 23 '18 22:02 callmehiphop

No, we haven't changed that yet. I'm just pointing out that we're still talking about that same buffer.

murgatroid99 avatar Feb 23 '18 22:02 murgatroid99

Hello. 👊 Bumping this. @murgatroid99 flow control configuration would be nice to have 🙏.

nickewansmith avatar Sep 14 '23 17:09 nickewansmith