grpc-go icon indicating copy to clipboard operation
grpc-go copied to clipboard

rpc_util: Reuse memory buffer for receiving message with stat handlers

Open ash2k opened this issue 2 years ago • 4 comments

Use case(s) - what problem will this feature solve?

Use shared buffer pool when stats handler(s) is used i.e. address the limitation documented in

https://github.com/grpc/grpc-go/blob/a758b62537636eac84b6f0ce70db850bb55e8268/server.go#L572-L591

Proposed Solution

?

Alternatives Considered

?

Additional Context

See https://github.com/grpc/grpc-go/pull/5862/files#r1171991389.

ash2k avatar Sep 25 '23 10:09 ash2k

If you have a proposal for how to do this, then please include it here.

Otherwise, it's a bit tricky of a situation since the stats handler may keep a pointer to the data, meaning re-use would corrupt it.

dfawley avatar Oct 02 '23 16:10 dfawley

I believe we are talking about this field:

https://github.com/grpc/grpc-go/blob/v1.58.2/stats/stats.go#L78-L79

I see two approaches:

  • Nothing in the doc on the field or on the stats.InPayload type or on the stats.Handler type says it's ok to retain the buffer. So we could clarify that it's NOT ok to do that and that the handler implementation should copy the data if it needs it. I don't know if this will be considered a breaking change or not. We could also document this on the RecvBufferPool option that stats handlers in use must not retain the data slice.
  • Keep the current contract of InPayload UNLESS RecvBufferPool option is specified. Document the constrains when that option is in use.

The second option will not break anyone as it's an opt-in experience, but it's also not as nice because of the difference in API. Why would a stats handler retain the buffer? I haven't seen such handlers and I cannot come up with a good use case. Are there any examples?

ash2k avatar Oct 02 '23 23:10 ash2k

I don't know if this will be considered a breaking change or not.

Unfortunately, yes, it would be, as the documentation doesn't currently forbid it.

Yes, stats is an experimental package, and is clearly labeled as such, but I am worried it's being used very extensively now, and am weary of breaking lots of people that may have not noticed the label in the documentation.

Keep the current contract of InPayload UNLESS RecvBufferPool option is specified. Document the constrains when that option is in use.

I think this is the only situation where the contract would be violated anyway. This actually seems pretty reasonable, but note that it requires a full understanding from the user of the implications of setting RecvBufferPool and the behavior of any stats handlers that might be used. So the documentation should be updated to reflect that.

I haven't seen such handlers and I cannot come up with a good use case.

I believe there are use cases where the stats handler is used to populate a debug page, for example, where the payload is saved and used in a lazy fashion. Copying it early creates often-unnecessary garbage (when the page is never viewed), and allowing it to be overwritten would corrupt that data.

dfawley avatar Oct 04 '23 18:10 dfawley

Copying it early creates often-unnecessary garbage (when the page is never viewed), and allowing it to be overwritten would corrupt that data.

But isn't this the current situation - each buffer is garbage after it had been used once. Such debug page is rarely used so this buffer is pure garbage and is discarded. What we are discussing would make it possible to make the much more common case, where the buffer can be reused, more efficient (resource-usage-wise). The debugging stats handler would only need to allocate a buffer and copy the data. The only overheard vs the current situation is copying the data, not the allocation. Presumably such stats handler can have a single buffer that it overwrites on each message so in that case the new approach is actually still more efficient than status quo.

Current: new buffer for each message

New: 1 buffer in gRPC + 1 buffer in stats handler + copy from first to the second.

Copying memory should be cheaper than creating garbage+GCing it later.

Also, if such a stats handler is for debugging, it should be ok to have some overhead (i.e. "price") for that. It's not cost for nothing, it's the cost of debugging functionality, not pure waste. But the common case becomes way more efficient.


https://github.com/grpc/grpc-go/pull/6309 gave me a 33% RAM usage reduction (if I calculated it correctly...), so I'm sure doing the same for the read path would be a great improvement too. I don't understand why a common use case (receiving messages) needs to take such a huge efficiency hit because of an esoteric stats handler used by a very few, especially when there is a clear workaround for their use case. 🤷

Screenshot 2023-09-25 at 5 08 33 pm

ash2k avatar Oct 05 '23 04:10 ash2k