[Python] FlightServerBase doesn't support injecting gRPC options
Describe the enhancement requested
I'm using FlightServerBase as a server that acts as a UDF server.
But I could not find any documentation, or any place in the API, for injecting gRPC options into FlightServerBase.
https://arrow.apache.org/docs/python/generated/pyarrow.flight.FlightServerBase.html
Component(s)
Python
Hi @sundy-li, were you thinking about options like we have in the tests?
https://github.com/apache/arrow/blob/cd06982fddcc0b4327cade6e5429f903dd77fd1a/python/pyarrow/tests/test_flight.py#L2036-L2039
If so, I think we could improve this with:
- Adding examples to https://arrow.apache.org/docs/python/generated/pyarrow.flight.connect.html#pyarrow-flight-connect
- Adding a Python cookbook entry for this
Would you be interested in sending a PR in for either/both?
@amoeba
Thanks for the reply. But I am not looking to set gRPC options on the client side. Let me explain the issue more directly.
I am using Arrow Flight as a server in databend-udf, which is Python-based.
I want the server to handle a long-running request (such as time.sleep(300)). Currently, the client (which is Rust-based) reports this error after 240 s:
Decode record batch error: Tonic(Status { code: Unavailable, message: "Too many pings", source: None })
I searched online, and the suggestions I found were to add gRPC options on the server side rather than the client side.
So I want to know how to add gRPC options in FlightServerBase (such as setting GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA to zero).
Hi @sundy-li, sorry about that. It doesn't look like we expose those options to PyArrow at the moment but it seems useful to expose them. Would you be interested in submitting a PR?
I'm afraid not. I am new to this repo, and it looks like this feature would involve a lot of C++ and .pyx code. I don't think it's an easy task.
No worries @sundy-li. Filing issues like you've done is a great way to contribute and if you ever want to take a crack at a PR, there's lots of good options tagged as good-first-issue.
I'm running into a similar use case: I'm trying to configure "generic_options" on the server side, in my case for GRPC_ARG_MAX_SEND_MESSAGE_LENGTH. If this is still not possible for a Python-based Flight server, I'm curious why raising max_chunksize for the batch stream on the server side above 4 MB (the default gRPC maximum message size) doesn't cause any errors. For reference, my code looks like this:
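import pyarrow as pa
import pyarrow.flight as flight

# Inside the server's do_get(); `data` is a pyarrow.Table.
reader = pa.RecordBatchReader.from_batches(
    data.schema, data.to_batches(max_chunksize=8 * 1024 * 1024)
)
return flight.RecordBatchStream(reader)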
On the client side, I call reader.read_chunk() and find that each chunk has the same length as the chunk that was sent (8MB). Is that because some hidden mechanism in the C++ layer automatically splits the outgoing data into appropriately sized pieces?
This issue has been marked as stale because it has had no activity in the past 365 days. Please remove the stale label or comment below, or this issue will be closed in 14 days. If this improvement is still desired but has no current owner, please add the 'Status: needs champion' label.