
[Ruby] Improve non-blocking I/O for Flight client?

Open datbth opened this issue 2 weeks ago • 3 comments

Describe the enhancement requested

I'm trying to run multiple Flight DoGet requests in parallel using Ruby Threads/Fibers, and I observe that ArrowFlight::Client#do_get doesn't yield to another Thread/Fiber until the first Arrow record batch (RecordBatch) is received.

Image

Is it possible to yield more frequently, such as:

  • Right after sending the request ticket?
  • After each chunk in the Record (RecordBatch)?

Component(s)

Ruby

datbth avatar Dec 11 '25 07:12 datbth

Could you provide a script that reproduces this? Especially, I want to know how to parallelize.

kou avatar Dec 11 '25 07:12 kou

Sorry, I just tested again and see that using Threads works as expected:

threads = [
  Thread.new do
    client = new_client
    reader = client.do_get
    reader.read_all
    1
  ensure
    client&.close
  end,
  Thread.new do
    client = new_client
    reader = client.do_get
    reader.read_all
    2
  ensure
    client&.close
  end,
]
threads.each(&:join)
Image

What I tried earlier was the async gem (with Ruby 3.2.6):

require 'async'

Async do
  [
    Async do
      client = new_client
      reader = client.do_get
      reader.read_all
      1
    ensure
      client&.close
    end,
    Async do
      client = new_client
      reader = client.do_get
      reader.read_all
      2
    ensure
      client&.close
    end,
  ].map(&:wait)
end.wait

I don't have much experience with it, though, so I'm not sure whether this is the normal/expected behavior when using async.

Can close this issue if you're not interested.

datbth avatar Dec 11 '25 08:12 datbth

The underlying C++ implementation doesn't support asynchronous DoGet, so we can't use Fibers to run DoGet concurrently.

We may need to implement a pure Ruby gRPC client and re-implement Flight in Ruby to integrate with the async gem...

kou avatar Dec 14 '25 02:12 kou
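
Since the Thread version above reportedly works as expected, one possible stopgap is to wrap each blocking DoGet in its own Thread and collect the results. The following is only a minimal sketch under assumptions: `fetch_concurrently` and `fake_blocking_fetch` are hypothetical names, not part of Red Arrow Flight, and `sleep` stands in for a blocking `do_get` plus `read_all` (it is assumed that the native call lets other Threads run, as the Thread test in this thread suggests).

```ruby
# Hypothetical helper (not part of any library): runs one blocking call per
# input on its own Thread and returns the results in input order.
def fetch_concurrently(inputs, &block)
  threads = inputs.map do |input|
    Thread.new { block.call(input) }
  end
  # Thread#value joins the thread and re-raises any exception it raised.
  threads.map(&:value)
end

# Stand-in for a blocking DoGet followed by read_all. sleep lets other
# Threads run, which is what the real native call is assumed to do as well.
def fake_blocking_fetch(ticket)
  sleep(0.2)
  "table for #{ticket}"
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
results = fetch_concurrently(%w[a b c]) { |ticket| fake_blocking_fetch(ticket) }
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts results.inspect
puts format("elapsed: %.2fs", elapsed)
```

With a real client, the block would create the client, call `do_get`, read the result, and close the client in an `ensure`, as in the Thread example earlier in this thread. This does not make DoGet Fiber-aware; it only moves the blocking call off the calling Thread.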