
Keep scanner alive (auto-renew) OR handle expired scanner and restart from previous row

Open jonbonazza opened this issue 6 years ago • 2 comments

Is it possible to support configurable scanner and client RPC timeouts? These would be equivalent to hbase.client.scanner.timeout.period and hbase.rpc.timeout, respectively, in the Java HBase client. This would allow fine-tuning scans (and other requests) to prevent scanner leases from expiring prematurely.

Also, what are the current values for these timeouts?

jonbonazza, May 15 '18 17:05

AFAIK, scanner timeout values are configurable on the server side: https://github.com/apache/hbase/blob/branch-2.0/hbase-common/src/main/resources/hbase-default.xml#L597

I don't see any relevant fields in ScanRequest (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L289) or Scan (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L246) that would allow the client to specify the timeout.

So the problem you are trying to solve is how to renew a scanner or automatically recover from a scanner failure (due to a timeout or a regionserver going down). That is tricky because HBase can return partial results: if only half of a row is returned, the challenge is to restart the request from the second half of the row and continue scanning. It looks like HBase already has some logic to handle cases like this (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L295 and https://issues.apache.org/jira/browse/HBASE-5974), but I haven't spent much time researching the nuances.
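
To make the restart idea concrete, here is a minimal Go sketch. It is not gohbase code: nextRowKey, resumePoint, observe and restartKey are hypothetical names, and it assumes the caller can tell whether a result is partial. It sidesteps the half-row problem by resuming only after the last fully received row, so any cells of a partially received row would be fetched again.

```go
package main

// nextRowKey returns the smallest row key that sorts strictly after last
// in byte order: the same bytes followed by a single 0x00 byte.
func nextRowKey(last []byte) []byte {
	next := make([]byte, len(last)+1) // trailing byte is already 0x00
	copy(next, last)
	return next
}

// resumePoint (hypothetical) remembers the last row that was received in full.
type resumePoint struct {
	lastComplete []byte
}

// observe records a result. Rows flagged as partial are ignored, so a
// restart never begins in the middle of a row.
func (r *resumePoint) observe(row []byte, partial bool) {
	if partial {
		return
	}
	r.lastComplete = append([]byte(nil), row...) // copy; caller may reuse row
}

// restartKey returns the start row for a replacement scan: the original
// start row if nothing was completed yet, otherwise the key right after
// the last complete row.
func (r *resumePoint) restartKey(originalStart []byte) []byte {
	if r.lastComplete == nil {
		return originalStart
	}
	return nextRowKey(r.lastComplete)
}
```

A caller would feed every received row into observe and, when the scanner expires, open a new scan starting at restartKey. The price is re-reading the cells of an interrupted row.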

timoha, May 24 '18 17:05

Necro-ing because we hit this recently.

I think we can fix this by sending a ScanRequest with the renew bool set to true (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L289).

So I guess the question of implementation is either:

  • handle this transparently in gohbase by starting a goroutine per scan that renews the scanner on an interval
  • add a function (like Renew) to the Scanner

I personally think the first option is preferable, especially if it is optional and off by default. We could also add a configurable renew limit, which would act as our own per-scanner timeout to kill slow clients.

Thoughts?

devoxel, Oct 07 '21 21:10