gohbase
Keep scanner alive (auto-renew) OR handle expired scanner and restart from previous row
Is it possible to support a configurable scanner timeout and client RPC timeout? These would be equivalent to hbase.client.scanner.timeout.period and hbase.rpc.timeout, respectively, in the Java HBase client. This would allow fine-tuning scans (and other requests) to prevent scanner leases from expiring prematurely.
Also, what are the current values for these timeouts?
AFAIK, scanner timeout values are configurable on the server side: https://github.com/apache/hbase/blob/branch-2.0/hbase-common/src/main/resources/hbase-default.xml#L597
I don't see any relevant fields in ScanRequest (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L289) or Scan (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L246) that would allow the client to specify the timeout.
So the problem you are trying to solve is how to renew a scanner, or automatically recover from a scanner failure (due to a timeout or a regionserver going down). That is tricky because HBase can return partial results: if half of a row is returned, the challenge is to restart the request from the second half of that row and continue scanning. It looks like HBase already has some logic to handle cases like this (https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L295 and https://issues.apache.org/jira/browse/HBASE-5974), but I haven't spent much time researching the nuances.
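To make the "restart from previous row" idea concrete, here is a minimal sketch in Go. It is not gohbase code: `resumeTracker`, `observe`, `startRow`, and `nextStartRow` are hypothetical names, and it assumes the caller can tell whether a returned result is a partial row. It sidesteps the mid-row problem by resuming from the row after the last complete one, so a row that was only partially received is fetched again from its beginning and any cells already buffered for it must be discarded.

```go
package main

// nextStartRow returns the smallest row key that sorts strictly after lastRow
// in byte order: lastRow with a single 0x00 byte appended.
func nextStartRow(lastRow []byte) []byte {
	next := make([]byte, len(lastRow)+1) // the extra trailing byte is already 0x00
	copy(next, lastRow)
	return next
}

// resumeTracker (hypothetical) remembers the last row that was received in full.
type resumeTracker struct {
	lastComplete []byte
}

// observe records a row key. Partial rows are ignored, so a restart never
// resumes in the middle of a row.
func (t *resumeTracker) observe(row []byte, partial bool) {
	if !partial {
		t.lastComplete = append([]byte(nil), row...) // copy; the caller may reuse row
	}
}

// startRow returns the start row to use when the scan has to be reopened:
// the original start row if nothing was completed yet, otherwise the key
// immediately after the last complete row.
func (t *resumeTracker) startRow(original []byte) []byte {
	if t.lastComplete == nil {
		return original
	}
	return nextStartRow(t.lastComplete)
}
```

This only covers forward scans, and it trades some re-read work for simplicity; resuming at the exact cell inside a partial row would need more state than a row key.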
Necro-ing because we hit this recently.
I think we can fix this by sending a ScanRequest with the renew bool set to true:
https://github.com/apache/hbase/blob/branch-2.0/hbase-protocol/src/main/protobuf/Client.proto#L289
So I guess the question of implementation is either:
1. handle this transparently in gohbase by starting a goroutine per scan that renews the scanner on an interval
2. add a function (like Renew) to the Scanner
I personally think 1 is preferable, especially if this is optional and off by default. We could also add a configurable renew limit to kill slow clients with our own per-scanner timeout.
Thoughts?