Txn Client scan with endKey not working

Open tank-plus opened this issue 3 years ago • 9 comments

Bug Report

1. Describe the bug

We use the txn client to scan data with a given start key and end key, but the scan fails with a KV:Storage:InvalidReqRange error. Here is the log output from TiKV:

[2022/10/18 14:07:42.333 +00:00] [ERROR] [errors.rs:407] ["txn aborts"] [err_code=KV:Storage:InvalidReqRange] [err="Error(Txn(Error(InvalidReqRange { start: Some([2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 51, 220, 2, 0, 255, 0, 26, 0, 0, 1, 19, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247]), end: Some([2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 51, 220, 2, 0, 255, 0, 26, 0, 0, 1, 19, 0, 1, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247]), lower_bound: Some([2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 46, 66, 1, 0, 255, 0, 4, 0, 0, 1, 19, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 255, 1, 0, 0, 0, 247, 139, 7, 0, 255, 0, 10, 0, 0, 0, 0, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247]), upper_bound: Some([2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 51, 220, 2, 0, 255, 0, 26, 0, 0, 1, 19, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 255, 1, 0, 0, 0, 67, 139, 1, 0, 255, 0, 30, 0, 0, 0, 0, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247]) })))"]

However, if we remove the end key, it works!

2. Minimal reproduce step (Required)

Just use Txn scan api

3. What did you see instead (Required)

None

4. What did you expect to see? (Required)

We expect the scan to work with both a start key and an end key.

5. What are your Java Client and TiKV versions? (Required)

3.3.0

  • Client Java:
  • TiKV:

tank-plus avatar Oct 19 '22 03:10 tank-plus

Could you offer a minimal step to reproduce this issue?

iosmanthus avatar Oct 19 '22 03:10 iosmanthus

Could you offer a minimal step to reproduce this issue?

We don't know what that error means, so we haven't been able to reproduce that issue with a small amount of data. Only when loading LDBC SF100 data in our encoding will this problem occur in some specific scan operations.

The details of the problem we encountered are as follows:

  1. Load the SF100 data into TiKV using our encoding method.
  2. Rewrite RegionStoreClient in the TiKV Java client, adding a function that sends a scan request according to start, end, limit and version:
public List<KvPair> scan(
          BackOffer backOffer, ByteString startKey,ByteString endKey, int limit, long version, boolean keyOnly) {
    boolean forWrite = false;
    while (true) {
      // we should refresh region
      region = regionManager.getRegionByKey(startKey, backOffer);

      Supplier<ScanRequest> request =
              () ->
                      ScanRequest.newBuilder()
                              .setContext(
                                      makeContext(
                                              getResolvedLocks(version), this.storeType, backOffer.getSlowLog()))
                              .setStartKey(codec.encodeKey(startKey))
                              .setVersion(version)
                              .setKeyOnly(keyOnly)
                              //.setEndKey(codec.encodeKey(endKey))  // with this line commented out, KV:Storage:InvalidReqRange does not occur
                              .setLimit(Math.min(limit, conf.getScanBatchSize()))
                              .build();
  3. Query a prefix such as "2, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 51, -36, 2, 0, 0, 26, 0, 0, 1, 19, 0, 0". With this prefix, start will be "2, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 51, -36, 2, 0, 0, 26, 0, 0, 1, 19, 0, 0" and end will be "2, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 51, -36, 2, 0, 0, 26, 0, 0, 1, 19, 0, 1". Send the scan request, and the "KV:Storage:InvalidReqRange" error mentioned earlier appears in TiKV's log.
  4. Not every scan hits this problem. Only some specific prefixes do, and we cannot find any difference between these prefixes and the ones that work.

So we want to know what this error really means, so that we can work out what the failing prefixes and their data have in common.

AIFun avatar Oct 19 '22 04:10 AIFun

Could you offer a minimal step to reproduce this issue? ... So we want to know what this error really means, and see if we can find what the wrong prefix and data have in common with it.

We did not encounter this problem with small amounts of data.

AIFun avatar Oct 19 '22 05:10 AIFun

The cause of the KV:Storage:InvalidReqRange error is that end is greater than upper_bound (i.e., Some([2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 51, 220, 2, 0, 255, 0, 26, 0, 0, 1, 19, 0, 1, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247]) > Some([2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 51, 220, 2, 0, 255, 0, 26, 0, 0, 1, 19, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 255, 1, 0, 0, 0, 67, 139, 1, 0, 255, 0, 30, 0, 0, 0, 0, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247])), so it falls outside the region's range.
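The comparison can be checked by hand. The following is a minimal sketch, not TiKV code: it copies the start, end and upper_bound values from the log above and compares them as unsigned bytes in lexicographic order. The class and method names are made up for illustration, and treating an empty upper bound as "unbounded" is an assumption of the sketch.

```java
import java.util.Arrays;

final class ReqRangeCheck {
    // Converts unsigned byte values (0..255), as printed in the TiKV log, to a Java byte[].
    static byte[] bytes(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (byte) values[i];
        }
        return out;
    }

    // Values copied from the InvalidReqRange log entry.
    static final byte[] START = bytes(2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 51, 220, 2, 0, 255,
            0, 26, 0, 0, 1, 19, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247);
    static final byte[] END = bytes(2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 51, 220, 2, 0, 255,
            0, 26, 0, 0, 1, 19, 0, 1, 255, 0, 0, 0, 0, 0, 0, 0, 0, 247);
    static final byte[] UPPER_BOUND = bytes(2, 0, 0, 0, 0, 0, 0, 1, 255, 1, 0, 0, 0, 51, 220, 2, 0, 255,
            0, 26, 0, 0, 1, 19, 0, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 255,
            1, 0, 0, 0, 67, 139, 1, 0, 255, 0, 30, 0, 0, 0, 0, 0, 0, 255,
            0, 0, 0, 0, 0, 0, 0, 0, 247);

    // True when `key` sorts after a non-empty `upperBound` in unsigned byte order.
    // Assumption of this sketch: an empty upper bound means "no upper bound".
    static boolean exceedsUpperBound(byte[] key, byte[] upperBound) {
        return upperBound.length > 0 && Arrays.compareUnsigned(key, upperBound) > 0;
    }

    public static void main(String[] args) {
        System.out.println("start beyond upper_bound: " + exceedsUpperBound(START, UPPER_BOUND));
        System.out.println("end beyond upper_bound:   " + exceedsUpperBound(END, UPPER_BOUND));
        // Index of the first byte where end and upper_bound differ.
        System.out.println("first differing index:    " + Arrays.mismatch(END, UPPER_BOUND));
    }
}
```

The start key sorts below upper_bound, so it lies inside the region, while the end key already differs from upper_bound at an earlier byte and sorts above it.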

To address this, I think you can set end to the smaller of start+1 and the region's upper_bound (RegionStoreClient.region.getEndKey()).

What is the new scan method used for?

pingyu avatar Oct 19 '22 06:10 pingyu

Thanks! After adding a check against the region's end key, the new scan function executes correctly.

We defined this new scan function because the "limit" of most of our scan requests is relatively large, while the results of many queries are small. The original scan function that supports "version" does not push the end key down to the TiKV server. It pulls limit (or scanBatchSize) entries to the client and then filters them in the iterator, which is too inefficient for us. So we want to pass the end key through the interface, so that only the data we need is transmitted when querying.

The new code is:
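The difference in transferred data can be illustrated with a toy model. This is only a sketch: a sorted in-memory map stands in for the server, and `serverScan`, `sampleStore` and the key format are invented for the example. None of it is TiKV or client-java code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

final class ScanPushdownDemo {
    // Toy stand-in for a server-side scan: returns up to `limit` keys in [start, end).
    // A null `end` means the request carries no end key.
    static List<String> serverScan(NavigableMap<String, String> store,
                                   String start, String end, int limit) {
        NavigableMap<String, String> range =
                end == null ? store.tailMap(start, true) : store.subMap(start, true, end, false);
        List<String> keys = new ArrayList<>();
        for (String key : range.keySet()) {
            if (keys.size() >= limit) {
                break;
            }
            keys.add(key);
        }
        return keys;
    }

    // Builds 1000 sorted keys: k0000 .. k0999.
    static NavigableMap<String, String> sampleStore() {
        NavigableMap<String, String> store = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            store.put(String.format("k%04d", i), "v" + i);
        }
        return store;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> store = sampleStore();

        // No end key in the request: the server sends `limit` keys and the client filters.
        List<String> unbounded = serverScan(store, "k0010", null, 500);
        long kept = unbounded.stream().filter(k -> k.compareTo("k0020") < 0).count();

        // End key in the request: only matching keys are sent.
        List<String> bounded = serverScan(store, "k0010", "k0020", 500);

        System.out.println("without end key: transferred=" + unbounded.size() + ", kept=" + kept);
        System.out.println("with end key:    transferred=" + bounded.size());
    }
}
```

In this toy setup both approaches keep the same keys, but the request without an end key transfers a full batch of `limit` entries first.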

public List<KvPair> scan(
          BackOffer backOffer, ByteString startKey,ByteString endKey, int limit, long version, boolean keyOnly) {
    boolean forWrite = false;
    while (true) {
      // we should refresh region
      region = regionManager.getRegionByKey(startKey, backOffer);
      // clamp the requested end key to the region's end key
      ByteString finalEndKey = compare(region.getEndKey().toByteArray(), endKey.toByteArray()) < 0 ? region.getEndKey() : endKey;
      Supplier<ScanRequest> request =
              () ->
                      ScanRequest.newBuilder()
                              .setContext(
                                      makeContext(
                                              getResolvedLocks(version), this.storeType, backOffer.getSlowLog()))
                              .setStartKey(codec.encodeKey(startKey))
                              .setVersion(version)
                              .setKeyOnly(keyOnly)
                              .setEndKey(codec.encodeKey(finalEndKey))
                              .setLimit(Math.min(limit, conf.getScanBatchSize()))
                              .build();
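Two details of the clamping step may be worth checking, since the `compare` helper is not shown in the snippet above. First, Java bytes are signed, so the comparison needs to treat them as unsigned to match byte-wise key order. Second, TiKV uses an empty end key for a region with no upper bound, and an empty region end key would compare as smaller than any requested end key. The following is a standalone sketch of a clamp that handles both cases. It works on plain byte arrays, the name `clampEndKey` is hypothetical, and it assumes both keys are in the same encoding and that an empty key means "unbounded".

```java
import java.util.Arrays;

final class EndKeyClamp {
    // Returns the end key to send for a single-region scan request.
    // Assumption of this sketch: an empty key means "unbounded" for both arguments.
    static byte[] clampEndKey(byte[] requestedEnd, byte[] regionEnd) {
        if (regionEnd.length == 0) {
            return requestedEnd; // region has no upper bound: keep the caller's end key
        }
        if (requestedEnd.length == 0) {
            return regionEnd;    // caller asked for an unbounded scan: stop at the region end
        }
        // Unsigned lexicographic comparison, so byte 0xFF sorts after 0x01.
        return Arrays.compareUnsigned(requestedEnd, regionEnd) <= 0 ? requestedEnd : regionEnd;
    }

    public static void main(String[] args) {
        byte[] regionEnd = {2, 0, (byte) 255};
        byte[] insideRegion = {2, 0, 1};
        byte[] outsideRegion = {2, 1};

        System.out.println(Arrays.toString(clampEndKey(insideRegion, regionEnd)));
        System.out.println(Arrays.toString(clampEndKey(outsideRegion, regionEnd)));
        System.out.println(Arrays.toString(clampEndKey(insideRegion, new byte[0])));
    }
}
```

When the clamp returns the region's end key rather than the requested one, a single request covers only part of the requested range, so the caller would need to continue from the region end key to cover the rest.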

AIFun avatar Oct 19 '22 07:10 AIFun

Got it.

I think this change is great, and a pull request would be very welcome. @iosmanthus What do you think?

pingyu avatar Oct 19 '22 12:10 pingyu

@AIFun feel free to open a pull request. We could use this scan to implement all the variants of scan.

iosmanthus avatar Oct 20 '22 08:10 iosmanthus

There are still some bugs in my code. I'm trying to fix them, and I'll open a PR once it works.

AIFun avatar Oct 31 '22 12:10 AIFun

This issue is stale because it has been open 30 days with no activity.

github-actions[bot] avatar Dec 01 '22 00:12 github-actions[bot]