Scanning a table without the versioning iterator can drop keys

Open ivakegg opened this issue 3 months ago • 3 comments

I have conclusively proved that if you scan a table without a versioning iterator, and that table contains identical keys with different values, keys will be dropped. I tried both batch scanners and single scanners, and I tried varying buffer sizes; the symptoms were the same: keys were lost. I had to read the rfile directly to see all of the keys I needed for processing. None of the keys have the delete flag set.

Accumulo 2.1.4, Red Hat 8

I have an example of a table with only 1 file that demonstrates this issue. I have not attempted to create a test example as of yet.

I expect that a scan of a table without any iterators or any delete keys would be equivalent to a direct scan of the rfiles.

I have noted that, in the example I have, scanning the individual rows separately is less likely to drop keys, whereas a full scan of the table is more likely to drop them.

ivakegg avatar Oct 06 '25 12:10 ivakegg

Here is a script that will create a table without any versioning iterator configured:

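#!/bin/bash
# Emits Accumulo shell commands on stdout.
echo "createtable test"
# Remove the default versioning iterator from every scope.
echo "config -t test -d table.iterator.majc.vers"
echo "config -t test -d table.iterator.minc.vers"
echo "config -t test -d table.iterator.scan.vers"
for i in {1..100}; do
  for j in {1..10}; do
    # Random alphanumeric value, 1 to 2000 characters (never empty, so every insert has a value).
    declare -i vsize=$(( RANDOM % 2000 + 1 ))
    value=$(tr -dc 'A-Za-z0-9' < /dev/urandom | head -c $vsize)
    # All 10 inserts per row share the same row, column, and timestamp: identical keys, different values.
    echo "insert row$i cf$i cq$i "$value" -ts 100 -t test"
  done
done
echo "flush test -w"
echo "compact test -w"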

Redirect the output of this script into a file and then use the shell's "execfile" command to run that file. The result should be a test table that contains exactly 1000 entries, and indeed if you dump the contents of the rfile you will see all of them. HOWEVER, if you do a full scan of the table using the shell or a scanner (scan -t test -np), you will find that you do not get all of the keys.

ivakegg avatar Oct 06 '25 19:10 ivakegg

Here is a failing JUnit test case too:

  @Test
  public void testDuplicateTimestampScanLosesKeys() throws Exception {
    ClientContext context = (ClientContext) client;
    final int numRows = 100;
    final int mutationsPerRow = 10;
    final int expectedEntries = numRows * mutationsPerRow;
    SecureRandom random = new SecureRandom();
    byte[] randomValue = new byte[8192];

    client.tableOperations().create(tableName);

    Set<String> versionIterProps =
        Set.of("table.iterator.scan.vers", "table.iterator.minc.vers", "table.iterator.majc.vers");
    client.tableOperations().modifyProperties(tableName,
        properties -> properties.keySet().removeAll(versionIterProps));

    try (BatchWriter bw = client.createBatchWriter(tableName)) {
      for (int i = 0; i < numRows; i++) {
        for (int j = 0; j < mutationsPerRow; j++) {
          Mutation m = new Mutation("row" + i);
          random.nextBytes(randomValue);
          m.put("cf" + i, "cq" + i, 100L, new Value(randomValue));
          bw.addMutation(m);
        }
      }
    }
    client.tableOperations().flush(tableName, null, null, true);
    client.tableOperations().compact(tableName, new CompactionConfig().setWait(true));

    client.tableOperations().offline(tableName, true);
    long offlineCount;
    try (OfflineScanner offlineScanner =
        new OfflineScanner(context, context.getTableId(tableName), Authorizations.EMPTY)) {
      offlineCount = offlineScanner.stream().count();
    }

    client.tableOperations().online(tableName, true);
    long onlineCount;
    try (Scanner scanner = client.createScanner(tableName, Authorizations.EMPTY)) {
      onlineCount = scanner.stream().count();
    }

    assertEquals(expectedEntries, offlineCount);
    assertEquals(offlineCount, onlineCount, "Online scan lost keys compared to direct RFile scan");
  }

And this fails on the final assert with:

org.opentest4j.AssertionFailedError: Online scan lost keys ==> 
Expected :1000
Actual   :888

DomGarguilo avatar Oct 07 '25 14:10 DomGarguilo

From what I have gathered, I think this is what is going on here:

When a server cuts a scan batch, it remembers the last key it returned to the client. When it resumes the scan, it does an exclusive seek to that remembered key. The scan uses MemKey, which has a tie breaker for otherwise identical keys: the kvCount. The server does not remember the kvCount, though, so when it resumes, all of the remaining mutations with the "same" key are skipped.

DomGarguilo avatar Oct 07 '25 19:10 DomGarguilo
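
The batch-resume behavior hypothesized in the last comment can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not Accumulo's actual code: Entry, buildTable, and scanWithKeyOnlyResume are hypothetical names, the "key" is a plain string, and kvCount is modeled as a simple per-key counter. It shows how resuming with an exclusive seek on the key alone skips any identical keys that fell after a batch boundary.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchResumeSketch {

  // Hypothetical stand-in for an in-memory entry: the user-visible key plus a
  // counter that only breaks ties between otherwise identical keys.
  static final class Entry {
    final String key;
    final int kvCount;

    Entry(String key, int kvCount) {
      this.key = key;
      this.kvCount = kvCount;
    }
  }

  // Builds rows * dupes entries in sorted order; each row holds `dupes` identical keys.
  static List<Entry> buildTable(int rows, int dupes) {
    List<Entry> table = new ArrayList<>();
    for (int r = 0; r < rows; r++) {
      String key = String.format("row%03d", r); // zero-padded so string order matches row order
      for (int d = 0; d < dupes; d++) {
        table.add(new Entry(key, d));
      }
    }
    return table;
  }

  // Scans in batches. After each batch only the last key is remembered (not its
  // kvCount), and the next batch starts strictly after that key.
  static int scanWithKeyOnlyResume(List<Entry> table, int batchSize) {
    int returned = 0;
    String lastKey = null;
    int pos = 0;
    while (true) {
      // Exclusive seek: skip every entry whose key is <= the remembered key.
      while (lastKey != null && pos < table.size()
          && table.get(pos).key.compareTo(lastKey) <= 0) {
        pos++;
      }
      if (pos >= table.size()) {
        break;
      }
      int end = Math.min(pos + batchSize, table.size());
      returned += end - pos;
      lastKey = table.get(end - 1).key;
      pos = end;
    }
    return returned;
  }

  public static void main(String[] args) {
    List<Entry> table = buildTable(100, 10);
    for (int batchSize : new int[] {1000, 10, 64, 7}) {
      System.out.println("batchSize=" + batchSize + " returned="
          + scanWithKeyOnlyResume(table, batchSize) + " of " + table.size());
    }
  }
}
```

In this model, nothing is lost when a batch boundary happens to fall between groups of identical keys (batch sizes 1000 and 10 here), while a boundary inside a group drops the rest of that group. That would be consistent with the number of lost keys varying with batch size and scan range, although the real server cuts batches by size in bytes rather than by entry count.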