cassandra icon indicating copy to clipboard operation
cassandra copied to clipboard

Implement support for writing AA SAI indexes

Open jkni opened this issue 10 months ago • 4 comments

This PR adds support for writing AA indexes. This also fixes several pre-existing issues with reading AA that were uncovered due to expanded test coverage. The entire test suite has also been run with LATEST set to AA.

jkni avatar May 01 '24 18:05 jkni

I'm sympathetic to the goal of deduplicating as early as possible in the read path, particularly because late deduplication obscures other behaviors. I put up an additional PR to test an approach at this. If CI looks good, I will bring it into this PR. If not, I'll defer resolving that to a future issue.

jkni avatar May 08 '24 20:05 jkni

I'm sympathetic to the goal of deduplicating as early as possible in the read path, particularly because late deduplication obscures other behaviors. I put up an additional PR to test an approach at this. If CI looks good, I will bring it into this PR. If not, I'll defer resolving that to a future issue.

Let's defer this to a potential follow up. Pushing deduplication below an antijoin for a partition-aware primary key map has complications.

jkni avatar May 08 '24 22:05 jkni

To ease review burden on the large Lucene imports, I am providing the diffs/short summaries of the difference of imported Lucene files (notice about the source will be pushed in an additional commit) from lucene-solr 7.5.0 (https://github.com/apache/lucene/releases/tag/releases%2Flucene-solr%2F7.5.0):

LegacyByteBuffersIndexInput

The only changes are participating in the type hierarchy of our IndexInput (which provides byte ordering), returning LegacyByteBuffersIndexInput from slice/clone, and wrapping LegacyByteBuffersDataInput instead of `ByteBuffersDataInput.

LegacyByteBuffersDataInput

As above, return types are changed. The following read methods are added, as they previously used the superclass implementations, that were little-endian. skipBytes did not exist in 7.5 and had to be added.

    @Override
    public short readShort() throws IOException
    {
        return (short) (((readByte() & 0xFF) <<  8) |  (readByte() & 0xFF));
    }

    @Override
    public int readInt() throws IOException
    {
        return ((readByte() & 0xFF) << 24) | ((readByte() & 0xFF) << 16)
               | ((readByte() & 0xFF) << 8) | (readByte() & 0xFF);
    }

    @Override
    public long readLong() throws IOException
    {
        return (((long)readInt()) << 32) | (readInt() & 0xFFFFFFFFL);
    }

    @Override
    public void skipBytes(long l) throws IOException
    {
        if (l < 0) {
            throw new IllegalArgumentException("l must be >= 0, got " + l);
        }
        if (l > size() - pos) {
            throw new EOFException();
        }
        pos += l;
    }

LegacyByteBuffersIndexOutput

As above, return types are changed, and this is updated to wrap LegacyByteBuffersDataOutput. This now extends our IndexOutput, which provides byte-ordering.

LegacyByteBuffersDataOutput

As above, return types are changed. toDataInput now returns a LegacyByteBuffersDataInput for correct decoding. writeShort/writeInt/writeLong previously used the superclass implementation (which is little-endian) when writing crossblock. For these cases, I added the writeCrossBlockShort/writeCrossBlockInt/writeCrossBlockLong methods. For example, this was the upstream writeInt implementation:

  @Override
  public void writeInt(int v) {
    try {
      if (currentBlock.remaining() >= Integer.BYTES) {
        currentBlock.putInt(v);
      } else {
        super.writeInt(v);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }    
  }

jkni avatar May 09 '24 18:05 jkni

I've pushed commits addressing all review comments so far. One of the two test failures is flaky and unrelated to this PR. I see the problem and will fix it separately. The second is a legitimate problem with the tests; some of the parameterized SAI tests are quite slow on CI machines, and with additional SAI versions covered, they may legitimately hit a 10 minute timeout. I'm tempted to bump this in our Jenkins lib. IMO, parameterizing these tests is the cleanest way to run this test coverage, and while we may eventually be able to improve the runtime, they currently legitimately run around 11 minutes.

jkni avatar May 10 '24 04:05 jkni

Thank you all for the collaboration and detailed reviews!

jkni avatar May 10 '24 22:05 jkni