python-driver icon indicating copy to clipboard operation
python-driver copied to clipboard

Optimize column_encryption_policy checks in recv_results_rows

Open mykaul opened this issue 2 months ago • 4 comments

There's no point in checking a global policy for every single value decoding, not for every row decoded.

Adjusted the code to only check it once per recv_results_rows() call - decode_row() should be defined either as is today with column_encryption_policy enabled, or much simpler without all those extra checks.

Added a unit test from CoPilot.

Fixes: https://github.com/scylladb/python-driver/issues/582

Pre-review checklist

  • [x] I have split my patch into logically separate commits.
  • [x] All commit messages clearly explain what they change and why.
  • [x] I added relevant tests for new features and bug fixes.
  • [x] All commits compile, pass static checks and pass test.
  • [x] PR description sums up the changes and reasons why they should be introduced.
  • [ ] I have provided docstrings for the public items that I want to introduce.
  • [ ] I have adjusted the documentation in ./docs/source/.
  • [x] I added appropriate Fixes: annotations to PR description.

mykaul avatar Dec 25 '25 17:12 mykaul

CI failure is unrelated, and I've seen it happening on too many PRs already :-/

mykaul avatar Dec 25 '25 19:12 mykaul

numpy_parser.pyx, parsing.pyx also needs to be fixed

It wasn't clear to me what needs to fixing there: numpy - it doesn't use/call column_encryption parsing.pyx - there's just 'not implemented' functions there.

mykaul avatar Jan 06 '26 08:01 mykaul

numpy_parser.pyx, parsing.pyx also needs to be fixed

It wasn't clear to me what needs to fixing there: numpy - it doesn't use/call column_encryption parsing.pyx - there's just 'not implemented' functions there.

Instead of changing only TupleRowParser class you need to do similar changes to ColumnParser and RowParser and all their implementations: ListParser, LazyParser and NumpyParser, you will discover that NumpyParser disregards column_encryption_policy which means that it does not work properly when when columns are encrypted, have a plug there that throwing NotImplementedError. In total all Parser implementation classes should contain implementations for:

    cpdef parse_plain_rows(self, BytesIOReader reader, ParseDesc desc)

    cpdef parse_enc_rows(self, BytesIOReader reader, ParseDesc desc)

While Abstract classes:

cdef class ColumnParser:
    cpdef parse_rows(self, BytesIOReader reader, ParseDesc desc):
        if desc.column_encryption_policy:
            return self.unpack_enc_row(reader, desc)
        return self.unpack_plain_row(reader, desc)

    cpdef parse_plain_rows(self, BytesIOReader reader, ParseDesc desc)

    cpdef parse_enc_rows(self, BytesIOReader reader, ParseDesc desc)

dkropachev avatar Jan 06 '26 15:01 dkropachev

numpy_parser.pyx, parsing.pyx also needs to be fixed

It wasn't clear to me what needs to fixing there: numpy - it doesn't use/call column_encryption parsing.pyx - there's just 'not implemented' functions there.

Instead of changing only TupleRowParser class you need to do similar changes to ColumnParser and RowParser and all their implementations: ListParser, LazyParser and NumpyParser, you will discover that NumpyParser disregards column_encryption_policy which means that it does not work properly when when columns are encrypted, have a plug there that throwing NotImplementedError. In total all Parser implementation classes should contain implementations for:

Let me see if I understand this:

  • [x] TupleRowParser - Implemented unpack_ce_row() and unpack_row()
  • [x] ColumnParser - its parse_rows() creates a TupleRowParser object - and then I call its unpack_ce_row() / unpack_row()
  • [ ] RowParser- its unpack_row() is raising NotImplementedError - I'm not sure I should change this in this PR?
  • [x] ListParser - its parse_rows() creates a TupleRowParser object - and then I call its unpack_ce_row() / unpack_row()
  • [x] LazyParser - its parse_rows() calls parse_rows_lazy() - which creates a TupleRowParser object - and then I call its unpack_ce_row() / unpack_row()
  • [ ] NumpyParser its parse_rows() calls its _parse_rows() which completely ignores encryption and calls unpack_row() - but I have no idea where it's implemented. BUT - I'm not introducing a regression here, no?
  • [x] BoundStatement.bind() - added in the last commit in the series.

Is that an accurate representation?

    cpdef parse_plain_rows(self, BytesIOReader reader, ParseDesc desc)

    cpdef parse_enc_rows(self, BytesIOReader reader, ParseDesc desc)

I'm not sure I understand the interaction between Python and Cython - I was not going to change the interfaces whatsoever.

While Abstract classes:

cdef class ColumnParser:
    cpdef parse_rows(self, BytesIOReader reader, ParseDesc desc):
        if desc.column_encryption_policy:
            return self.unpack_enc_row(reader, desc)
        return self.unpack_plain_row(reader, desc)

    cpdef parse_plain_rows(self, BytesIOReader reader, ParseDesc desc)

    cpdef parse_enc_rows(self, BytesIOReader reader, ParseDesc desc)

You raise a good point about the names of the functions - that can clearly be more clear with 'plain' and 'encrypted' in them! I'll fix that.

mykaul avatar Jan 06 '26 15:01 mykaul

You raise a good point about the names of the functions - that can clearly be more clear with 'plain' and 'encrypted' in them! I'll fix that.

Done in latest commit. I think it's ready for re-review.

mykaul avatar Jan 27 '26 16:01 mykaul

CI failure seems like known issue https://github.com/scylladb/python-driver/issues/446

mykaul avatar Jan 27 '26 17:01 mykaul