python-driver icon indicating copy to clipboard operation
python-driver copied to clipboard

Add support for SCYLLA_USE_METADATA_ID and skip metadata

Open andrzej-jackowski-scylladb opened this issue 11 months ago • 10 comments

FOR NOW THIS IS A DRAFT PR Used for testing a fix for https://github.com/scylladb/scylladb/issues/20860

Pre-review checklist

  • [ ] I have split my patch into logically separate commits.
  • [ ] All commit messages clearly explain what they change and why.
  • [ ] I added relevant tests for new features and bug fixes.
  • [ ] All commits compile, pass static checks and pass test.
  • [ ] PR description sums up the changes and reasons why they should be introduced.
  • [ ] I have provided docstrings for the public items that I want to introduce.
  • [ ] I have adjusted the documentation in ./docs/source/.
  • [ ] I added appropriate Fixes: annotations to PR description.

v2:

  • Fixed test_protocol.py unit tests

So no CQLv5 after all?

Lorak-mmk avatar Mar 14 '25 11:03 Lorak-mmk

So no CQLv5 after all?

As a first step, we started with an implementation of CQLv4 extension (see https://github.com/scylladb/scylladb/pull/23292 in progress), as introducing a full support for CQLv5 will require implementing ten different features (see https://github.com/scylladb/scylladb/issues/20860#issuecomment-2708770932) and introducing them all-at-once would be very difficult.

I created this draft PR because I needed a driver for testing my changes related to MetadataId, but you are more than welcome to productize those changes. However, keep in mind that usage of _SKIP_METADATA_FLAG was for some reason removed from Python Driver in 2019 (refer to https://github.com/datastax/python-driver/commit/4f3b9ab2eb9f679f1b3877ea239b27a48913c994) and I needed to restore this functionality in order to really test the fix. I don't think that unconditional usage of _SKIP_METADATA_FLAG that is currently implemented here is safe (as it will expose Python Driver to https://github.com/scylladb/scylla-drivers/issues/61), and I'm not sure if the current implementation is sufficient to allow setting _SKIP_METADATA_FLAG when SCYLLA_USE_METADATA_ID extension is used (a driver expert review is needed to confirm this).

If my understanding is correct, neither gocql nor rust driver currently implement CQLv5. Therefore, implementing a support for SCYLLA_USE_METADATA_ID seems easier than supporting the entire CQLv5 support for those driver.

With java-driver the situation seems more complex, because (correct me if I'm wrong) a proper use of SCYLLA_USE_METADATA_ID requires changes in native protocol that is not forked by Scylla. If so, the only choices for Java driver is to fully implement support on ScyllaDb side (because java-driver already has such implementation) or to somehow modify the native protocol (e.g. fork it, but I think it is a huge decision to make).

@Lorak-mmk, @dkropachev, please share your thought on this.

So no CQLv5 after all?

As a first step, we started with an implementation of CQLv4 extension (see scylladb/scylladb#23292 in progress), as introducing a full support for CQLv5 will require implementing ten different features (see scylladb/scylladb#20860 (comment)) and introducing them all-at-once would be very difficult.

I created this draft PR because I needed a driver for testing my changes related to MetadataId, but you are more than welcome to productize those changes. However, keep in mind that usage of _SKIP_METADATA_FLAG was for some reason removed from Python Driver in 2019 (refer to datastax@4f3b9ab) and I needed to restore this functionality in order to really test the fix. I don't think that unconditional usage of _SKIP_METADATA_FLAG that is currently implemented here is safe (as it will expose Python Driver to scylladb/scylla-drivers#61), and I'm not sure if the current implementation is sufficient to allow setting _SKIP_METADATA_FLAG when SCYLLA_USE_METADATA_ID extension is used (a driver expert review is needed to confirm this).

If my understanding is correct, neither gocql nor rust driver currently implement CQLv5. Therefore, implementing a support for SCYLLA_USE_METADATA_ID seems easier than supporting the entire CQLv5 support for those driver.

With java-driver the situation seems more complex, because (correct me if I'm wrong) a proper use of SCYLLA_USE_METADATA_ID requires changes in native protocol that is not forked by Scylla. If so, the only choices for Java driver is to fully implement support on ScyllaDb side (because java-driver already has such implementation) or to somehow modify the native protocol (e.g. fork it, but I think it is a huge decision to make).

@Lorak-mmk, @dkropachev, please share your thought on this.

java-3.x, golang, rust, csharp and cpp-rust-driver it should be pretty easy, maybe rust will need some API changes.

java-4.x - there is an issue with com.datastax.(oss|dse).protocol.internal packages, not sure if we will have to fork them, one of alternatives is to extend codecs and trick them to think it is CQL5 when this feature is there.

on python, true, there is no infra on disable/enable skip metadata, I don't think it is crucial for testing to have it, but for proper production use it is better to put it back in place.

dkropachev avatar Mar 14 '25 14:03 dkropachev

@Lorak-mmk , my proposal is to bring control over skip metadata back, by reverting (one way or another) https://github.com/datastax/python-driver/commit/4f3b9ab2eb9f679f1b3877ea239b27a48913c994 and merge this pr on top of it, removing pairing between SCYLLA_USE_METADATA_ID and _QueryMessage.skip_meta

dkropachev avatar Mar 14 '25 14:03 dkropachev

The PR is still a DRAFT.

v3:

  • Added missing whitespace in cassandra/protocol.py
  • Restored old line order in cassandra/protocol_features.py

I created this draft PR because I needed a driver for testing my changes related to MetadataId, but you are more than welcome to productize those changes. However, keep in mind that usage of _SKIP_METADATA_FLAG was for some reason removed from Python Driver in 2019 (refer to datastax@4f3b9ab) and I needed to restore this functionality in order to really test the fix. I don't think that unconditional usage of _SKIP_METADATA_FLAG that is currently implemented here is safe (as it will expose Python Driver to scylladb/scylla-drivers#61), and I'm not sure if the current implementation is sufficient to allow setting _SKIP_METADATA_FLAG when SCYLLA_USE_METADATA_ID extension is used (a driver expert review is needed to confirm this).

If my understanding is correct, neither gocql nor rust driver currently implement CQLv5. Therefore, implementing a support for SCYLLA_USE_METADATA_ID seems easier than supporting the entire CQLv5 support for those driver.

With java-driver the situation seems more complex, because (correct me if I'm wrong) a proper use of SCYLLA_USE_METADATA_ID requires changes in native protocol that is not forked by Scylla. If so, the only choices for Java driver is to fully implement support on ScyllaDb side (because java-driver already has such implementation) or to somehow modify the native protocol (e.g. fork it, but I think it is a huge decision to make).

Most upstream drivers already support CQLv5. I actually thought all of them do, it is surprising to hear that gocql does not. The existing support of CQLv5 is why I really hoped we would go this way, and use existing code that is compatible with upstream, instead of doubling the work and implementing new extension in all the drivers. Which, as you noticed yourself, is not really straightforward in some of them. Regarding Rust Driver: yes it only supports CQLv4, and Scylla adding support for CQLv5 would be a very good reason for us to change that - which is another reason I hoped we would go with CQLv5.

It is also worth noting that not all users utilize our drivers - some use Datastax drivers. The adoption of new driver versions is also quite slow. In addition, some 3rd party applications, ORMs and other libraries may also use old or datastax drivers. If we went with CQLv5 they could all use that. With custom extension, only users of our newest drivers will be able to.

Lorak-mmk avatar Mar 26 '25 08:03 Lorak-mmk

The existing support of CQLv5 is why I really hoped we would go this way, and use existing code that is compatible with upstream, instead of doubling the work and implementing new extension in all the drivers. Which, as you noticed yourself, is not really straightforward in some of them. Regarding Rust Driver: yes it only supports CQLv4, and Scylla adding support for CQLv5 would be a very good reason for us to change that - which is another reason I hoped we would go with CQLv5.

We are planning to add CQLv5 in the future. For now we are adding the CQLv4 extension,so problems related to https://github.com/scylladb/scylla-drivers/issues/61 can be mitigated if required (also in drivers that doesn't support CQLv5).

Please also refer to https://scylladb.atlassian.net/wiki/spaces/RND/pages/42238631/MetadataId+extension+in+CQLv4+Requirement+Document.

We are planning to add CQLv5 in the future.

Great. Is there an issue for that?

Lorak-mmk avatar Mar 27 '25 10:03 Lorak-mmk

Please resolve conflicts

mykaul avatar Aug 09 '25 07:08 mykaul