dqlite icon indicating copy to clipboard operation
dqlite copied to clipboard

>255 query bindings causes query to silently not return full results

Open tomponline opened this issue 2 years ago • 1 comments

Originally reported from https://github.com/lxc/lxd/issues/10705

@freeekanayaka confirmed the issue is that the query binding ID is limited to 255.

https://github.com/lxc/lxd/pull/10706#issuecomment-1198258908

We should support sqlite's underlying limits:

https://www.sqlite.org/limits.html

See point 9:

the maximum value of a host parameter number is SQLITE_MAX_VARIABLE_NUMBER, which defaults to 999 for SQLite versions prior to 3.32.0 (2020-05-22) or 32766 for SQLite versions after 3.32.0.

And if that limit is reached fail the query rather than silently returning partial results.

tomponline avatar Jul 28 '22 15:07 tomponline

https://github.com/canonical/go-dqlite/pull/194

tomponline avatar Aug 12 '22 19:08 tomponline

Going to take a stab at this.

cole-miller avatar Aug 31 '22 21:08 cole-miller

@cole-miller I think @MathieuBordere was already working on it last time I asked him. You may want to pick up with him about how far he got with it.

tomponline avatar Aug 31 '22 21:08 tomponline

I was looking at this a bit yesterday, and I think the most straightforward way to implement the fix is to introduce new request types corresponding to REQUEST_EXEC, REQUEST_EXEC_SQL, REQUEST_QUERY, REQUEST_QUERY_SQL for which the format of the params tuple has a larger (say, 32 bits) "number of values" field. Alternatives proposed in the other thread that I considered:

  • Using the message schema version field: the documentation says that message format changes that accompany a bump to this version must be backwards-compatible, only adding new fields at the end. In this case, we're increasing the size of the "number of params" field in the tuple header, which comes in the middle of the message. As far as I can tell, current dqlite doesn't check the message schema version (flags field) at all for requests, so old servers would just try to parse new-style requests in the old format.
  • Using a variable-length encoding for the "number of params" field in the tuple header: I don't see how this can be made backwards-compatible. It's perfectly valid for the client to send a REQUEST_EXEC with exactly 255 params, so we can't change how dqlite servers parse a request where the "number of params" field has that value -- and similarly for every other 8-bit unsigned integer.

If we introduce new request types, then all old-style requests will continue to work, so clients like go-dqlite can introduce the new requests gradually. If a client sends one of the new requests to an old server that doesn't recognize it, it will be dropped silently, because we fall out of this switch:

https://github.com/canonical/dqlite/blob/607de645ca4ae282ac9267e220f91b6afc56e0f7/src/gateway.c#L1197-L1205

That's not ideal -- whatever approach we end up taking to the params issue, I think we should patch gateway__handle to send a failure response if the request type isn't recognized. (See also the TODO here.)

cole-miller avatar Sep 22 '22 15:09 cole-miller

I was looking at this a bit yesterday, and I think the most straightforward way to implement the fix is to introduce new request types corresponding to REQUEST_EXEC, REQUEST_EXEC_SQL, REQUEST_QUERY, REQUEST_QUERY_SQL for which the format of the params tuple has a larger (say, 32 bits) "number of values" field. Alternatives proposed in the other thread that I considered:

  • Using the message schema version field: the documentation says that message format changes that accompany a bump to this version must be backwards-compatible, only adding new fields at the end. In this case, we're increasing the size of the "number of params" field in the tuple header, which comes in the middle of the message. As far as I can tell, current dqlite doesn't check the message schema version (flags field) at all for requests, so old servers would just try to parse new-style requests in the old format.
  • Using a variable-length encoding for the "number of params" field in the tuple header: I don't see how this can be made backwards-compatible. It's perfectly valid for the client to send a REQUEST_EXEC with exactly 255 params, so we can't change how dqlite servers parse a request where the "number of params" field has that value -- and similarly for every other 8-bit unsigned integer.

If we introduce new request types, then all old-style requests will continue to work, so clients like go-dqlite can introduce the new requests gradually. If a client sends one of the new requests to an old server that doesn't recognize it, it will be dropped silently, because we fall out of this switch:

https://github.com/canonical/dqlite/blob/607de645ca4ae282ac9267e220f91b6afc56e0f7/src/gateway.c#L1197-L1205

That's not ideal -- whatever approach we end up taking to the params issue, I think we should patch gateway__handle to send a failure response if the request type isn't recognized. (See also the TODO here.)

It's unfortunate that 1) the flags field (which should be renamed to revision or schema or similar) is ignored and 2) we don't return an error for unknown type. Those two things should be fixed, returning an error if the revision can't be handled, or if the type is unknown.

That being said, I believe in this specific case it's fine to modify the client so that it sets the revision field to in 1 when there are more than 255 parameters to bind. In that case it will use some variable-length encoding mechanism, using as a many 8-bit slots as necessary (as opposed to just one, which is what revision 0 assumes).

This is not perfect, but I'd say good enough, because 1) most of the time the client will still use revision 0, so a new client modified in this way will still work fine with old servers 2) if there are more than 255 parameters a new client will set the revision to 1 and send to the old server something that will result in undesired behavior, however even current clients are broken and exhibit undesired behavior in this case, so the situation is not significantly worse.

As you point, adding new request types is still problematic because they will be dropped, so we have a breakage in that case too. At that point, better to go for the most sensible solution protocol-wise, which is to bump the revision.

Regarding the idea that a revision bump should only add new fields at the end of the message, we should probably be a bit more flexible and see what makes sense on a case-by-case basis. As a general rule I'd say that the client is the one that sets the revision in the request, and the server should return an error if it's a revision it doesn't know how to handle. If it understands that revision, it might need to respond to the client with a "matching" revision because for example the revision sent by the client in the request implies a specific response revision, or something along those lines. Hope it's clear what I mean.

freeekanayaka avatar Sep 22 '22 16:09 freeekanayaka

Regarding the idea that a revision bump should only add new fields at the end of the message, we should probably be a bit more flexible and see what makes sense on a case-by-case basis. As a general rule I'd say that the client is the one that sets the revision in the request, and the server should return an error if it's a revision it doesn't know how to handle.

Isn't this exactly how the request type field is currently treated? I can think of the following advantages of adding new request types in this case, compared to changing the significance of the schema version:

  • Smaller diff (we don't need to add code to check the schema version in every request handler and in go-dqlite's response parser)
  • The schema version remains available if we want to add fields to the end of some request (response) in the future without breaking compatibility with old servers (clients)

Could you elaborate on what the advantages are of using the schema version here?

cole-miller avatar Sep 22 '22 19:09 cole-miller

Regarding the idea that a revision bump should only add new fields at the end of the message, we should probably be a bit more flexible and see what makes sense on a case-by-case basis. As a general rule I'd say that the client is the one that sets the revision in the request, and the server should return an error if it's a revision it doesn't know how to handle.

Isn't this exactly how the request type field is currently treated? I can think of the following advantages of adding new request types in this case, compared to changing the significance of the schema version:

  • Smaller diff (we don't need to add code to check the schema version in every request handler and in go-dqlite's response parser)

I don't think there's any need for that actually. As I was mentioning, the client is what "dictates" the schema, because it's the one initiating the conversation, by sending the request. If the schema version in the request that the client sends is not understood by the server, the server will reply with an error. Otherwise, if the server knows about that schema, it will reply with a schema that it knows the client understands. So it's fine for the client to not check the schema version, because it can assume that the server doesn't send it anything it can't understand or hasn't requested.

  • The schema version remains available if we want to add fields to the end of some request (response) in the future without breaking compatibility with old servers (clients)

More or less in the same spirit as above: I believe we need to tweak a bit this idea described in the protocol documentation (i.e. bumping the schema version when adding fields). What makes more sense to do is to have the client specify what it wants and have the server reply with it if it can, or with an error if it can't. Having the server replying with more fields that the client ignores (but maybe shouldn't because they are important), or worse having the client send fields that the server ignores, that sounds a bit pointless.

Could you elaborate on what the advantages are of using the schema version here?

The purpose of the schema version is changing the details of a request (i.e. how an operation is requested) while keeping its type (i.e. what operation is requested). So this case is exactly where a schema version bump should be generally used, as we're changing the details of the operation not the operation itself.

freeekanayaka avatar Sep 22 '22 20:09 freeekanayaka

If the schema version in the request that the client sends is not understood by the server, the server will reply with an error.

Currently we don't check the schema version at all -- the added code to do that and return an error if it's not recognized is what I was referring to.

Otherwise, if the server knows about that schema, it will reply with a schema that it knows the client understands.

Ah, okay, so we'd keep the schema version in sync between requests and the corresponding responses? What about something like RESPONSE_RESULT that corresponds to multiple request types (for which we might want to bump the schema version independently)?

(Thanks for your patience in discussing this, hopefully I'm not taking up too much of your time!)

cole-miller avatar Sep 22 '22 20:09 cole-miller

If the schema version in the request that the client sends is not understood by the server, the server will reply with an error.

Currently we don't check the schema version at all -- the added code to do that and return an error if it's not recognized is what I was referring to.

Ah I thought you meant modifications in the client. I don't think there's a way around modifying the server, it really needs to reply with an error if the client asks something that it can't handle.

Otherwise, if the server knows about that schema, it will reply with a schema that it knows the client understands.

Ah, okay, so we'd keep the schema version in sync between requests and the corresponding responses?

I don't think we need to keep them in sync. In some cases a new schema version in the request might require also a change in the response, and so require a new schema version in the response as well. But that's not always the case. For example in this specific case we need to introduce a new schema version for the request, but the schema version of the response can remain unchanged.

What about something like RESPONSE_RESULT that corresponds to multiple request types (for which we might want to bump the schema version independently)?

The message type of RESPONSE_RESULT is 6 (Statement execution result) and it's supposed to be a message sent by the server in response to a statement execution request. If the specific details of the message need to change we introduce a new schema version for that specific format. The request-type/schema combination is what will dictate the schema version in the response. It's very unlikely that different statement execution request types/schemas will ever require different or conflicting response schemas, but it can be handled in principle by simply using separate schema versions for the result response message.

(Thanks for your patience in discussing this, hopefully I'm not taking up too much of your time!)

You're welcome, my pleasure! I'm actually enjoying it :)

freeekanayaka avatar Sep 23 '22 08:09 freeekanayaka