Interpret `?` as a parameter only if `.bind` was called at least once
Related to https://github.com/ClickHouse/clickhouse-rs/pull/154
Considering this query:
client
.query("SELECT 1 FROM test WHERE a IN 'a?b'")
.fetch_one::<u8>()
.execute()
.await?;
Currently, an unescaped ? is still considered a missing parameter; after the fix, the query should be written as:
client
.query("SELECT 1 FROM test WHERE a IN 'a??b'")
.fetch_one::<u8>()
.execute()
.await?;
This might be problematic if a query is pre-generated elsewhere (file, etc) and contains question marks, but we did not want to do any client-side binding.
The context: this crate uses custom parameters (implemented before CH started to support server-side parameters).
Now, server-side parameters can be used, as shown in https://github.com/ClickHouse/clickhouse-rs/issues/142#issuecomment-2341182968 (not an ideal solution, but already possible). Users can use server-side parameters and don't realize that their queries can fail due to an extra ? in the query.
The idea here has a drawback: users will get an invalid query if they want to use ? but forget all .bind() calls. However, they will get a runtime error from CH in this case, so it's not a big difference.
Could it be solved with https://github.com/ClickHouse/clickhouse-rs/pull/243 ?
Perhaps we can consider this issue closed by that PR, yes.
I agree with closing this, preferring alternative API to cluttered semantics
For comparison with an API that does this, in https://github.com/PeerDB-io/peerdb/pull/3073 I had to stop using parameters completely with clickhouse-go unless I was going to have separate functions for escaping literals/identifiers based on whether the query had parameters or not
I'm against closing this one and don't think that #243 and this issue are replaceable.
Let's start with common statements about our goals regarding the query API (IMO obviously):
- Provide a concise API and use it in most examples.
- Encourage users to use
param()overbind(). - Still support
bind()for advanced usages. - Support
?fields(or some alternative syntax) but probably force users to decide whether they want this or not.
Pros of query_raw:
- It solves (2). However, this issue also solves (2).
Cons of query_raw:
- It's longer than
query(), so it's strange to consider this as some "default API". - It completely disables (4), which is, in my practice, used everywhere.
- It completely forbids (2).
- (can be fixed) It's available at the top level, forcing people to think about it too early.
Pros of lazy interpretation of ? (this issue):
- It follows (1), leaving the current API as is.
- It solves (2), preventing accidental replacements until a user directly uses
bind(). - It allows (3) and is even backward compatible for users already using it.
- It supports (4).
Cons of lazy interpretation of ?:
- It doesn't target the problem of disabling
?fieldsin (4). - Users will get an invalid query if they want to use
?but forget all.bind()calls.
It seems to me that this issue is more appropriate as a default solution than query_raw() API.
According to the issues noted above, we lack only disabling ?fields in (4), so let's dedicate ourselves to this topic.
Note also that in API without T: Row, this replacement makes no sense, for instance, in the fetch_bytes() method.
So, let's consider an alternative solution where we force users to think whether they want or not to replace ?fields:
// replace ?fields, do not replace ?
client.query("SELECT ?fields FROM table").fetch::<MyRow, true>()?
// replace nothing ("query_raw")
client.query("SELECT a,b FROM table").fetch::<MyRow, false>()?
// replace ?fields and ?
client.query("SELECT ?fields FROM table WHERE a = ?").bind(42).fetch::<MyRow, true>()?
It's a breaking change, but caught at compile time. I don't like the non-descriptive "true" and "false" here, probably there are better options.
Another way is to introduce a new bind_fields() method and disable ?fields by default. However, this will require passing the name of the struct twice, and can be used in the wrong way (call .bind_fields::<A>().fetch::<B>()). Besides this, it's a runtime breaking change which is much worse than the previous option.
We can also add a method to disable ?fields, replacing it while leaving it replace-by-default.
@ozgurcancal @serprex @slvrtrn what do you think? To clarify: I'm not opposed to query_raw (although I probably would be against leaving it at the top level); instead, I'm referring to the default query API that isn't covered by query_raw.
I think @loyd has a valid point.
Considering boolean constants:
client.query("SELECT ?fields FROM table WHERE a = ?")
.bind(42)
.fetch::<MyRow, true>()?
I wonder if we could make bitmask enum work using const magic:
enum QueryInterpolation { None = 0x01, Fields = 0x02, Bind = 0x04 }
// ... add extra traits to QueryInterpolation for type-safety, so it's not just u8
// ... use binds for both
client.query("SELECT ?fields FROM table WHERE a = ?") // ?fields bind works!
.bind(42) // also works!
.fetch::<MyRow, QueryInterpolation::Fields | QueryInterpolation::Bind>()
// ... use just ?fields
client.query("SELECT ?fields FROM table WHERE a = ?") // only ?fields bind works!
.bind(42) // and this one is ignored!
.fetch::<MyRow, QueryInterpolation::Fields>()
// ... use just ?, ignore ?fields
client.query("SELECT ?fields FROM table WHERE a = ?") // ?fields ignored!
.bind(42) // only this one works!
.fetch::<MyRow, QueryInterpolation::Bind>()
// ... or even disable both
client.query("SELECT ?fields FROM table WHERE a = ?") // ?fields ignored!
.bind(42) // ignored!
.fetch::<MyRow, QueryInterpolation::None>()
then we add QueryInterpolation::Fields | QueryInterpolation::Params as the default in the impl, and perhaps that won't even be a breaking change?
In addition to your comments, the bind function is also used with execute, so it would be verbose to handle it in the fetch function for ex in here:
client
.query("DROP DATABASE IF EXISTS ?")
.with_option("wait_end_of_query", "1")
.bind(Identifier(&db_name))
.execute()
.await
.unwrap_or_else(|err| panic!("cannot drop db {db_name}, cause: {err}"));
I think we should move the interpolation template parameters to the Query struct. So user can either use the query as-is (with the default value), or explicitly pass Query<QI::Bind> when needed.