clickhouse.rs icon indicating copy to clipboard operation
clickhouse.rs copied to clipboard

Interpret `?` as a parameter only if `.bind` was called at least once

Open slvrtrn opened this issue 1 year ago • 7 comments

Related to https://github.com/ClickHouse/clickhouse-rs/pull/154

Considering this query:

client
    .query("SELECT 1 FROM test WHERE a IN 'a?b'")
    .fetch_one::<u8>()
    .execute()
    .await?;

Currently, an unescaped ? is still considered a missing parameter; after the fix, the query should be written as:

client
    .query("SELECT 1 FROM test WHERE a IN 'a??b'")
    .fetch_one::<u8>()
    .execute()
    .await?;

This might be problematic if a query is pre-generated elsewhere (file, etc) and contains question marks, but we did not want to do any client-side binding.

slvrtrn avatar Sep 23 '24 14:09 slvrtrn

The context: this crate uses custom parameters (implemented before CH started to support server-side parameters). Now, server-side parameters can be used, as shown in https://github.com/ClickHouse/clickhouse-rs/issues/142#issuecomment-2341182968 (not an ideal solution, but already possible). Users can use server-side parameters and don't realize that their queries can fail due to an extra ? in the query.

The idea here has a drawback: users will get an invalid query if they want to use ? but forget all .bind() calls. However, they will get a runtime error from CH in this case, so it's not a big difference.

loyd avatar Sep 24 '24 07:09 loyd

Could it be solved with https://github.com/ClickHouse/clickhouse-rs/pull/243 ?

ozgurcancal avatar Jun 26 '25 17:06 ozgurcancal

Perhaps we can consider this issue closed by that PR, yes.

slvrtrn avatar Jun 26 '25 17:06 slvrtrn

I agree with closing this, preferring alternative API to cluttered semantics

For comparison with an API that does this, in https://github.com/PeerDB-io/peerdb/pull/3073 I had to stop using parameters completely with clickhouse-go unless I was going to have separate functions for escaping literals/identifiers based on whether the query had parameters or not

serprex avatar Jun 26 '25 18:06 serprex

I'm against closing this one and don't think that #243 and this issue are replaceable.

Let's start with common statements about our goals regarding the query API (IMO obviously):

  1. Provide a concise API and use it in most examples.
  2. Encourage users to use param() over bind().
  3. Still support bind() for advanced usages.
  4. Support ?fields (or some alternative syntax) but probably force users to decide whether they want this or not.

Pros of query_raw:

  • It solves (2). However, this issue also solves (2).

Cons of query_raw:

  • It's longer than query(), so it's strange to consider this as some "default API".
  • It completely disables (4), which is, in my practice, used everywhere.
  • It completely forbids (2).
  • (can be fixed) It's available at the top level, forcing people to think about it too early.

Pros of lazy interpretation of ? (this issue):

  • It follows (1), leaving the current API as is.
  • It solves (2), preventing accidental replacements until a user directly uses bind().
  • It allows (3) and is even backward compatible for users already using it.
  • It supports (4).

Cons of lazy interpretation of ?:

  • It doesn't target the problem of disabling ?fields in (4).
  • Users will get an invalid query if they want to use ? but forget all .bind() calls.

It seems to me that this issue is more appropriate as a default solution than query_raw() API.

According to the issues noted above, we lack only disabling ?fields in (4), so let's dedicate ourselves to this topic. Note also that in API without T: Row, this replacement makes no sense, for instance, in the fetch_bytes() method.

So, let's consider an alternative solution where we force users to think whether they want or not to replace ?fields:

// replace ?fields, do not replace ?
client.query("SELECT ?fields FROM table").fetch::<MyRow, true>()?

// replace nothing ("query_raw")
client.query("SELECT a,b FROM table").fetch::<MyRow, false>()?

// replace ?fields and ?
client.query("SELECT ?fields FROM table WHERE a = ?").bind(42).fetch::<MyRow, true>()?

It's a breaking change, but caught at compile time. I don't like the non-descriptive "true" and "false" here, probably there are better options.

Another way is to introduce a new bind_fields() method and disable ?fields by default. However, this will require passing the name of the struct twice, and can be used in the wrong way (call .bind_fields::<A>().fetch::<B>()). Besides this, it's a runtime breaking change which is much worse than the previous option.

We can also add a method to disable ?fields, replacing it while leaving it replace-by-default.

@ozgurcancal @serprex @slvrtrn what do you think? To clarify: I'm not opposed to query_raw (although I probably would be against leaving it at the top level); instead, I'm referring to the default query API that isn't covered by query_raw.

loyd avatar Jul 02 '25 07:07 loyd

I think @loyd has a valid point.

Considering boolean constants:

client.query("SELECT ?fields FROM table WHERE a = ?")
  .bind(42)
  .fetch::<MyRow, true>()?

I wonder if we could make bitmask enum work using const magic:

enum QueryInterpolation { None = 0x01, Fields = 0x02, Bind = 0x04 }

// ... add extra traits to QueryInterpolation for type-safety, so it's not just u8

// ... use binds for both

client.query("SELECT ?fields FROM table WHERE a = ?") // ?fields bind works!
  .bind(42) // also works!
  .fetch::<MyRow, QueryInterpolation::Fields | QueryInterpolation::Bind>()

// ... use just ?fields

client.query("SELECT ?fields FROM table WHERE a = ?") // only ?fields bind works!
  .bind(42) // and this one is ignored!
  .fetch::<MyRow, QueryInterpolation::Fields>()

// ... use just ?, ignore ?fields

client.query("SELECT ?fields FROM table WHERE a = ?") // ?fields ignored!
  .bind(42) // only this one works!
  .fetch::<MyRow, QueryInterpolation::Bind>()

// ... or even disable both

client.query("SELECT ?fields FROM table WHERE a = ?") // ?fields ignored!
  .bind(42) // ignored!
  .fetch::<MyRow, QueryInterpolation::None>()

then we add QueryInterpolation::Fields | QueryInterpolation::Params as the default in the impl, and perhaps that won't even be a breaking change?

slvrtrn avatar Jul 02 '25 11:07 slvrtrn

In addition to your comments, the bind function is also used with execute, so it would be verbose to handle it in the fetch function for ex in here:


client
    .query("DROP DATABASE IF EXISTS ?")
    .with_option("wait_end_of_query", "1")
    .bind(Identifier(&db_name))
    .execute()
    .await
    .unwrap_or_else(|err| panic!("cannot drop db {db_name}, cause: {err}"));

I think we should move the interpolation template parameters to the Query struct. So user can either use the query as-is (with the default value), or explicitly pass Query<QI::Bind> when needed.

ozgurcancal avatar Jul 24 '25 19:07 ozgurcancal