dbmate icon indicating copy to clipboard operation
dbmate copied to clipboard

Support all DSN variants for clickhouse DATABASE_URL

Open StarpTech opened this issue 2 years ago • 8 comments

Description Hello, the clickhouse driver should support all DSN variants. Currently, I need to pass two connection strings to use dbmate with an application that uses clickhouse HTTP protocol.

  • Version: 2.6.0
  • Database: clickhouse
  • Operating System: Linux

Steps To Reproduce

DATABASE_URL=http://default:changeme@localhost:8123/default

Expected Behavior

HTTP should just work. The clickhouse driver has a clickhouse.ParseDSN function that does the job for all variants. I can provide a PR.

DATABASE_URL=http://default:changeme@localhost:8123/default

StarpTech avatar Oct 18 '23 16:10 StarpTech

I would be in favor of supporting this "experimental" HTTP transport, based on the ClickHouse README change on Jul 8, 2022 and still marked experimental today, as long as the database URL driver is prefixed, for example, clickhouse+http://.

dossy avatar Nov 15 '23 17:11 dossy

Related discussion: #474.

dossy avatar Nov 16 '23 17:11 dossy

Well, even if we do add the support for http client for clickhouse, this won't change the fact you will need to use a different DATABASE_URL for dbmate, as we still need a db-specific prefix to map to the correct driver.

@dossy In my opinion, we can just the update the scheme to be "http" instead of "clickhouse" in case the user provided a port of "8123" (http) or 8443 (https). Instead of using a new specific driver prefix.

What do you think?

FatherCandle avatar Nov 25 '23 16:11 FatherCandle

@dossy In my opinion, we can just the update the scheme to be "http" instead of "clickhouse" in case the user provided a port of "8123" (http) or 8443 (https). Instead of using a new specific driver prefix.

What do you think?

I'm not a fan of the idea of using the port number to infer the transport layer: it could lead to unintended surprises when a user, not aware of this "magical" association, uses a port that suddenly changes the transport layer.

We often use the same port numbers for particular services out of tradition, because most services are not officially registered with IANA. But, ultimately, port numbers are arbitrary: there is nothing technically stopping you from running your database bound to port 1/tcp, aside from some antiquated "binding to privileged ports" restriction which is simply a matter of configuration, or using a port redirection like NAT.

Requiring a user to specify the transport in the URL, e.g., clickhouse:// vs. clickhouse+http:// or clickhouse+https://, is explicit and clear. There is no question as to what database driver, and what transport layer protocol will be used, when looking at the URL.

Now, I think the idea of defaulting the port, if not specified in the URL, depending on the URL scheme, may make sense if there is an expectation that a particular port will be used by default. In other words, specifying clickhouse+http:// as the scheme could default to port 8123 if a port is not specified explicitly in the URL, and clickhouse+https:// default to port 8443 if not explicitly specified in the URL. If those are the two most commonly used and expected ports by ClickHouse users who use the HTTP/HTTPS transports, then I think it makes sense to use those as defaults.

dossy avatar Nov 25 '23 19:11 dossy

Thanks for the elaborated response, I agree with the reasonings + I think that supplying the default based on scheme is a great idea. 8123 and 8443 are indeed the acceptable ports for http and https by clickhouse as can be read from the docs: https://clickhouse.com/docs/en/guides/sre/network-ports

I think I will try to tackle this one

FatherCandle avatar Nov 25 '23 21:11 FatherCandle

Agree with @dossy, we cannot register something generic like http:// to Clickhouse. clickhouse+http:// is a good solution and is clear.

If we want to support interop with existing clickhouse connection strings, then we could potentially offer a new optional CLI flag to override the driver. For example something like:

dbmate --url "http://default:changeme@localhost:8123/default" --driver clickhouse up

Thus the clickhouse driver could support both clickhouse+http:// and http://, but only clickhouse+http:// would automatically load the correct driver. I haven't thought through all the implications of this, but I think it might work.

amacneil avatar Nov 25 '23 21:11 amacneil

If we want to support interop with existing clickhouse connection strings, then we could potentially offer a new optional CLI flag to override the driver.

I'm in favor of this approach, because ClickHouse users can use the same DATABASE_URL string with code that directly uses the clickhouse-go client or dbmate. Does it make sense to use DBMATE_DRIVER as an environment variable for specifying the driver, too?

dossy avatar Nov 26 '23 03:11 dossy

Yeah DBMATE_DRIVER env var makes sense too - matches the other supported env vars

amacneil avatar Nov 26 '23 06:11 amacneil