Cluster support for clickhouse
Feature description
ALTER, CREATE, DROP queries in multiple nodes Clickhouse setup should include "on cluster <cluster_name>" clause to run query on all nodes in cluster. Now it seems that it has not been implemented.
Are you a dlt user?
I'd consider using dlt, but it's lacking a feature I need.
Use case
Correct work with multiple nodes setup
Proposed solution
on_cluster_mode and cluster_name added to clickhouse.credentials all ddl parameterised with "on cluster"
Related issues
https://github.com/dlt-hub/dlt/issues/1183
same issue here.
Hi, We have the same issue, we have sharded setup and additionaly to ON CLUSTER we also need to create a base and a distributed table pair to have the full data set available on all replicas. I have modified the execute_query function in the ClickHouse sql_client to modify queries generated by dlt to work with the distributed ClickHouse setup. What we do: CREATE TABLE:
- add ON CLUSTER
- add additional create statement so that we create the pair base table and distributed table DROP TABLE:
- add ON CLUSTER
- add additional statement to drop both table ALTER TABLE
- add ON CLUSTER
- add additional statement to alter both table DELETE FROM
- adjust to delete from the base table
The base table would use ReplicatedMergeTree engine and the distributed table would use theDistributed engine and reference the base table
I also added additional configuration parameters distributed_tables = true cluster = "cluster_name" base_table_name_postfix = "base" base_table_database_prefix = ""
Only if distributed_tables is set to true and cluster is set, will the queries get adjusted. I'll create an PR soon. Main issue is, that the code cannot be tested without having Clickhouse setup with replicas and distributed tables. I can also copy the code here if anyone want's to try it out.
Here is the PR if you wanna see the code changes
Do you think we could add tests in the same way they are made in DBT-ClickHouse? For example, see https://github.com/clickhouse/dbt-clickhouse/tree/main/tests/integration.
Do you think we could add tests in the same way they are made in DBT-ClickHouse? For example, see https://github.com/clickhouse/dbt-clickhouse/tree/main/tests/integration.
It looks like it should be possible, I can get a better answer next week from teammates. If I get any useful info I'll add it here
Any idea when this will be included in a release?
Any idea when this will be included in a release?
probably next one
Any update of this patch ?
implementation note: based on work done in two PRs linked to this issue we should consider this a separate destination. query rewrite looks massive and reuse for MERGE jobs may not be possible.