
Is there a place to put an initial schema with DDL?

Open yanjunz97 opened this issue 3 years ago • 8 comments

I plan to create the initial tables on a ClickHouse cluster with distributed DDL queries like the following.

CREATE TABLE events_local ON CLUSTER '{cluster}' (
    time DateTime,
    event_id  Int32,
    uuid UUID
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/{table}', '{replica}')
PARTITION BY toYYYYMM(time)
ORDER BY (event_id);

CREATE TABLE events_main ON CLUSTER '{cluster}' AS events_local
ENGINE = Distributed('{cluster}', default, events_local, rand());

Usually we can put initialization SQL in /docker-entrypoint-initdb.d. But when I use the code above as the init script, I receive the following errors.

 (version 21.11.11.1 (official build))
2022.05.06 04:03:28.640488 [ 181 ] {} <Debug> DNSResolver: Updated DNS cache
2022.05.06 04:03:29.282773 [ 251 ] {} <Debug> DDLWorker: Initializing DDLWorker thread
2022.05.06 04:03:29.313217 [ 251 ] {} <Debug> DDLWorker: Initialized DDLWorker thread
2022.05.06 04:03:29.313475 [ 251 ] {} <Debug> DDLWorker: Scheduling tasks
2022.05.06 04:03:29.317209 [ 251 ] {} <Debug> DDLWorker: Will schedule 1 tasks starting from query-0000000000
2022.05.06 04:03:29.329013 [ 251 ] {} <Error> DNSResolver: Cannot resolve host (chi-clickhouse-clickhouse-0-1), error 0: chi-clickhouse-clickhouse-0-1.
2022.05.06 04:03:29.329476 [ 251 ] {} <Error> DDLWorker: Unexpected error, will try to restart main thread:: Code: 198. DB::Exception: Not found address of host: chi-clickhouse-clickhouse-0-1. (DNS_ERROR), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x9b7f194 in /usr/bin/clickhouse
1. ? @ 0x9c2c1d1 in /usr/bin/clickhouse
2. ? @ 0x9c2c9a2 in /usr/bin/clickhouse
3. DB::DNSResolver::resolveAddress(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned short) @ 0x9c2d8a3 in /usr/bin/clickhouse
4. DB::HostID::isLocalAddress(unsigned short) const @ 0x11f7142b in /usr/bin/clickhouse
5. DB::DDLTask::findCurrentHostID(std::__1::shared_ptr<DB::Context const>, Poco::Logger*) @ 0x11f743a1 in /usr/bin/clickhouse
6. DB::DDLWorker::initAndCheckTask(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, std::__1::shared_ptr<zkutil::ZooKeeper> const&) @ 0x11f7dd03 in /usr/bin/clickhouse
7. DB::DDLWorker::scheduleTasks(bool) @ 0x11f81548 in /usr/bin/clickhouse
8. DB::DDLWorker::runMainThread() @ 0x11f7b4cd in /usr/bin/clickhouse
9. ThreadFromGlobalPool::ThreadFromGlobalPool<void (DB::DDLWorker::*)(), DB::DDLWorker*>(void (DB::DDLWorker::*&&)(), DB::DDLWorker*&&)::'lambda'()::operator()() @ 0x11f8fcda in /usr/bin/clickhouse
10. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x9bc1217 in /usr/bin/clickhouse
11. ? @ 0x9bc4c1d in /usr/bin/clickhouse
12. start_thread @ 0x9609 in /usr/lib/x86_64-linux-gnu/libpthread-2.31.so
13. __clone @ 0x122293 in /usr/lib/x86_64-linux-gnu/libc-2.31.so

It seems there is a deadlock: the pod is waiting for other replicas to be ready, while Kubernetes does not schedule another pod until the first replica is ready.

I've verified that these distributed DDL queries can be run manually in clickhouse-client once every pod is ready. But is there a recommended way to automate this initialization?

yanjunz97 avatar May 17 '22 04:05 yanjunz97

See https://github.com/Altinity/clickhouse-operator/blob/master/docs/chi-examples/02-templates-07-bootstrap-schema.yaml

Don't use ON CLUSTER in this case; if you put the init script on every node, it will be executed automatically. Alternatively, you can run it on the first node only and let the operator do the rest.

alex-zaitsev avatar May 24 '22 16:05 alex-zaitsev
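The linked example mounts the schema into /docker-entrypoint-initdb.d via a pod template, so every replica bootstraps itself on first start. A minimal sketch of that approach, reusing the DDL from the question with the ON CLUSTER clause dropped (the ConfigMap name, CHI name, and image tag here are illustrative, not from the example):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: bootstrap-schema
data:
  01_create_tables.sql: |
    CREATE TABLE IF NOT EXISTS events_local (
        time DateTime,
        event_id Int32,
        uuid UUID
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/{table}', '{replica}')
    PARTITION BY toYYYYMM(time)
    ORDER BY (event_id);
---
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: bootstrap
spec:
  defaults:
    templates:
      podTemplate: pod-template
  templates:
    podTemplates:
      - name: pod-template
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:22.3
              volumeMounts:
                # Every file in this directory runs on first start of the server.
                - name: bootstrap-schema
                  mountPath: /docker-entrypoint-initdb.d
          volumes:
            - name: bootstrap-schema
              configMap:
                name: bootstrap-schema
```

Because each node runs the script locally, no cross-node DNS resolution is needed during startup, which sidesteps the deadlock described above.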

See https://github.com/Altinity/clickhouse-operator/blob/master/docs/chi-examples/02-templates-07-bootstrap-schema.yaml

Don't use ON CLUSTER in this case; if you put the init script on every node, it will be executed automatically. Alternatively, you can run it on the first node only and let the operator do the rest.

Thanks @alex-zaitsev. Removing ON CLUSTER works for the initialization. But I notice the pod goes into CrashLoopBackOff with the following errors when it is restarted. Do you know how to deal with this?

Received exception from server (version 22.3.6):
Code: 253. DB::Exception: Received from localhost:9000. DB::Exception: Replica /clickhouse/tables/bootstrap/0/events_local/replicas/chi-bootstrap-bootstrap-0-0 already exists. (REPLICA_IS_ALREADY_EXIST)
(query: CREATE TABLE IF NOT EXISTS events_local (
    time DateTime,
    event_id  Int32,
    uuid UUID
  )
  ENGINE = ReplicatedMergeTree('/clickhouse/tables/{cluster}/{shard}/{table}', '{replica}')
  PARTITION BY toYYYYMM(time)
  ORDER BY (event_id);)

yanjunz97 avatar May 24 '22 21:05 yanjunz97
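A likely cause of this CrashLoopBackOff: with an ephemeral data volume, a restarted pod comes up with an empty data directory, so /docker-entrypoint-initdb.d runs again, while the replica's registration from the previous run still exists in ZooKeeper (IF NOT EXISTS does not help, since the table no longer exists locally). Using a persistent volume for /var/lib/clickhouse avoids this entirely. Alternatively, a sketch of cleaning up the stale entry before re-creating the table, with the replica name and ZooKeeper path taken from the error log above (adjust to your cluster):

```sql
-- Drops the stale ZooKeeper registration left by the previous pod
-- incarnation. This only succeeds while the table does not exist
-- locally, which is exactly the fresh-restart situation here.
SYSTEM DROP REPLICA 'chi-bootstrap-bootstrap-0-0'
    FROM ZKPATH '/clickhouse/tables/bootstrap/0/events_local';
```

Running this (or `|| true`-guarding it in the init script) before the CREATE TABLE makes the bootstrap idempotent across restarts.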

I suppose it would be possible to add the ability for the operator to run a Job from a template whenever it successfully completes an installation or update.

R-omk avatar May 25 '22 09:05 R-omk

At the moment, in our Helm chart, we run a Job that waits until each pod of the cluster is reachable over the network and then executes cluster DDL queries on any pod. But this is not entirely correct; really, we need to wait for a signal from the cluster object that the reconcile process has completed.

R-omk avatar May 25 '22 09:05 R-omk
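The "wait for the reconcile signal" part can be approximated by polling the CHI object's status field, which the operator sets to Completed once reconciliation finishes (the same field the Ansible workaround later in this thread checks; the CHI name `chi` here is an assumption):

```shell
#!/bin/bash
# Poll the ClickHouseInstallation until the operator reports the
# reconcile as finished, then it is safe to run cluster DDL.
until [ "$(kubectl get clickhouseinstallation chi -o jsonpath='{.status.status}')" = "Completed" ]; do
  sleep 15
done
```

This is still polling rather than a true completion signal, but it is more reliable than probing pod network reachability.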

Don't use ON CLUSTER in this case; if you put the init script on every node, it will be executed automatically. Alternatively, you can run it on the first node only and let the operator do the rest.

Sorry, I didn't get your point here @alex-zaitsev. What if I need a certain distributed table to be available once the cluster is ready? Removing ON CLUSTER doesn't seem to lead to the same result.

I suppose it would be possible to add the ability for the operator to simply run some Job according to a template whenever it successfully completes the installation or update process.

I ran into this issue too, and I suppose a field on the CRD for DDL schema initialization, triggered by the operator once it receives the cluster-available signal, would be the best solution.

czephyr avatar Jun 29 '23 14:06 czephyr
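No such field exists in the operator as of this thread; purely as an illustration of the proposal, a hypothetical shape might be (every field under `bootstrap` is invented here, NOT an existing clickhouse-operator API):

```yaml
# Hypothetical CRD extension: SQL the operator would run once the
# reconcile status reaches Completed.
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
spec:
  bootstrap:
    onReady:
      sql: |
        CREATE TABLE IF NOT EXISTS events_main ON CLUSTER '{cluster}' AS events_local
        ENGINE = Distributed('{cluster}', default, events_local, rand());
```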

Any updates on this?

czephyr avatar Jul 27 '23 09:07 czephyr

My workaround using Ansible at the moment:


  - name: Stopping until cluster is ready
    kubernetes.core.k8s_info:
      kind: ClickHouseInstallation
      name: chi
    register: result
    until: result.resources | json_query('[*].status.status') | select('match','Completed') | list | length == 1
    retries: 20
    delay: 15
 
  - name: Finding out name of the pod hosting shard 0
    kubernetes.core.k8s_info:
      kind: Pod
      label_selectors:
        - clickhouse.altinity.com/shard = 0
    register: result
  
  - name: Executing sh with DDL mounted through configmap
    kubernetes.core.k8s_exec:
      pod: "{{ result.resources[0].spec.hostname }}"
      command: ./opt/ddl-provision-script/initdbScript.sh 

The shell script, mounted from a ConfigMap:

  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: ddl-tables-configmap
  data:
    initdbScript.sh: |
      #!/bin/bash
      set -e
  
      clickhouse client -n <<-EOSQL
          ~DDL queries here~ 
      EOSQL

czephyr avatar Aug 30 '23 13:08 czephyr