clickhouse-docs
Improve replication documentation
Describe the issue
I am using the guide at https://clickhouse.com/docs/en/architecture/replication, and it took me two days to get replication working because the guide's instructions were not clear.
In the section https://clickhouse.com/docs/en/architecture/replication#replication-and-sharding-configuration, self-hosters who run the servers and keepers on different machines, as the guide recommends, should follow these steps to configure config.xml:
- On each server, run on the command line: hostname --fqdn
- Use exactly that value for the host under <remote_servers>; otherwise replication will not work at all!
Example:
<replica>
    <host>dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal</host>
    <port>9000</port>
</replica>
<replica>
    <host>dev-us-east-infra.us-east4-a.c.leftoverstoday-dev.internal</host>
    <port>9000</port>
</replica>
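A rough way to check this before restarting anything is sketched below in Python. It only compares strings: it lists every host under <remote_servers> and reports whether a given name is among them. The function names, the SAMPLE_CONFIG text, and the example_cluster name are made up for illustration and are not part of the guide. Note that socket.getfqdn() may not always return the same value as hostname --fqdn, so it is safer to pass the command's output in explicitly.

```python
import socket
import xml.etree.ElementTree as ET

# Hypothetical config fragment; "example_cluster" is an invented cluster name.
SAMPLE_CONFIG = """
<clickhouse>
  <remote_servers>
    <example_cluster>
      <shard>
        <replica>
          <host>dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal</host>
          <port>9000</port>
        </replica>
        <replica>
          <host>dev-us-east-infra.us-east4-a.c.leftoverstoday-dev.internal</host>
          <port>9000</port>
        </replica>
      </shard>
    </example_cluster>
  </remote_servers>
</clickhouse>
"""


def replica_hosts(config_xml):
    """Return every <host> found in a <replica> under <remote_servers>."""
    root = ET.fromstring(config_xml)
    hosts = []
    for remote_servers in root.iter("remote_servers"):
        for replica in remote_servers.iter("replica"):
            host = replica.findtext("host")
            if host:
                hosts.append(host.strip())
    return hosts


def local_host_is_listed(config_xml, local_fqdn=None):
    """Check whether this machine's name appears verbatim in the host list.

    Pass the output of `hostname --fqdn` as local_fqdn; the fallback to
    socket.getfqdn() is an assumption and may give a different name.
    """
    fqdn = local_fqdn or socket.getfqdn()
    return fqdn in replica_hosts(config_xml)


if __name__ == "__main__":
    for host in replica_hosts(SAMPLE_CONFIG):
        print(host)
    print("local host listed:", local_host_is_listed(SAMPLE_CONFIG))
```

With the sample above, the full name dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal is found, while the short name dev-us-west-infra is not, which is the mismatch described in this issue.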
Additional context
Ideally, SELECT hostname(); should also return the complete name dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal instead of just dev-us-west-infra. The short name is confusing, and it is not what system.distributed_ddl_queue uses as the initiator_host.
Please publish a self-hosting guide in which each component runs on a different VM and the configured values match what the system actually expects.
Without the above fix, you end up with this issue: https://github.com/ClickHouse/ClickHouse/issues/18341
DDLWorker: Will not execute task query-0000000005: There is no a local address in host list
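To spot this symptom in a server log, something like the following Python sketch may help. The pattern is copied from the message quoted above and has not been checked against other ClickHouse versions; the function name and the sample text are illustrative only.

```python
import re

# Pattern built from the log message quoted in this issue (wording kept verbatim).
SKIPPED_TASK = re.compile(
    r"DDLWorker: Will not execute task (\S+): "
    r"There is no a local address in host list"
)


def skipped_ddl_tasks(log_text):
    """Return the names of DDL tasks the worker refused to run locally."""
    return SKIPPED_TASK.findall(log_text)


if __name__ == "__main__":
    sample = (
        "DDLWorker: Will not execute task query-0000000005: "
        "There is no a local address in host list\n"
    )
    print(skipped_ddl_tasks(sample))
```

If the list is non-empty, the host values under <remote_servers> are worth comparing against the output of hostname --fqdn on that server.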
Ideally SELECT hostname(); should also return this complete name dev-us-west-infra.us-west1-a.c.leftoverstoday-dev.internal instead of just returning dev-us-west-infra because it's confusing and the short name is not what is used in system.distributed_ddl_queue as the initiator_host.
FYI, there is SELECT fqdn(); for that.
There are also two features that remove the need to maintain cluster configuration files by hand, which is error-prone and hard to keep up to date:
- https://clickhouse.com/docs/en/engines/database-engines/replicated
- https://clickhouse.com/docs/en/operations/cluster-discovery
Oh wow, auto discovery is really cool! I wish the guide had sections pointing to these tools so that they are easy to discover and use. Thank you for sharing.