clickhouse-sink-connector
Do not parse Engine from ClickHouse table definition
The Sink Connector tries to parse the ClickHouse table engine definition to get the names of the version and deleted columns for ReplacingMergeTree.
That causes problems when tables are created with custom structures and other engines, such as ReplicatedReplacingMergeTree, CollapsingMergeTree, ReplicatedVersionedCollapsingMergeTree, EmbeddedRocksDB, Null, etc. It would be difficult to write robust code and test it against such a wide variety of engines, and new engines could be added to ClickHouse in the future. It is better not to parse the engine definition at all.
The Sink Connector only needs the column list and column types for processing, so it could parse only those. The _version and _deleted columns could have fixed names, or their names could be defined in the config file:
clickhouse.table.version: "_version"
clickhouse.table.deleted: "_deleted"
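A minimal sketch of that idea, not the connector's actual code: the two config keys come from the proposal above, while the function name `resolve_special_columns`, the defaults, and the `(name, type)` column pairs (for example, rows read from `system.columns`) are assumptions made for illustration. The engine clause is never inspected.

```python
# Hypothetical helper: resolve the version/deleted column names from config
# and the table's column list only, without parsing the ENGINE clause.
DEFAULT_VERSION_COLUMN = "_version"
DEFAULT_DELETED_COLUMN = "_deleted"


def resolve_special_columns(config, columns):
    """Return the version/deleted column names present in the table.

    config  -- dict of connector settings (keys as proposed in this issue)
    columns -- list of (name, type) pairs, e.g. as read from system.columns
    """
    version = config.get("clickhouse.table.version", DEFAULT_VERSION_COLUMN)
    deleted = config.get("clickhouse.table.deleted", DEFAULT_DELETED_COLUMN)
    names = {name for name, _type in columns}
    return {
        # None means the table simply has no such column.
        "version": version if version in names else None,
        "deleted": deleted if deleted in names else None,
    }
```

With this approach the same lookup works for any engine, including Null or EmbeddedRocksDB, because only the column names matter.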
Currently RMT (ReplacingMergeTree) is the only supported engine. We may support MergeTree for history tables, but that would be an enhancement.
It's not a small enhancement but a feature that takes the overall functionality of the Connector to a much higher level: MVs (materialized views) in the Kafka Engine style, with any ClickHouse SQL functionality available.
The Null Engine is needed first, not MergeTree.
For example, it could be used as a workaround for aggregation tables until exactly-once delivery is implemented. See the discussion here: https://github.com/Altinity/clickhouse-sink-connector/issues/364#issuecomment-1803237794
Many complicated DWH transformations could be built with the Null Engine and MVs.
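As a rough illustration of the Null Engine plus MV pattern, here is a sketch of the DDL such a setup might use, held as plain strings. All table and column names (`orders_in`, `orders_daily`, `orders_daily_mv`, `created_at`, `amount`) are hypothetical, and this only shows the general shape; it is not the workaround from the linked discussion.

```python
# Hypothetical DDL: the connector would write into a Null table, and a
# materialized view would transform each inserted block into a target table.
NULL_ENGINE_PIPELINE = [
    # 1. Landing table: rows inserted here are not stored, only passed to MVs.
    """
    CREATE TABLE orders_in (
        id UInt64,
        created_at DateTime,
        amount Float64,
        _version UInt64,
        _deleted UInt8
    ) ENGINE = Null
    """,
    # 2. Target table holding partial aggregation states per day.
    """
    CREATE TABLE orders_daily (
        day Date,
        total AggregateFunction(sum, Float64)
    ) ENGINE = AggregatingMergeTree
    ORDER BY day
    """,
    # 3. Materialized view that aggregates each inserted block into the target.
    """
    CREATE MATERIALIZED VIEW orders_daily_mv TO orders_daily AS
    SELECT toDate(created_at) AS day, sumState(amount) AS total
    FROM orders_in
    GROUP BY day
    """,
]
```

The point of the sketch is that the connector would need nothing from `orders_in` beyond its column names and types; all transformation logic lives in ClickHouse SQL.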