dbt-clickhouse
dbt-clickhouse copied to clipboard
feat: allow to add connection overrides for dictionaries
Summary
This PR fixed an issue in which ClickHouse was using the configured host to load the dictionary, however, this would fail if it was running on Docker and would be slower if it was using a URL to connect to ClickHouse. It implements a new config for the dictionary model allowing people to override connection params.
Before
SOURCE(CLICKHOUSE(USER 'ch_admin' PASSWORD '[HIDDEN]' HOST 'clickhouse' PORT '8123' QUERY ...)
After
SOURCE(CLICKHOUSE(USER 'ch_admin' PASSWORD '[HIDDEN]' QUERY ...)
Checklist
Delete items not relevant to your PR:
- [ ] Unit and integration tests covering the common scenarios were added
- [ ] A human-readable description of the changes was provided to include in CHANGELOG
- [ ] For significant changes, documentation in https://github.com/ClickHouse/clickhouse-docs was updated with further explanations or tutorials
@BentsiLeviav This PR is ready for review
@BentsiLeviav, I was literally about to create an issue asking for very similar change. Only difference is I think it might be nice to support omitting credentials altogether for the table source? i.e.
SOURCE(CLICKHOUSE(TABLE 'table_name'))
Specifying the user in our case fails in our development environment because developers are not able to directly authenticate to ClickHouse from the server itself. As a result we get an auth error.
I think @rjoelnorgren brings up a good point. Would it be possible to update the credentials
dict with connection_overrides
and, if a given field is empty, omit that from the configuration?
As an example, here's what I imagine a potential model definition to look like (using the example in the dictionary test):
{{ config(
materialized='dictionary',
fields=[
('id', 'Int32'),
('name', 'String'),
('hacker_alias', 'String')
],
primary_key='id',
layout='COMPLEX_KEY_HASHED()',
lifetime='1',
source_type='clickhouse',
connection_overrides={
"host": "",
"port": "",
},
) }}
select
id,
name,
case
when name like 'Dade' then 'crash_override'
when name like 'Kate' then 'acid burn'
when name like 'Eugene' then 'the plague'
else 'N/A'
end as hacker_alias
from {{ source('raw', 'people') }}
and the subsequent dictionary DDL would omit the HOST
and PORT
sections of the configuration. Does that make sense?