airflow-clickhouse-plugin

Define a special connection type for ClickHouse

Open babaralishah opened this issue 3 years ago • 16 comments

Hi! I have been stuck on this issue for a few days and cannot get it working. I am running Apache Airflow on Docker and have installed the ClickHouse plugin in my container. The problem is that I cannot see a connection type for ClickHouse. I have successfully connected MySQL to Airflow, but I am unable to connect ClickHouse. I have attached some screenshots to illustrate the problem: image_2022_01_05T09_56_33_525Z image_2022_01_05T10_00_16_606Z image_2022_01_05T09_56_10_887Z

babaralishah avatar Jan 06 '22 13:01 babaralishah

@babaralishah I've faced the same issue. Connection types are registered from provider.yaml files (like this). It is explained here in the official docs. Might be a good idea to create a community provider from this plugin, btw.

For now, the workaround would be to create a new connection via an API call or via the CLI: airflow connections add 'clickhouse_connection' --conn-uri 'clickhouse://user:password@host'. Bear in mind that if you edit a connection created this way in the UI, its connection type might be overwritten by whatever the UI shows.
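As a rough sketch of the CLI workaround, the snippet below assembles the connection URI and the matching `airflow connections add` arguments in Python. The helper name `build_clickhouse_uri` and the sample credentials are made up for illustration. Percent-encoding the user and password is an assumption that only matters when they contain characters such as `@` or `/`.

```python
import shlex
from urllib.parse import quote


def build_clickhouse_uri(user, password, host, port=None):
    """Return a clickhouse:// URI with percent-encoded credentials (hypothetical helper)."""
    auth = quote(user, safe="")
    if password:
        auth += ":" + quote(password, safe="")
    netloc = f"{auth}@{host}"
    if port is not None:
        netloc += f":{port}"
    return f"clickhouse://{netloc}"


# Sample values only; replace them with your own credentials and host.
uri = build_clickhouse_uri("user", "p@ss/word", "host", 9000)
command = ["airflow", "connections", "add", "clickhouse_connection", "--conn-uri", uri]

# Print a shell-quoted version of the command that can be pasted into a terminal.
print(shlex.join(command))
```

Building the argument list first and quoting it with `shlex.join` avoids hand-escaping special characters in the password.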

georborodin avatar Jan 06 '22 15:01 georborodin

Hi! I am unable to find the Airflow CLI. What should I do now? Is it possible to run that command in the Docker CLI?

babaralishah avatar Jan 07 '22 10:01 babaralishah

I ran these commands:
airflow connections add 'clickhouse_connection3' --conn-uri 'clickhouse+native://default@localhost:4040'
airflow connections add 'clickhouse_connection' --conn-uri 'clickhouse://default@localhost:4040'

image

ClickHouse now appears in the connections list, but I am still unable to execute queries and get this error: clickhouse_driver.errors.NetworkError: Code: 210. Cannot assign requested address (localhost:4041)

babaralishah avatar Jan 07 '22 10:01 babaralishah

@babaralishah just docker exec that command in any container running Airflow.

georborodin avatar Jan 07 '22 10:01 georborodin

yeah ran that command: image

babaralishah avatar Jan 07 '22 10:01 babaralishah

@babaralishah where is your ClickHouse server running? This plugin uses NATIVE clickhouse protocol, so most likely you should use port 9000 and make sure the server is available on localhost or use the correct host to connect. You can set the connection type to HTTP.

ne1r0n avatar Jan 07 '22 11:01 ne1r0n

image

@ne1r0n I have shown the ports of my ClickHouse server, which is running in Docker.

babaralishah avatar Jan 07 '22 11:01 babaralishah

@babaralishah you should use the ClickHouse container's IP rather than localhost to access it from another container (Airflow). Inside a container, localhost refers to the container itself. Read more about Docker network configuration.
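One way to check which host name actually reaches ClickHouse from inside the Airflow container is a plain TCP probe. This is a minimal sketch using only the Python standard library; the function name `can_connect` is hypothetical, and it only tests that the port accepts connections, not that the ClickHouse native protocol works.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unresolved host names, and timeouts.
        return False
```

Run inside the Airflow container, `can_connect("localhost", 9000)` would be expected to return False when ClickHouse lives in a different container, while the ClickHouse container's IP (or its service name, if both containers share a Docker network) should return True.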

@georborodin thank you for your help.

Looks like this is an issue with Airflow v2+. In case you are able to configure a connection provider and make a PR, I may review it. But please make sure to test it properly for different versions.

You may also use older versions of Airflow, e.g. v2.0–2.2 are confirmed to be supported.

bryzgaloff avatar Jan 07 '22 12:01 bryzgaloff

@bryzgaloff I would definitely be interested in contributing. If all goes well, I'll provide a PR in a couple of days, the process seems straightforward (at least now 😀). Should I start a separate issue explaining the benefits of turning this plugin into a provider?

georborodin avatar Jan 07 '22 12:01 georborodin

Should I start a separate issue explaining the benefits of turning this plugin into a provider?

Wow, will it change this plugin completely? I am not yet familiar with providers. Are they much different from plugins? I expected that you would only have to define a YAML file to let Airflow know the details of new connection types.

bryzgaloff avatar Jan 07 '22 14:01 bryzgaloff

@bryzgaloff I don't know for sure. If the route from the docs is to be taken, the project layout will need changes, and the code may need to be migrated to the Airflow repo.

I think the provider.yaml file alone might be enough, but I haven't tested it yet.
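For context, here is a rough sketch of what the provider metadata might look like for a package distributed outside the Airflow repo. As far as I understand the Airflow provider docs, such a package exposes an `apache_airflow_provider` entry point whose function returns a dictionary with the same fields as provider.yaml. The package name and the hook class path below are hypothetical placeholders, not this plugin's real names, and this has not been tested against the plugin.

```python
def get_provider_info():
    """Return provider metadata; every value here is an illustrative placeholder."""
    return {
        "package-name": "example-clickhouse-provider",  # hypothetical package name
        "name": "ClickHouse (example)",
        "description": "Example metadata registering a ClickHouse connection type.",
        "connection-types": [
            {
                "connection-type": "clickhouse",
                # Hypothetical import path; it must point at a real hook class.
                "hook-class-name": "example_provider.hooks.clickhouse.ClickHouseHook",
            }
        ],
    }
```

The function would then be referenced from the package's entry points under the `apache_airflow_provider` group so that Airflow can discover it at startup.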

georborodin avatar Jan 07 '22 15:01 georborodin

It started working when I set the connection type to Sqlite and used port 9000, which is the port I have exposed for ClickHouse.

babaralishah avatar Jan 08 '22 17:01 babaralishah

Thank you all for your feedback; connection-related instructions have now been added to the README: https://github.com/whisklabs/airflow-clickhouse-plugin/blob/master/README.md#how-to-create-an-airflow-connection-to-clickhouse

This issue is kept open till we define a special connection type for ClickHouse.

bryzgaloff avatar Aug 08 '22 05:08 bryzgaloff

@babaralishah where is your ClickHouse server running? This plugin uses NATIVE clickhouse protocol, so most likely you should use port 9000 and make sure the server is available on localhost or use the correct host to connect. You can set the connection type to HTTP.

How do I switch to HTTP?

markusf1895 avatar Feb 10 '24 02:02 markusf1895

Use Sqlite as the Connection Type in the Airflow UI and add the following to the Extra field: {"secure":false,"verify":false}
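The Extra field holds a JSON string, so the value can also be generated rather than typed by hand. A minimal sketch follows; it assumes the two keys are passed through to clickhouse-driver as connection options, which is not confirmed in this thread.

```python
import json

# Options from the comment above; False serializes to JSON false.
extra = {"secure": False, "verify": False}

# Compact separators reproduce the exact string quoted in the comment.
extra_json = json.dumps(extra, separators=(",", ":"))
print(extra_json)
```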

stilyng94 avatar Mar 29 '24 12:03 stilyng94

How do I switch to HTTP?

Hi @markusf1895, this plugin does not support the HTTP interface because the underlying clickhouse-driver does not support it.

bryzgaloff avatar Mar 29 '24 14:03 bryzgaloff