metabase-clickhouse-driver icon indicating copy to clipboard operation
metabase-clickhouse-driver copied to clipboard

can you support foreign key?

Open captainbob opened this issue 6 years ago • 13 comments

why do you not support foreign key?

captainbob avatar Apr 22 '19 03:04 captainbob

ClickHouse does not have any foreign key concept, so the driver would have to implement some kind of "fake" support. Currently, I want to focus on making all integration tests pass before trying to experiment with fake foreign keys.

What is your actual use case?

enqueue avatar Apr 23 '19 07:04 enqueue

ClickHouse does not have any foreign key concept, so the driver would have to implement some kind of "fake" support. Currently, I want to focus on making all integration tests pass before trying to experiment with fake foreign keys.

What is your actual use case?

My data model is a star model, it has Fact Table and Dimension Table . so when searching , i need join several tables to search.

captainbob avatar Apr 26 '19 03:04 captainbob

As you know, ClickHouse DBMS does not have any explicit foreign keys, so the driver cannot help MetaBase discover the relationships between your fact table and the dimension tables.

Have you tried modelling your schema's relations using the MetaBase admin overview? If this does not work for you, what kind of errors do you encounter? See the following example: Messages exchanged with airlines identified by AWB prefix.

Airlines (dimension table):

metabase_ch_primary_key

Messages (fact table):

metabase_ch_foreign_key

As a result you should be able to navigate from a question to the related dimension entry.

ClickHouse works very well with flat, large tables. So you might also consider creating a de-normalized representation of the star schema, e.g. via a (materialized) view or via a custom ETL script.

I hope that helps. Looking forward to hearing about your experiences.

enqueue avatar Apr 26 '19 05:04 enqueue

@captainbob I am closing this issue for now, please see my explanations above.

enqueue avatar Jun 13 '19 17:06 enqueue

I am going to give it another try. ClickHouse has better support for Joins, and it would be really good if we could expose it.

enqueue avatar Aug 22 '19 09:08 enqueue

With the new JOIN feature from the Metabase 0.33.0 version, this would be awesome.

StephenSorriaux avatar Aug 23 '19 09:08 StephenSorriaux

Can you please try with the 0.5-fks release? I enabled the foreign key feature and have started to add some "fake" foreign key support for tests.

enqueue avatar Aug 23 '19 21:08 enqueue

Sure, thanks for the release!

StephenSorriaux avatar Aug 26 '19 07:08 StephenSorriaux

I just tested your 0.5-fks release which performed pretty well except that:

  • When querying a DateTime field from the joined table, ClickHouse raised a exception with code 40: ClickHouse exception, code: 47, host: 1.2.3.4., port: 8123; Code: 47, e.displayText() = DB::Exception: Received from 4.5.6.7:9000. DB::Exception: Unknown identifier: user_creation_date. (version 19.8.3.8 (official build)) The query generated was :
SELECT (toMonday((toDate(`user__via__user_uuid`.`user_creation_date`) + 1)) - 1) AS `user_creation_date`,
count(*) AS `count`
FROM `user`.`subscription`
LEFT JOIN `user`.`user` `user__via__user_uuid` ON `user`.`subscription`.`user_uuid` = `user__via__user_uuid`.`user_uuid`
GROUP BY (toMonday((toDate(`user__via__user_uuid`.`user_creation_date`) + 1)) - 1)
ORDER BY (toMonday((toDate(`user__via__user_uuid`.`user_creation_date`) + 1)) - 1) ASC,

Changing the alias user_creation_date to anything else solves the problem

  • It breaks another query used for a region map. Compared with the one generated with your 0.5 version driver, it seems also to be related to alias. I suspect this might be related to the deleted lines in https://github.com/enqueue/metabase-clickhouse-driver/commit/27300084c7b95d3aa4da5133a163b5500d90b5de#diff-f5f7d61903bb99f2123db15458387b76L250

Please note I am currently using a ClickHouse server in version 19.8.3.8 which is not the latest one.

StephenSorriaux avatar Sep 05 '19 15:09 StephenSorriaux

@StephenSorriaux thank you very much for testing and reporting your observations :+1: I hope I will find some time to work on this issue shortly.

enqueue avatar Sep 09 '19 05:09 enqueue

@StephenSorriaux I tried to replicate your example here, and it works OK. Perhaps ClickHouse has become better at dealing with column name versus alias conflicts (the reason we used to have the _mb_alias suffix).

Here is my query (subscription = airline_messages, user = Airline):

SELECT
  (modulo(toDayOfWeek(toDateTime(`Airline`.`change_date`)), 7) + 1) AS `change_date`,
  count() AS `count`
FROM
  `myclickhousedb`.`airline_messages`
LEFT JOIN
  `myclickhousedb`.`reference_airlines` `Airline`
ON `myclickhousedb`.`airline_messages`.`comm_partner` = `Airline`.`awb_prefix`
WHERE
(
  `Airline`.`airline_name` IS NOT NULL
  AND notEmpty(`Airline`.`airline_name`) = 1
)
GROUP BY
  (modulo(toDayOfWeek(toDateTime(`Airline`.`change_date`)), 7) + 1)
ORDER BY
  (modulo(toDayOfWeek(toDateTime(`Airline`.`change_date`)), 7) + 1) ASC;

ClickHouse version is 19.11.2.7.

enqueue avatar Oct 11 '19 07:10 enqueue

@enqueue Yes, I think you may be right, I remember having seen some issues related to this that were fixed now. I am planning to update all my ClickHouse instances very soon and give it another try. I will let you know.

StephenSorriaux avatar Oct 11 '19 07:10 StephenSorriaux

@StephenSorriaux what have been your experiences so far after the update?

enqueue avatar Feb 08 '20 20:02 enqueue

I hope that it works OK for now

enqueue avatar Aug 24 '22 14:08 enqueue