metabase-clickhouse-driver icon indicating copy to clipboard operation
metabase-clickhouse-driver copied to clipboard

Enable `:nested-field-columns` feature in the driver

Open frankyso opened this issue 1 year ago • 7 comments

Enhance ClickHouse Driver to Support JSON Parsing

Introduction

  • Brief Description: Proposal to add JSON parsing support to the ClickHouse driver in Metabase.
  • Motivation: ClickHouse has robust JSON functions that are currently underutilized in Metabase, limiting the usability for datasets that contain JSON fields.

Feature Description

  • Detailed Explanation: Integrate JSON parsing functions within the ClickHouse SQL driver in Metabase to allow users to directly query JSON-type fields and extract data in a more flexible and powerful manner.
  • Examples:
    • SELECT JSONExtractRaw(data, '$.key') FROM table_name WHERE JSONHas(data, '$.key');
    • This feature would allow for extracting elements directly from JSON columns in queries.

Benefits

  • Use Cases: Users dealing with JSON data stored in ClickHouse will be able to perform more complex queries directly from Metabase, improving analytics capabilities.
  • Potential Impact: Enhances Metabase's utility for users who work with modern data types, such as JSON, especially in big data and NoSQL contexts.

Additional Context

  • Related Issues or Discussions: None at this time.
  • Implementation Suggestions: Consider leveraging existing ClickHouse functions like JSONExtractRaw, JSONHas, etc., within the Metabase ClickHouse driver.

frankyso avatar Jun 25 '24 09:06 frankyso

From your description, it looks like we need to enable the :nested-field-columns feature in the driver and implement the required methods.

https://github.com/metabase/metabase/blob/v0.50.7/src/metabase/driver.clj#L482-L483

;; Does this database support nested fields but only for certain field types (e.g. Postgres and JSON / JSONB columns)?
:nested-field-columns

This is probably also related: https://www.metabase.com/docs/latest/data-modeling/json-unfolding

Is it correct?

slvrtrn avatar Jun 27 '24 13:06 slvrtrn

Yes, that's correct. Enabling the :nested-field-columns feature for the ClickHouse driver seems to be the appropriate approach for supporting JSON parsing. This feature, as described in the Metabase documentation, matches our requirements for handling JSON fields effectively

frankyso avatar Jun 28 '24 03:06 frankyso

@frankyso, I tried a few things with the driver code, and while ClickHouse works well with JSON stored in String columns, it's tricky to make the driver recognize this as a "nested" type (so the JSON features will be enabled for this particular column) because String is already matched to a "text" MB base type. Hacks like using a workaround based on a type (something like Variant(String)) don't look like a good option to me, and the old JSON ClickHouse type is now obsolete.

However, luckily, the new semi-structured data type (to be used in exactly these scenarios) is planned to be available in 24.7 (see https://github.com/ClickHouse/ClickHouse/issues/54864#issuecomment-2189803374). Using this new type will allow the implementation of the nested-field-columns feature to be much cleaner.

So, when 24.7 (or maybe even the head version containing the required changeset) is out, I will start working on this in the driver code.

slvrtrn avatar Jul 01 '24 10:07 slvrtrn

Hello @slvrtrn !

The new JSON field was released a couple of versions ago (behind a feature flag) but its usage with Metabase is not very convenient :/

Selecting any property of the JSON field without casting it to a proper type fails with:

java.lang.IllegalArgumentException: Unknown data type: Dynamic

Would the implementation of nested-field-columns improve this or are we forced to explicitly type all of the JSON columns we use?

Thanks!

cptjacky avatar Oct 01 '24 15:10 cptjacky

@cptjacky, I think we will need to wait until https://github.com/ClickHouse/clickhouse-java/issues/1833 is resolved.

slvrtrn avatar Oct 01 '24 17:10 slvrtrn

You're right! I'll subscribe to that issue too, thank you!

cptjacky avatar Oct 02 '24 08:10 cptjacky

Hi! @slvrtrn Looks like the issue https://github.com/ClickHouse/clickhouse-java/issues/1833 is solved. Do you have any plans to work on this feature? We use CH + Metabase and working with native json would help our guys to boost productivity significantly.

VatsekVlad avatar Feb 17 '25 08:02 VatsekVlad