metabase-clickhouse-driver
metabase-clickhouse-driver copied to clipboard
JSON unfolding / nested objects
Describe the bug
Nested JSON object are not rendered / queryable
Steps to reproduce
- Load a Clickhouse table containing nested objects
- See the table / try to query it
Expected behaviour
Data able to be displayed
Error log
Configuration
Environment
- metabase-clickhouse-driver version: 1.2.2
- Metabase version: 47.3
- OS: MacOS
ClickHouse server
- ClickHouse Server version: 23.9.1.1653
CREATE TABLEstatements for tables involved:
CREATE or REPLACE TABLE default."arm-dev.bet-updates.json"
(
id Int32,
minStakeLimit Decimal(11, 2),
maxStakeLimit Decimal(11, 2),
numberCombinations Int32,
legs Nested
(
id Int32,
legNumber Int32,
legType String,
parts Nested
(
id Int32,
),
errors Array(String)
),
errors Array(String),
creationTime Float64,
customer Tuple
(
id Int32,
tenantId Int32,
sportsbookRef String,
stakeFactor Decimal(11, 2),
testing Nullable(UInt8)
)
)
- Sample data for all these tables, use [clickhouse-obfuscator]
select id, minStakeLimit, maxStakeLimit, numberCombinations, legs, errors, creationTime, customer from "arm-dev.bet-updates.json" order by id, creationTime;
SELECT
id,
minStakeLimit,
maxStakeLimit,
numberCombinations,
legs,
errors,
creationTime,
customer
FROM `arm-dev.bet-updates.json`
ORDER BY
id ASC,
creationTime ASC
Query id: 70a0bb6a-f5f7-4922-9376-2420898ebcaf
┌──id─┬─minStakeLimit─┬─maxStakeLimit─┬─numberCombinations─┬─legs──────────────────────────────────────────────────────────────────────┬─errors─┬───────creationTime─┬─customer────────────┐
│ 252 │ 0.1 │ 1000 │ 1 │ [(252,1,'WIN',[(430,1,'WIN','SP',1436808,241272,18935,143,5,1,true)],[])] │ [] │ 1695802093.3665981 │ (1,1,'one_gbp',1,0) │
│ 253 │ 0.1 │ 1000 │ 1 │ [(253,1,'WIN',[(431,1,'WIN','SP',1435945,241116,18922,268,5,1,true)],[])] │ [] │ 1695806272.1980581 │ (1,1,'one_gbp',1,0) │
│ 254 │ 0.1 │ 1000 │ 1 │ [(254,1,'WIN',[(432,1,'WIN','SP',1435947,241116,18922,268,5,1,true)],[])] │ [] │ 1695806570.9034572 │ (1,1,'one_gbp',1,0) │
│ 255 │ 0.1 │ 1000 │ 1 │ [(255,1,'WIN',[(433,1,'WIN','SP',1436808,241272,18935,143,5,1,true)],[])] │ [] │ 1695806939.310054 │ (1,1,'one_gbp',1,0) │
│ 256 │ 0.1 │ 1000 │ 1 │ [(256,1,'WIN',[(434,1,'WIN','SP',1435945,241116,18922,268,5,1,true)],[])] │ [] │ 1695808316.8749661 │ (1,1,'one_gbp',1,0) │
│ 257 │ 0.1 │ 1000 │ 1 │ [(257,1,'WIN',[(435,1,'WIN','SP',1435945,241116,18922,268,5,1,true)],[])] │ [] │ 1695813181.35838 │ (1,1,'one_gbp',1,0) │
See how it renders:
And how I can't query it:
This looks like a feature request. I see that clickhouse-jdbc now supports Nested.
We will check if it's possible to map it within Metabase properly, as the generated queries should be quite unusual (like that legs.id bit).
If Metabase cannot support nested typed structures, then theoretically, during the introspection phase, we can recursively flatten it for starters, e.g.
CREATE or REPLACE TABLE default."arm-dev.bet-updates.json"
(
id Int32,
legs Nested
(
id Int32,
legNumber Int32,
parts Nested
(
id Int32,
)
)
);
This could be introspected by Metabase driver as an equivalent of:
CREATE or REPLACE TABLE default."arm-dev.bet-updates.json"
(
`id` Int32,
`legs.id` Int32
`legs.legNumber` Int32,
`legs.parts.id` Int32
);
because something like
SELECT `legs.legNumber`, `legs.parts.id` FROM default."arm-dev.bet-updates.json";
is actually a valid query.
However, I need to verify that this approach flies with the actual generated queries from within Metabase itself, so no promises here yet.
Regarding the Metabase error: isn't it the same in clickhouse-client for the given query?
SELECT id
FROM default.`arm-dev.bet-updates.json`
WHERE legs.id = 42
0 rows in set. Elapsed: 0.002 sec.
Received exception from server (version 23.8.3):
Code: 43. DB::Exception: Received from localhost:9000. DB::Exception: Illegal types of arguments (Array(Int32), UInt8) of function equals: While processing legs.id = 42. (ILLEGAL_TYPE_OF_ARGUMENT)
Thanks @slvrtrn !
Yes you are very right, Clickhouse Client is also erroring!
The WHERE clause operations should assume you are actually working with an array when filtering by nested column.
See how Goal.ID is used in WHERE clause here:
SELECT
Goals.ID,
Goals.EventTime
FROM test.visits
WHERE CounterID = 101500 AND length(Goals.ID) < 5
LIMIT 10
Regarding the implementation, we also have the :nested-field-columns Metabase feature which is used for JSONs and similar stuff, but again, because of a tricky WHERE clause, I am not entirely sure whether it will fit.
I suppose that before any fix is implemented I can explode my JSON in Clickhouse, maybe via a view, right?
@atillier, yes, a view could work fine in this case.