superset icon indicating copy to clipboard operation
superset copied to clipboard

Error when using "TIME COLUMN" along with "X-AXIS" for same value for Clickhouse

Open NikoJ opened this issue 1 year ago • 8 comments

When using Mixed Chart I encountered an Error when using TIME COLUMN along with X-AXIS for the same value when the source is Clickhouse.

The error looks like:

Error: :HTTPDriver for http://xx.xxx.x.84:8123 returned response code 500) Code: 215. DB::Exception: Column `dt` is not under aggregate function and not in GROUP BY. Have columns: ['count()','toStartOfDay(toDateTime(toStartOfDay(toDateTime(dt))))']: While processing toStartOfDay(toDateTime(dt)) AS dt, count() AS ...

I found a mistake in the query generation for Clickhouse:

SELECT toStartOfDay(toDateTime(`dt`)) AS `dt`,
   	COUNT(*) AS `count`
FROM
  (SELECT dt,
      	id,
      	message
   FROM default.test_table_v2) AS `virtual_table`
GROUP BY toStartOfDay(toDateTime(`dt`))
ORDER BY `count` DESC
LIMIT 10000;

The alias specified in the select SELECT toStartOfDay(toDateTime(dt)) AS dt is then substituted for the section with grouping GROUP BY toStartOfDay(toDateTime(toStartOfDay(toDateTime(dt)))).

How to reproduce the bug

  1. Create Database with Clickhouse Connect
  2. Create table
CREATE TABLE IF NOT EXISTS demo(
id UInt64,
message String,
dt Date
) ENGINE = MergeTree()
ORDER BY id;
  1. Insert test data
INSERT INTO demo SELECT abs(toInt8(rand(2))) as id, fuzzBits('Demo',0.01) as message, '2023-02-05' as dt FROM numbers(100);
INSERT INTO demo SELECT abs(toInt8(rand(2))) as id, fuzzBits('Demo',0.01) as message, '2023-02-12' as dt FROM numbers(100);
INSERT INTO demo SELECT abs(toInt8(rand(2))) as id, fuzzBits('Demo',0.01) as message, '2023-03-01' as dt FROM numbers(100);

  1. Create Dataset based on clickhouse data
  2. Create Chart based on Dataset from p. 5 with chart type `Mixed Chart
  • TIME COLUMN = dt
  • TIME GRAIN = Day
  • X-AXIS = dt
  • Query A METRICS = SUM(id)
  • Query B METRICS = SUM(id)

see Screenshots

Expected results

In my opinion there are two ways. Or generate a query with a different alias or substitute an alias in the grouping section.

SELECT toStartOfDay(toDateTime(`dt`)) AS `dt_`,
   	COUNT(*) AS `count`
FROM
  (SELECT dt,
      	id,
      	message
   FROM default.demo) AS `virtual_table`
GROUP BY toStartOfDay(toDateTime(`dt`))
ORDER BY `count` DESC
LIMIT 10000;
SELECT toStartOfDay(toDateTime(`dt`)) AS `dt`,
   	COUNT(*) AS `count`
FROM
  (SELECT dt,
      	id,
      	message
   FROM default.demo) AS `virtual_table`
GROUP BY `dt`
ORDER BY `count` DESC
LIMIT 10000;

Screenshots

i1

Environment

  • browser type and version: Firefox/110.0
  • superset version: 2.0.1
  • python version: 3.8.12
  • superset-helm-chart-0.8.5
  • clickhouse-connect==0.5.3

Checklist

  • [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • [x] I have reproduced the issue with at least the latest released version of superset.
  • [x] I have checked the issue tracker for the same issue and I haven't found one similar.

NikoJ avatar Mar 15 '23 13:03 NikoJ

This annoying bug happens all the time. I hope it will be fixed soon.

ktncktnc avatar Apr 20 '23 08:04 ktncktnc

I have it too 🙁

danielG1beat avatar Jun 07 '23 11:06 danielG1beat

alias after group by should not be the same with the raw column name.

eriendeng avatar Aug 03 '23 15:08 eriendeng

I know but such query generates Superset. I used a trick to solve this problem but it's inconvenient

NikoJ avatar Aug 03 '23 20:08 NikoJ

I have the same issue on version 2.1.0 when trying to use timestamp as dimension in Aggregate Table.

I used a trick to solve this problem but it's inconvenient

@NikoJ can you share your workaround?

nikolastojkoski avatar Aug 23 '23 04:08 nikolastojkoski

Unfortunately still here in 3.0.0.

image

Rydberg95 avatar Sep 21 '23 14:09 Rydberg95

A workaround I found is actually to edit the Date column that is related to the error to be defined as a "Custom SQL" function, like this:

image

Make sure that the column has the function icon after it's edited (for me i had to input a blank space and remove it before saving in above step. When it is a function, it looks like below, with the f(x) icon at the start.

image

This seems to mitigate the error, and so far is not producing any side effects noticed by me. Obviously not a solution worth a PR, but could be an interim solution.

EDIT: This didn't work for another case, when I tried using a date column as the x-axis of a line chart. I fiddled around some more, and giving the column an alias withing Superset (see below) seems to be another way similar to above interim solution. image

Rydberg95 avatar Sep 21 '23 14:09 Rydberg95

Still present in 3.0.1. The workaround above with setting a custom expression and alias works. Similar situation where we're trying to use a DateTime64 column as X axis and it's grouping by day, with inconsistent use of aliases in the generated query.

chriscomeau79 avatar Oct 24 '23 20:10 chriscomeau79

Can be fixed on ClickHouse side, with override default value for property prefer_column_name_to_alias

via user setting for default profile

<clickhouse>
<profiles>
<default>
<prefer_column_name_to_alias>1</prefer_column_name_to_alias>
....
</default>
...

kovitals avatar Apr 10 '24 12:04 kovitals

Thanks @kovitals - if anyone else on this thread finds this to be a sufficient answer (or even a sufficient workaround), this would be a great documentation update on the Clickhouse docs page. A PR from anyone would be helpful on that point to close this out.

rusackas avatar Apr 10 '24 14:04 rusackas

@rusackas Thanks! When I have free time, I will check it and then close this issue as solved if everything is ok.

NikoJ avatar Apr 15 '24 10:04 NikoJ

Can be fixed on ClickHouse side, with override default value for property prefer_column_name_to_alias

via user setting for default profile

<clickhouse>
<profiles>
<default>
<prefer_column_name_to_alias>1</prefer_column_name_to_alias>
....
</default>
...

Thanks, seems to work great!

As an alternative, if someone else wants to add the setting via SQL-driven workflow, the below command sets up a new settings profile and assigns it to a user:

CREATE SETTINGS PROFILE superset SETTINGS prefer_column_name_to_alias = 1 TO *your_clickhouse_username*

Rydberg95 avatar Apr 15 '24 14:04 Rydberg95