Cannot run `tumble(ext_stream, 1d)` on external stream, DB::Exception: This input format is only suitable for streams with a single column of type String but the number of columns is 3. (BAD_ARGUMENTS)
Describe what's wrong
Use case: I want to count the number of messages per day in a Kafka topic. I created an external stream for the topic, then tried to run a tumble window over it with time rewind:
SELECT window_start, count() FROM tumble(ext_stream,1d) where _tp_time>earliest_ts() group by window_start
Got error
Received exception from server (version 1.5.4): Code: 36. DB::Exception: Received from localhost:8463. DB::Exception: This input format is only suitable for streams with a single column of type String but the number of columns is 3. (BAD_ARGUMENTS)
Passing the virtual column explicitly to the tumble window gives the same error
SELECT window_start, count() FROM tumble(ext_stream,_tp_time,1d) where _tp_time>earliest_ts() group by window_start
One workaround is to create a subquery
with cte as (select _tp_time,raw from ext_stream settings seek_to='earliest')
SELECT window_start, count() FROM tumble(cte,1d) group by window_start
But this is counterintuitive.
Issues with the error message:
- What is "This input format"? Why would applying the `tumble` function create an input format?
- "a single column of type String" should say `string`, not `String`.
- "only suitable for streams with a single column of type String but the number of columns is 3": I think the message swapped the objects. It could be the opposite: "This input format is only suitable for streams with the number of columns is 3 but we got a single column of type String"
This can be reproduced in the latest Proton, 1.5.4.
@zliang-min could you help take a look? Thanks
@jovezhong could you run `show create` to share the stream definition?
I shared the DDL on our slack, since it contains sensitive info, such as password. Any local Kafka should work too
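For anyone trying this without the original DDL, a minimal sketch of a local setup might look like the following. The broker address and topic name are placeholders chosen for illustration, not the values from the real definition, and the single `raw string` column is an assumption based on the `raw` column used in the workaround query above.

```sql
-- Hypothetical external stream on a local Kafka broker.
-- 'localhost:9092' and 'test_topic' are placeholders; adjust to your setup.
CREATE EXTERNAL STREAM ext_stream(raw string)
SETTINGS type='kafka', brokers='localhost:9092', topic='test_topic';

-- The failing query from this report, run against that stream.
SELECT window_start, count()
FROM tumble(ext_stream, 1d)
WHERE _tp_time > earliest_ts()
GROUP BY window_start;
```

If the error does not show up with a definition like this, the difference from the real DDL (for example authentication settings or data format) may matter.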
(Jove Github Bot) assuming it is not done, deferred this ticket to the next sprint.