ch icon indicating copy to clipboard operation
ch copied to clipboard

Ideas - Feel free to post ideas.

Open ruslandoga opened this issue 2 years ago • 14 comments

Feel free to post ideas.

That idea itself is stolen from https://github.com/containers/youki/issues/10

ruslandoga avatar May 01 '23 06:05 ruslandoga

Livebook smart cell :)

aerosol avatar May 02 '23 06:05 aerosol

Streaming the results of queries would be great. I think it requires setting the connection mode to active though.

hkrutzer avatar May 02 '23 13:05 hkrutzer

Streaming the results of queries would be great. I think it requires setting the connection mode to active though.

There is an undocumented Ch.stream function that is used in the benchmarks. It doesn't decode the raw data into rows very well yet (right now it only does the decoding if :types option is provided).

Here's a minimal example:

{:ok, pid} = Ch.start_link()

Ch.run(pid, fn conn ->
  conn
  |> Ch.stream("select number from system.numbers limit {limit:UInt64}", %{"limit" => 1_000_000}, types: ["UInt64"]) 
  |> Stream.each(fn rows -> IO.puts("at #{Time.utc_now()} got #{length(List.flatten(rows))} rows") end)
  |> Stream.run()
end)
at 13:24:49.737616 got 2560 rows
at 13:24:49.742150 got 36864 rows
at 13:24:49.752535 got 50657 rows
at 13:24:49.760068 got 51169 rows
at 13:24:49.765942 got 50688 rows
at 13:24:49.771635 got 51169 rows
at 13:24:49.776692 got 50657 rows
at 13:24:49.782096 got 51169 rows
at 13:24:49.787199 got 50657 rows
at 13:24:49.792197 got 51200 rows
at 13:24:49.797056 got 50657 rows
at 13:24:49.802442 got 51169 rows
at 13:24:49.807619 got 50657 rows
at 13:24:49.812214 got 51200 rows
at 13:24:49.816771 got 50657 rows
at 13:24:49.821738 got 51169 rows
at 13:24:49.826986 got 51169 rows
at 13:24:49.831561 got 50657 rows
at 13:24:49.836822 got 51200 rows
at 13:24:49.841795 got 50657 rows
at 13:24:49.846984 got 44018 rows

And some tests.

ruslandoga avatar May 02 '23 13:05 ruslandoga

That's great! Will it support not having to specify types in the future?

hkrutzer avatar May 15 '23 09:05 hkrutzer

@hkrutzer yeah, most likely. We'd "just" need to decode types from RowBinaryWithNamesAndTypes and store them in the accumulator.

ruslandoga avatar May 15 '23 09:05 ruslandoga

is there a way to hook into clickhouse's WATCH functionality?

jaronoff97 avatar May 30 '24 02:05 jaronoff97

I think that functionality is deprecated and will be removed: https://clickhouse.com/docs/en/sql-reference/statements/create/view#live-view-deprecated

hkrutzer avatar May 30 '24 06:05 hkrutzer

👋 @jaronoff97

I wasn't able to create a live view on clickhouse/clickhouse-server:24.3.3.102-alpine but either way it should be possible to consume it with Ch.stream like in https://github.com/plausible/ch/issues/82#issuecomment-1531460986

eb6fbe5ed437 :) CREATE LIVE VIEW lv WITH REFRESH 5 AS SELECT now();

Syntax error: failed at position 21 ('WITH'):

CREATE LIVE VIEW lv WITH REFRESH 5 AS SELECT now();

Expected one of: token, Dot, UUID, ON, TO, OpeningRoundBracket, sql security, DEFINER, SQL SECURITY, AS

I expect something like this to work, timeout might be tricky:

{:ok, pid} = Ch.start_link(pool_size: 1)

DBConnection.run(pid, fn conn ->
  Ch.stream(conn, "watch lv format JSONEachRowWithProgress") 
  |> Stream.each(&IO.inspect/1)
  |> Stream.run()
end, timeout: :infinity)

ruslandoga avatar May 30 '24 07:05 ruslandoga

Did you run set allow_experimental_live_view = 1 first?

hkrutzer avatar May 30 '24 07:05 hkrutzer

No, but it doesn't seem to have changed much. Somehow it's a syntax error.

eb6fbe5ed437 :) set allow_experimental_live_view = 1;

SET allow_experimental_live_view = 1

Query id: e40f3eb8-548e-4b3d-bc0e-8f92456f076c

Ok.

0 rows in set. Elapsed: 0.001 sec.

eb6fbe5ed437 :) CREATE LIVE VIEW lv WITH REFRESH 5 AS SELECT now();

Syntax error: failed at position 21 ('WITH'):

CREATE LIVE VIEW lv WITH REFRESH 5 AS SELECT now();

Expected one of: token, Dot, UUID, ON, TO, OpeningRoundBracket, sql security, DEFINER, SQL SECURITY, AS

eb6fbe5ed437 :)

ruslandoga avatar May 30 '24 08:05 ruslandoga

ah okay, thank you! :bow:

jaronoff97 avatar May 30 '24 14:05 jaronoff97

@ruslandoga I tried Ch.stream for select query, but I can't get the Ch.RowBinary.decode_rows to work reliably. It does work for simple queries, but it breaks for complex queries (or for any queries that return a lot of data back (eg SELECT number, randomFixedString(1024) FROM system.numbers_mt LIMIT {limit:UInt64}). I am assuming this is because Ch.Result{data: data} doesn't have any concept of alignment? and decode_rows only works if the data is aligned at row level?. I still can make it work with FORMAT CSV and doing my csv parse_stream, but just want to confirm if that is the only way to get this work reliably.

ananthakumaran avatar Dec 27 '24 11:12 ananthakumaran

👋 @ananthakumaran

I am assuming this is because Ch.Result{data: data} doesn't have any concept of alignment? and decode_rows only works if the data is aligned at row level?

Yes. Right now Ch.stream is used in Plausible only as a "raw" data pipe into a Zip file.

We would probably need to implement an incremental version of Ch.RowBinary.decode_rows for Ch.stream to work reliably.

ruslandoga avatar Dec 28 '24 10:12 ruslandoga