clickhouse-connect icon indicating copy to clipboard operation
clickhouse-connect copied to clipboard

Return column structure in case of empty resultset

Open zerafachris opened this issue 2 years ago • 3 comments

Describe the bug

Return column structure in case of empty resultset

Steps to reproduce

  1. Use the following query "SELECT 1 as 'column1' LIMIT 0"

Expected behaviour

Have a new setting, which will change behaviour in case of an empty result set being returned.

In case of pandas, driver.query_df(ch_query) should return pd.DataFrame({'column1' : [] })

This was resolved in the previous driver here https://github.com/ClickHouse/ClickHouse/issues/33522

zerafachris avatar Oct 23 '23 09:10 zerafachris

There is no easy way to do this using the HTTP interface, as ClickHouse doesn't return the column name or types in Native Format when the result set is empty. The driver would have to use a different format (such as JSONEachRow) for the query in order to be able to build the column structure. Note that the native interface used in clickhouse-driver works differently and does return an "empty block" with the column structure even for empty queries, so the implementation is more straightforward in that case.

If you know ahead of time that the query result will be empty (for example, with your LIMIT 0 example), maybe there could be something like a "structure only" query that would use the JSONEachRow format over HTTP.

genzgd avatar Oct 23 '23 11:10 genzgd

Hi @genzgd , Thanks for the reply. Using query_arrow is am able to return the columns, whilst using 'query_df' the column set is empty. Maybe we can use a partial implementation from pyarrow to get the column lists?

I have tagged the part of code which is returning the empty dataframe and would require an update here https://github.com/ClickHouse/clickhouse-connect/pull/119

zerafachris avatar Oct 23 '23 11:10 zerafachris

Eventually it may make sense to switch the Pandas code to use Arrow instead of the ClickHouse Native format, in particular for Pandas 2.x, which is now backed by Arrow, but that's a pretty significant change. In the meantime I've opened a feature request https://github.com/ClickHouse/ClickHouse/issues/55939 that would make this possible without major code changes, but I don't know if and when the ClickHouse team will get to it.

genzgd avatar Oct 23 '23 12:10 genzgd