connector-x
connector-x copied to clipboard
Clickhouse fetching boolean type support error
What language are you using?
Python.
What version are you using?
3.9
What database are you using?
Clickhouse
What dataframe are you using?
Pandas
Can you describe your bug?
I'm trying to access the clickhouse db, but I can't connect to it. As per the API doc on the read_sql() function I use SQLAlchemy connection string
SQLALCHEMY_CLICKHOUSE_URL = 'clickhouse://default:default@localhost:8123/default' but it doesn't connect. I changed the clickhouse part to mysql, but it doesn't work either. I manually added ports 8123 to the string, but no luck. I add protocol="text", but still nothing.
To test the connection in general, I used a postgres connection string SQLALCHEMY_POSTGRES_URL = "postgresql://postgres:postgres@localhost:5432/public" and it worked.
I also tried to change the localhost for an explicitly stated IP, but to no avail.
If possible, please include a minimal simple example including:
Database setup if the error only happens on specific data or data type
The IDE on Windows 10 and clickhouse is installed on the same Windows machine but on the WSL2 part
Example query / code
SQLALCHEMY_CLICKHOUSE_URL = 'mysql://default:default@localhost/default'
query = """SELECT COUNT (DISTINCT path) FROM staging"""
df = cx.read_sql(SQLALCHEMY_CLICKHOUSE_URL, query)
What is the error?
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
c:\Users\playground\sql_playground.ipynb Cell 58' in <cell line: 1>()
----> [1](vscode-notebook-cell:/c%3A/Users/playground/sql_playground.ipynb#ch0000056?line=0) df = cx.read_sql(SQLALCHEMY_MYSQL_URL, query)
File c:\Users\venv\lib\site-packages\connectorx\__init__.py:224, in read_sql(conn, query, return_type, protocol, partition_on, partition_range, partition_num, index_col)
221 except ModuleNotFoundError:
222 raise ValueError("You need to install pandas first")
--> 224 result = _read_sql(
225 conn,
226 "pandas",
227 queries=queries,
228 protocol=protocol,
229 partition_query=partition_query,
230 )
231 df = reconstruct_pandas(result)
233 if index_col is not None:
RuntimeError: timed out waiting for connection
I do have pandas installed.
Thanks
Hi @kuatroka , which version of connectorx are you using? If it is not the newest version (0.3.1-alpha.1), can you upgrade and see whether it works?
Sure, I'll update and try it again. Which syntax for the connection do you suggest though? Referencing clickhouse or mysql?
Hi @kuatroka , cx.read_sql("clickhouse://...", query) is the same with cx.read_sql("mysql://...", query, protocol="text"). Both of them should work.
The same error...
If I specify the port SQLALCHEMY_CLICKHOUSE_URL = 'clickhouse://default:[email protected]:8123/default'
I get
RuntimeError: timed out waiting for connection
If I don't specify the port SQLALCHEMY_CLICKHOUSE_URL = 'clickhouse://default:default@localhost/default'
I get
RuntimeError: timed out waiting for connection: DriverError { Could not connect to address 'localhost:3306': No connection could be made because the target machine actively refused it. (os error 10061) }
Hi @kuatroka , it seems like the error indicates that the database cannot be reached (at least through the mysql protocol in connectorx). Can you try to use a mysql client tool to test whether it can be connected?
Hi, I can't connect through the MySQL Workbench . I get this error message. (I have also tried with the port 9000 and no luck)

but I have no issues using DBeaver tool when connecting to it or Arctype
Maybe I need to somehow specify that I need to use the JDBC way as it seems to be working perfectly.

Thanks
Hi @kuatroka , we only support connecting to clickhouse with mysql protocol. I think it may be enabled by clickhouse through setting the mysql_port like mentioned here.
Hi @wangxiaoying Yes, my installation has this port enabled too. I used the 9004 to connect with the standard python module import mysql.connector but the connectorx still gives me errors. The error messages changed though.
For the standard default user and when using the port 9004 I now get this
RuntimeError: timed out waiting for connection: DriverError { Unknown authentication protocol: `sha256_password` }
I have created a new user pep in clickhouse and gives me this error
RuntimeError: MySqlError { ERROR 48 (00000): Code: 48. DB::Exception: Command [ERRFMT] is not implemented. (NOT_IMPLEMENTED) (version 22.7.1.84 (official build)) }
It's obvious that with the new user I get the connection as it reaches the server version 22.7.1.84, but something else breaks
Thanks
Hi @kuatroka , Looks like the second error is similar to this. have you set the protocl=text or use clickhouse:// instead of mysql:// when getting this error: RuntimeError: MySqlError { ERROR 48 (00000): Code: 48. DB::Exception: Command [ERRFMT] is not implemented. (NOT_IMPLEMENTED) (version 22.7.1.84 (official build)) } ?
Hi @wangxiaoying. Here go various error and the corresponding settings
1.
connection_string = 'clickhouse://pep:pep@localhost:9004/default'
Error
PanicException: Could not retrieve core::option::Option<u8> from Value
2.
connection_string = 'mysql://pep:pep@localhost:9004/default' and no protocol='text'
Error
RuntimeError: MySqlError { ERROR 48 (00000): Code: 48. DB::Exception: Command [ERRFMT] is not implemented. (NOT_IMPLEMENTED) (version 22.7.1.84 (official build)) }
3.
connection_string = 'mysql://pep:pep@localhost:9004/default' plus protocol='text'
Error
PanicException: Could not retrieve core::option::Option<u8> from Value
Hi @kuatroka , it looks like 1 and 3 can successfully connect to the database, but it throws error when retrieving the data Could not retrieve core::option::Option<u8> from Value. Can you narrow down which type of data that cause the failure?
Hi @wangxiaoying. Thanks for the idea. I went through the 22 columns of the table and it's a boolean column that gives me the error below
PanicException: Could not retrieve core::option::Option<u8> from Value
All the other columns are ok with no errors.
It's definitely a problem with Boolean values. I cast the Boolean column into Integer and connector-x brought the data in with no errors
The modified query for that one columns is this
"""SELECT column1::UInt16 FROM table LIMIT 5"""