redash
redash copied to clipboard
Select all queries throwing errors.
Looks like all my queries work but just noticed that "select * from any_table" results in the following error:
Error running query: 'utf8' codec can't decode bytes in position 0-1: invalid continuation byte
Not preventing me from using the product but I like to do select * queries on some tables while I build my queries.
I don't think it's an issue with SELECT *
, but with the data it returns. What data source type are you using?
MySQL innodb. Thanks for being so prompt! :)
Not aware of specific issues with utf8 in MySQL, but can you check your server logs for a stacktrace to see where this error comes from?
Ok...Just running some test queries. Its pretty clear that it's the hash columns that I have stored in my bigger tables. It appears it doesn't like all those crazy characters.
Btw, Redash doesn't apply any limits on the queries. So if you have huge tables, it will try to return all the rows.
yeah, these aren't big tables (yet) and I wouldn't normally run these queries....I was just trying to get a look at all the columns when I found the error.
👍
I'm not too worried about this but any thoughts on a fix in the near or far future?
Need to see the log to understand where this comes from.
which log?
How did you setup Redash?
on AWS using one of you AMI's.
Cool, then the log is at: /opt/redash/logs/celery_error.log.
I can reproduce this when a column has BINARY type in MySQL. (We store uuid4 string as binary to reduce data size)
We have a MySQL function to convert it to normal uuid4, so we don't have a problem though.
Also reproducible on SQL Server Geography types.
Looks like it's related to how pymssql handles binary/SQLGeography types in SQL Server.
A Workaround is to convert binary to something else, or convert SQLGeography column to text by calling column.STAsText()
https://docs.microsoft.com/en-us/sql/t-sql/spatial-geography/stastext-geography-data-type?view=sql-server-2017
Wondering if redash can do that manually when detecting binary column and show it in HEX, for example.
I was able to monkeypatch this. https://github.com/getredash/redash/compare/master...dodopizza:fix-mysql-binary-columns
We store UUIDs in MySQL as BINARY(16) columns. They are everywhere in all our databases.
When we query select * from orders
in redash, it fails with UnicodeDecodeError: 'utf8' codec can't decode byte 0xb0 in position 4: invalid start byte
.
It means that json serialization fails because it tries to encode mysql result as utf8 string, which it is not, just UUID bytes.
This is happening because MySQLdb treats BINARY as bytes, which is just a synonym of str in Python2. I think in Python 3 it wouldn't be a problem, because it wouldn't be unicode
The problematic line is https://github.com/PyMySQL/mysqlclient-python/blob/17045e8ed04579ec2d80e9daa9c7b69fbfcd5c2f/MySQLdb/converters.py#L88
_bytes_or_str = ((FLAG.BINARY, bytes), (None, unicode))
I monkeypatched this line to
_bytes_or_str = ((FLAG.BINARY, buffer), (None, unicode))
to resemble the PostgreSQL behavior: http://initd.org/psycopg/docs/usage.html#adaptation-of-python-values-to-sql-types
Psycopg treats bytea(which is BINARY) as buffer. Redash already has hexlify for buffer type: https://github.com/getredash/redash/blob/ef9a4d5eed82605ad672b4717c75c87c4b8609fa/redash/utils/init.py#L104-L105
With this monkey patching select * from orders
works and it shows our UUIDs as hexes.
So, what do you think? Should we report the problem with buffer in python2 to MySQLDb? Or just monkeypatch it in redash?
Thank you for looking into this, @spacentropy.
The monkey patching solution looks reasonable. But I wonder if we can do this on our end, by checking either the Cursor's description
or description_flags
properties to check if some column is a binary?
Regardless it's worth opening an issue on the mysql-client repo to get their feedback as well (but move forward with our own solution).
@arikfr BINARY columns now work fine in MySQL as of 9.0.0-beta. This must be result of porting to Python3 and having proper unicode/bytes types.
Thanks!
Similar issues still happens - I installed redash on a local VM running Ubuntu 18. Installation with the model script. Select * FROM ... a 2014 SQL server results to similar error. "Error running query: 'utf8' codec can't decode bytes in position 4-5: invalid continuation byte"
I have not monkeypatched, yet.