Bug with latest `mysql-connector-python` (version 8.0.30)

Open MattDelac opened this issue 1 year ago • 3 comments

I was having issues with the MySQL connector on the latest version.

I had to manually downgrade it to 8.0.29.

It might be better to add lower/upper version bounds for it in data-diff's pyproject.toml.

MattDelac, Jul 29 '22 12:07

I can't reproduce your error. Version 8.0.30 works for me on both Windows and Linux.

Can you please provide more details on this error? How are you using data-diff, and on which platform?

erezsh, Aug 02 '22 12:08

We encountered a similar issue while using data-diff 0.2.3 to connect to MySQL 5.7.32. data-diff also ran into a second, different error while handling the first one. (We should upgrade to 0.2.4, but the original error would persist.)

Traceback (most recent call last):
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/data_diff/databases/mysql.py", line 43, in create_connection
[INFO][2022-08-10 20:09:56 +0000]	return mysql.connect(charset="utf8", use_unicode=True, **self._args)
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/mysql/connector/pooling.py", line 286, in connect
[INFO][2022-08-10 20:09:56 +0000]	return CMySQLConnection(*args, **kwargs)
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/mysql/connector/connection_cext.py", line 101, in __init__
[INFO][2022-08-10 20:09:56 +0000]	self.connect(**kwargs)
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/mysql/connector/abstracts.py", line 1099, in connect
[INFO][2022-08-10 20:09:56 +0000]	self._post_connection()
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/mysql/connector/abstracts.py", line 1071, in _post_connection
[INFO][2022-08-10 20:09:56 +0000]	self.set_charset_collation(self._charset_id)
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/mysql/connector/abstracts.py", line 1016, in set_charset_collation
[INFO][2022-08-10 20:09:56 +0000]	) = CharacterSet.get_charset_info(charset)
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/mysql/connector/constants.py", line 775, in get_charset_info
[INFO][2022-08-10 20:09:56 +0000]	info = cls.get_default_collation(charset)
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/mysql/connector/constants.py", line 746, in get_default_collation
[INFO][2022-08-10 20:09:56 +0000]	raise ProgrammingError(f"Character set '{charset}' unsupported")
[INFO][2022-08-10 20:09:56 +0000]	mysql.connector.errors.ProgrammingError: Character set '255' unsupported
[INFO][2022-08-10 20:09:56 +0000]
[INFO][2022-08-10 20:09:56 +0000]	During handling of the above exception, another exception occurred:
[INFO][2022-08-10 20:09:56 +0000]
[INFO][2022-08-10 20:09:56 +0000]	Traceback (most recent call last):
[INFO][2022-08-10 20:09:56 +0000]	File "/app/run_diff.py", line 40, in <module>
[INFO][2022-08-10 20:09:56 +0000]	c = target.create_connection()
[INFO][2022-08-10 20:09:56 +0000]	File "/usr/local/lib/python3.10/site-packages/data_diff/databases/mysql.py", line 50, in create_connection
[INFO][2022-08-10 20:09:56 +0000]	raise ConnectError(*e._args) from e
[INFO][2022-08-10 20:09:56 +0000]	AttributeError: 'ProgrammingError' object has no attribute '_args'. Did you mean: 'args'?
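The second exception in the traceback above comes from reading `e._args`, while Python exceptions expose their constructor arguments as `e.args` (as the interpreter's hint suggests). Here is a minimal sketch of that wrapping pattern. Both exception classes and `connect()` are hypothetical stand-ins so the snippet runs without a database; they are not the actual data-diff or mysql-connector code.

```python
class ProgrammingError(Exception):
    """Hypothetical stand-in for mysql.connector.errors.ProgrammingError."""


class ConnectError(Exception):
    """Hypothetical stand-in for data-diff's ConnectError."""


def connect():
    # Simulates the driver failing during charset setup.
    raise ProgrammingError("Character set '255' unsupported")


def create_connection():
    try:
        return connect()
    except ProgrammingError as e:
        # e.args is the standard tuple of constructor arguments on every
        # exception; e._args does not exist and raises AttributeError.
        raise ConnectError(*e.args) from e
```

With `e.args`, the caller sees a `ConnectError` that carries the original message and keeps the driver error as its `__cause__`, so the charset problem is no longer masked by the `AttributeError`.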

Preliminary investigation suggests that this is an upstream problem in mysql-connector, which lists utf8mb4_0900_ai_ci as supported only on MySQL 8, while it is supported in MySQL 5.7 as well. It stems from their charset-handling changes in the 8.0.30 release. ref

pawandubey, Aug 10 '22 20:08

So you're saying mysql-connector-python v8.0.30 introduces a bug? Did you open an issue there?

We can limit the version on our side, but I want to first make sure it's the best way forward.

erezsh, Aug 10 '22 21:08

Closed due to inactivity.

erezsh, Sep 20 '22 11:09

@erezsh Sorry for resurrecting this, but I'm hitting the same wall here.

I'm using MySQL 5.7.23 (AWS RDS) and data-diff 0.3.0rc1, and when I run this:

from data_diff import connect_to_table

user = "xxxxxxxx"
password = "xxxxxxxxxxxxxxxxx"
database = "xxxxxxxxxxxxxxxx" 
hostname = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.rds.amazonaws.com"
db_info = f"mysql://{user}:{password}@{hostname}/{database}"

table = connect_to_table(
    db_info,
    table_name="mytable", 
)

print(table.count())

I get something like:

mysql://xxxxx:[email protected]/xxxxxxxxxxxx
INFO:database:[MySQL] Starting a threadpool, size=1.
DEBUG:google.auth._default:Checking None for explicit credentials as part of auth process...
DEBUG:google.auth._default:Checking Cloud SDK credentials as part of auth process...
DEBUG:database:Running SQL (MySQL): SELECT count(*) FROM `mytable`
CRITICAL:concurrent.futures:Exception in initializer:
Traceback (most recent call last):
  File "/home/xxxxxxxx", line 54, in create_connection
    return mysql.connect(charset="utf8", use_unicode=True, **self._args)
  File "/home/xxxxxxxx/.pyenv/versions/data-diff-3.10.6/lib/python3.10/site-packages/mysql/connector/pooling.py", line 286, in connect
    return CMySQLConnection(*args, **kwargs)
  File "/home/xxxxxxxxxx/.pyenv/versions/data-diff-3.10.6/lib/python3.10/site-packages/mysql/connector/connection_cext.py", line 101, in __init__
    self.connect(**kwargs)
  File "/home/xxxxxxxxxx/.pyenv/versions/data-diff-3.10.6/lib/python3.10/site-packages/mysql/connector/abstracts.py", line 1112, in connect
    self._post_connection()
  File "/home/xxxxxxxxxx/.pyenv/versions/data-diff-3.10.6/lib/python3.10/site-packages/mysql/connector/abstracts.py", line 1084, in _post_connection
    self.set_charset_collation(self._charset_id)
  File "/home/xxxxxxxxxx/.pyenv/versions/data-diff-3.10.6/lib/python3.10/site-packages/mysql/connector/abstracts.py", line 1022, in set_charset_collation
    ) = CharacterSet.get_charset_info(charset)
  File "/home/xxxxxxxxxx/.pyenv/versions/data-diff-3.10.6/lib/python3.10/site-packages/mysql/connector/constants.py", line 775, in get_charset_info
    info = cls.get_default_collation(charset)
  File "/home/xxxxxxxxxx/.pyenv/versions/data-diff-3.10.6/lib/python3.10/site-packages/mysql/connector/constants.py", line 746, in get_default_collation
    raise ProgrammingError(f"Character set '{charset}' unsupported")
mysql.connector.errors.ProgrammingError: Character set '255' unsupported

Things that worked for me:

  • Downgrading to mysql-connector-python==8.0.29
  • Removing charset="utf8" from https://github.com/datafold/data-diff/blob/master/data_diff/databases/mysql.py#L54
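For the first workaround, a small check like the one below could flag an environment that has a connector newer than the last version reported working in this thread. It is only a sketch: the helper names are made up for illustration, and treating every release after 8.0.29 as suspect is an assumption, since the thread only confirms that 8.0.29 works and 8.0.30 fails.

```python
import re
from importlib import metadata

# Last version reported working in this thread.
KNOWN_GOOD = (8, 0, 29)


def parse_version(text):
    """Turn '8.0.30' into (8, 0, 30), keeping the first three numeric parts."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def newer_than_known_good(installed, known_good=KNOWN_GOOD):
    """True if the installed version is newer than the known-good one."""
    return parse_version(installed) > known_good


def check_installed(package="mysql-connector-python"):
    """Return (version, is_suspect), or None if the package is not installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return installed, newer_than_known_good(installed)


if __name__ == "__main__":
    print(check_installed())
```

If the check reports a suspect version, pinning with `pip install mysql-connector-python==8.0.29` matches the downgrade that worked above.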

I'm a bit lost: I cannot find a way to open an issue with mysql-connector-python, and I'm not even sure it is their issue. I'm not a developer, and I find it hard to navigate the code. @pawandubey links to a comment about it, but I'm not sure how to proceed.

Is there something we can do?

geo909, Oct 21 '22 12:10