🐛 [DBTOOLS] Character set 'utf8' unsupported

Open kaincenteno opened this issue 2 years ago • 8 comments

  • [x] I have paid attention to this example and will edit again if need be to not break the formatting, or I will be ignored
  • [x] I have searched existing issues to see if the issue has already been opened, and I have checked the commit log to see if the issue has been resolved since my server was last updated
  • [x] I have read and understood the Contributing Guide

Branch affected by issue

base

Steps to reproduce

run pip install -r requirements

try opening dbtools. You will be greeted with errors

this can be bypassed by using a version of the MySQL Python connector earlier than 8.0.29

From release notes

Now recognize the deprecated 'utf8mb3' character set because the 'utf8' alias shows 'utf8mb3' in the Information Schema and with SHOW statements as of MySQL 8.0.28. For additional information, see [The utf8 Character Set (Alias for utf8mb3)](https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8.html). (Bug #33729842)

https://dev.mysql.com/doc/relnotes/connector-python/en/news-8-0-29.html

error message:

mysql.connector.errors.ProgrammingError: Character set 'utf8' unsupported

Expected behavior

dbtools should open without issues

kaincenteno avatar Jul 28 '22 21:07 kaincenteno

I'm not very knowledgeable in this area so I might be wrong here, but it seems we set up our tables with DEFAULT CHARSET=utf8; and utf8 here actually means utf8mb3 which is a proprietary encoding implemented by MySQL before utf8 was standardized. The actual standard utf8 is utf8mb4. I think we can just change utf8 to utf8mb4 in the sql files and maybe include a migration to convert existing protected tables (see this discussion on SO and MySQL docs). However, I'm not sure how intensive the conversion process is for large tables, and I'm not sure if any column sizes need to be adjusted to accommodate these changes.
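To illustrate the kind of migration suggested above, here is a minimal sketch that only builds the SQL statements and does not run them. The database and table names are hypothetical placeholders, and it assumes MySQL's ALTER TABLE ... CONVERT TO CHARACTER SET syntax; it says nothing about how long the conversion takes on large tables or whether column sizes need adjusting.

```python
def build_conversion_statements(database, tables, charset="utf8mb4"):
    """Return SQL strings that switch a database and its tables to `charset`.

    Nothing is executed here; the caller decides how and when to run them.
    """
    statements = [f"ALTER DATABASE `{database}` CHARACTER SET {charset};"]
    for table in tables:
        # CONVERT TO also rewrites the existing character columns of the table.
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset};"
        )
    return statements


if __name__ == "__main__":
    # "example_db" and the table names are placeholders, not the real schema.
    for sql in build_conversion_statements("example_db", ["example_a", "example_b"]):
        print(sql)
```

Generating the statements separately makes it easy to review them, or to try them on a copy of the database first.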

cocosolos avatar Jul 29 '22 00:07 cocosolos

Based on those docs, it looks like utf8mb4 allows one more byte per character than utf8mb3 (four instead of three), so the migration should not be a problem. I imagine names and linkshell names won't be a problem either. We probably just need to check that Japanese text still works right over zmq, assuming it currently works as is.
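As a quick sanity check on the byte widths, the sketch below measures how many bytes individual characters take in UTF-8. The helper names are made up for illustration; the only assumption is that utf8mb3 stores characters of up to three bytes and utf8mb4 up to four.

```python
def utf8_width(ch):
    """Number of bytes a single character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text):
    """True if every character needs at most 3 bytes, the utf8mb3 limit."""
    return all(utf8_width(ch) <= 3 for ch in text)


if __name__ == "__main__":
    for label, ch in [("ASCII", "A"), ("hiragana", "あ"), ("emoji", "😀")]:
        print(label, utf8_width(ch), fits_in_utf8mb3(ch))
```

Japanese kana and common kanji need three bytes each, so they already fit in utf8mb3; it is four-byte characters such as emoji that need utf8mb4.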

WinterSolstice8 avatar Jul 29 '22 01:07 WinterSolstice8

this is not quite correct - utf8mb4 is a superset of utf-8, not the actual utf8 standard. From what I've read it's more desired going forward in most projects though Microsoft has actually been recommending utf-16 instead.

TeoTwawki avatar Jul 29 '22 01:07 TeoTwawki

fwiw utf-16 is pretty snowflake -- it's used internally in C# for their strings so I can see the MS bias. More things innately support utf-32 over utf-16 but I'm not sure we even need more than (real) utf-8. Need internal discussion

WinterSolstice8 avatar Jul 29 '22 01:07 WinterSolstice8

Can this be resolved by using the MariaDB Python connector? MariaDB is the recommended DB for the LSB server, so that would keep it on parity with MariaDB.

https://mariadb.com/docs/connect/programming-languages/python/install/

kaincenteno avatar Jul 29 '22 01:07 kaincenteno

> Can this be resolved by using the MariaDB Python connector? MariaDB is the recommended DB for the LSB server, so that would keep it on parity with MariaDB.
>
> https://mariadb.com/docs/connect/programming-languages/python/install/

dunno, worth trying. I thought we already did and we just keep calling it mysql because they were basically the same thing

TeoTwawki avatar Jul 29 '22 02:07 TeoTwawki

Not a solution at all, but in the meantime https://downloads.mysql.com/archives/c-python/ to download a version of the MySQL Python connector earlier than 8.0.29.

I just did this (installed 8.0.28) and DBTool runs without any errors.

EpicTaru avatar Jul 30 '22 15:07 EpicTaru

So I put together a migration to convert the DB and tables to utf8mb4 and it appears to apply successfully, but the same error occurs when trying to connect.

As far as switching to the MariaDB connector instead of MySQL, this appears to mostly work except when dealing with BLOBs. The MySQL connector has an option to use either a pure Python interface to MySQL (use_pure) or the C Extension that uses the MySQL C client library, and the MariaDB connector seems to be missing this feature as far as I can tell. When using the C extension (the default), there seem to be encoding issues when reading BLOBs, which breaks some migrations (notably the ones dealing with the mission log). I wish I could go into more detail about this, but I only know that using the pure Python interface fixes the issue.

So it seems like locking the MySQL connector to a lower version is the best bet for right now until a better solution can be determined. You can roll back your version using pip if you so desire, using pip install mysql-connector-python==8.0.29 or py -3 -m pip install mysql-connector-python==8.0.29 (8.0.29 looks like it works for me but YMMV).
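If it helps, here is a rough sketch of a guard that warns when the installed connector is newer than the version pinned above. The 8.0.29 threshold is taken from this comment and may not hold for everyone, and the helper names are made up for illustration.

```python
from importlib import metadata

# Assumption: last release reported as working in this thread.
LAST_KNOWN_GOOD = (8, 0, 29)


def parse_version(text):
    """Turn '8.0.29' into (8, 0, 29), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_known_good(version_text, limit=LAST_KNOWN_GOOD):
    """True if the version is at or below the pinned release."""
    return parse_version(version_text) <= limit


def installed_connector_version():
    """Installed mysql-connector-python version, or None if it is missing."""
    try:
        return metadata.version("mysql-connector-python")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_connector_version()
    if found is None:
        print("mysql-connector-python is not installed")
    elif not is_known_good(found):
        print(f"mysql-connector-python {found} is newer than the pinned version")
```

This only reports the mismatch; rolling back is still done with the pip commands above.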

cocosolos avatar Jul 30 '22 23:07 cocosolos