crystal-mysql
crystal-mysql copied to clipboard
Connect to database with collation/character other than UTF8
In some cases other collations are needed in order to accept Emojis and/or non-latin characters.
It seems that as of today, this project do not provide a way to connect to MySQL database with any other option than UTF8, which leads to database errors when trying to add those "special" new characters.
With the wide use of smartphones, its basically impossible to build any application that do not support emojis. Its used anywhere, and its popularizing even more each year, so I believe the project should support some way to handle this.
There are some short codes sometimes used for emojis :smile:
= :smile: but I also think this would be good.
https://github.com/crystal-lang/crystal-mysql/blob/master/src/mysql/packets.cr#L75
Just replacing value 0x21u8
to 0x2du8
utf8mb4_general_ci works.
When a database is created, the 'Encoding' and 'Collation' are set. The driver should probably detect this and set it to be the same, instead of being hard coded. I'll have a quick look to see how this can be read.
Something like this give us the correct settings for the database.
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME = 'db_name'
We just then need to convert the names to the correct bytes.
The full list for the connected database can be found with SHOW COLLATION
, the id column just needs converting to a Unit8 based on the result from the above query.
It look like this info may already be retrieved during the handshake - https://github.com/crystal-lang/crystal-mysql/blob/master/src/mysql/packets.cr#L20 however I've done some manual testing and the handshake seems to return 33 / utf8_general_ci even if the database is utf8mb4_general_ci.
@crisward @waj Is there any plan to support setting the in application database collation (utf8_general_ci
or utf8mb4_general_ci
or whatever one sets in their corresponding database). As
I can see there is no closer to this issue.
BTW: utf8
in MySQL is invalid, the valid UTF-8 character set is utf8mb4
and maybe it should be the default.
https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434
Do we have any updates in this?
@kazzkiq I don't think this is an issue with this repo, setting the collation is db specific https://stackoverflow.com/questions/38949115/how-to-change-the-connection-collation-of-mysql