MySQL should use utf8mb4 collation

Open mapkyca opened this issue 5 years ago • 8 comments

While trying to do this:

Store any character outside the BMP (emoji, mathematical symbols, etc.) in MySQL

I encountered this error:

They can't be stored

Some other notes:

It has been widely recommended (since ~2016 at least) that you don't use utf8 (which is an alias for utf8mb3, limited to three bytes per character) and instead use utf8mb4.
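To make the three-byte limit concrete, here is a minimal Python sketch that checks how many UTF-8 bytes a character needs. The helper names `utf8_len` and `fits_utf8mb3` are made up for illustration and are not part of Known or MySQL; the check assumes that a utf8mb3 column rejects any character whose UTF-8 encoding is longer than three bytes.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_utf8mb3(text: str) -> bool:
    """True if every character needs at most 3 UTF-8 bytes (i.e. is in the BMP).

    Hypothetical helper: approximates what a utf8mb3 column can hold.
    """
    return all(utf8_len(ch) <= 3 for ch in text)


if __name__ == "__main__":
    # Two BMP characters, then an emoji and a mathematical symbol outside the BMP.
    for sample in ["é", "€", "😀", "𝔸"]:
        print(repr(sample), utf8_len(sample), fits_utf8mb3(sample))
```

A check like this could be used to find which stored strings are at risk before deciding how urgent a migration is.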

Postgres is unaffected, as 4-byte UTF-8 is the default.

Give us some context:

  • It'd also be really handy if you could tell us the contents of your version.known file
  • What database are you using? (e.g. mongo, mysql, postgres)
  • Any warnings or errors in your admin/diagnostics page?
  • If this is a programming bug, can you include examples of any Micropub / API calls / webhook pings you make? Otherwise please don't worry about what this means!
  • Bonus points - are you able to illustrate the issue with a unit test? If so, submit it as a pull request!

mapkyca avatar Jun 10 '19 15:06 mapkyca

Low priority, possibly, since object serialisation seems to handle this OK. It will be an issue for any raw body fields and metadata.

mapkyca avatar Jun 10 '19 19:06 mapkyca

Hitting this now... I can't believe they screwed up charsets this badly.... Wish I could use postgres instead of this ugliness...

lindner avatar Mar 17 '20 05:03 lindner

Some details here:

https://mathiasbynens.be/notes/mysql-utf8mb4

lindner avatar Mar 17 '20 07:03 lindner

Definitely affects the search tables.

lindner avatar Mar 17 '20 07:03 lindner

Migration for this is actually a little fiddly owing to MySQL's index key length restrictions, so it's not just a simple matter of doing an ALTER TABLE. Not had the bandwidth to look at it further.
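For anyone wondering why key length gets in the way, a rough sketch of the arithmetic follows. It assumes the 767-byte index key prefix limit that applies to InnoDB's older COMPACT/REDUNDANT row formats (DYNAMIC/COMPRESSED allow 3072 bytes); which limit applies to a given install is not stated in this thread, and the function names are hypothetical.

```python
# Assumption: 767 bytes is the InnoDB index key prefix limit for the
# COMPACT/REDUNDANT row formats; adjust if your tables use DYNAMIC.
INDEX_LIMIT_BYTES = 767

# MySQL reserves the maximum possible bytes per character when sizing an index.
MAX_BYTES_PER_CHAR = {"utf8mb3": 3, "utf8mb4": 4}


def index_bytes(varchar_len: int, charset: str) -> int:
    """Worst-case bytes an index on a VARCHAR(varchar_len) column needs."""
    return varchar_len * MAX_BYTES_PER_CHAR[charset]


def max_indexable_chars(charset: str, limit: int = INDEX_LIMIT_BYTES) -> int:
    """Longest VARCHAR that can be fully indexed under the given byte limit."""
    return limit // MAX_BYTES_PER_CHAR[charset]


if __name__ == "__main__":
    for charset in MAX_BYTES_PER_CHAR:
        needed = index_bytes(255, charset)
        print(charset, needed, needed <= INDEX_LIMIT_BYTES, max_indexable_chars(charset))
```

Under that assumption, an indexed VARCHAR(255) fits in utf8mb3 but not in utf8mb4, so a conversion would first have to shorten such columns (191 characters is the usual figure) or index only a prefix.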

I'm tempted to address this when we move towards changing the data model around, and I'm also thinking about binning off full text search altogether in favour of an Elasticsearch / Lucene plugin. Search really isn't something to be doing at the database level.

mapkyca avatar Mar 17 '20 10:03 mapkyca

Hello @lindner, @mapkyca,

For your information, altering tables was not enough for me.

I had this MySQL 5.7 database which still left me in the dark with non-terminating queries.

Upgrading to MySQL 8.0 apparently did the trick.

If it would help with running tests, I could dig out the queries containing emoji that the MySQL 5.7 database rejected.

thierrymarianne avatar Oct 16 '22 13:10 thierrymarianne

5.7 is EOL in one year. I think we can bump to 8.0 as minimum for this version unless there's an active user that really wants to stick with that...

[Same goes for other infrastructure pieces, we can and should specify a modern PHP etc...]

lindner avatar Oct 16 '22 18:10 lindner

+1 from me, fwiw!

benwerd avatar Oct 16 '22 20:10 benwerd