anchor-cms
Support full Unicode in database.
See How to support full Unicode in MySQL databases by Mathias Bynens:
Turns out MySQL’s utf8 charset only partially implements proper UTF-8 encoding. It can only store UTF-8-encoded symbols that consist of one to three bytes; encoded symbols that take up four bytes aren’t supported.
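To make the three-byte limit concrete, here is a minimal Python sketch. It is not part of Anchor, and the helper names utf8_length and fits_in_mysql_utf8 are made up for illustration. It only counts encoded bytes per character and never talks to a database.

```python
def utf8_length(char: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(char.encode("utf-8"))


def fits_in_mysql_utf8(text: str) -> bool:
    """Return True if every character needs at most three UTF-8 bytes.

    Hypothetical helper: it mirrors the limit described in the quote above.
    """
    return all(utf8_length(ch) <= 3 for ch in text)


if __name__ == "__main__":
    # "\U0001F60E" is an emoji outside the Basic Multilingual Plane.
    for sample in ["a", "\u00e9", "\u20ac", "\U0001F60E"]:
        print(repr(sample), utf8_length(sample), fits_in_mysql_utf8(sample))
```

Characters outside the Basic Multilingual Plane, which includes most emoji, are the ones that take four bytes and therefore fail this check.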
Currently Anchor only lets me choose a utf8-based collation, but it would be better to offer utf8mb4-based collations.
Thanks for your input. We haven't encountered this as a problem yet (as far as I'm aware).
Because this would require changing the configuration of the database, we'll have to look into a way to migrate utf8-collated databases to utf8mb4, or whatever collation we end up choosing in the future. I'm pretty sure it would cause problems if we didn't.
True, I didn’t think about migrating existing installations. I simply edited $vars['collations'] (s/utf8_/utf8mb4_/g) and DB::factory’s charset setting before installing Anchor, and it seemed to have no problems setting up the clean database.
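As a rough illustration of that substitution, here is a small Python sketch. The collation list is a made-up sample rather than Anchor's actual $vars['collations'], and to_utf8mb4 is a hypothetical helper. The word boundary in the pattern is a small safeguard added here; it is not in the original s/utf8_/utf8mb4_/g.

```python
import re

# Hypothetical sample of collation names; the real list lives in Anchor's installer.
collations = ["utf8_general_ci", "utf8_unicode_ci", "utf8_bin"]


def to_utf8mb4(name: str) -> str:
    """Rewrite a utf8_ prefix to utf8mb4_, leaving other names untouched."""
    return re.sub(r"\butf8_", "utf8mb4_", name)


upgraded = [to_utf8mb4(name) for name in collations]

if __name__ == "__main__":
    for before, after in zip(collations, upgraded):
        print(before, "->", after)
```

Names that already start with utf8mb4_ do not contain the substring utf8_, so running the substitution twice is harmless.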
This is pretty important. Have you heard about this? 😔😚😅😊😆😐😅😈😐😓😠😉😈😋😔😠 Unicode emoji are removed from posts.
Try making a span element with class emoji, and set the content equal to an emoji code: <span class="theme-emoji" content="\1F60E"></span>
Why the class? So you can fix spacing issues in CSS. I'm only guessing that this may work; I saw a theme implement this kind of idea with the right arrows. Instead of the usual → the theme used the Unicode hex code.
Thanks for the response, but nothing shows up 😠 I think that you should really use proper Unicode.
ALTER TABLE anchor_posts CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci
Running this for every table will convert it transparently on upgrade
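A minimal Python sketch of generating that statement for several tables might look like the following. Only anchor_posts comes from this thread; the other table names are placeholders, and conversion_statement is a hypothetical helper, not part of Anchor. It only builds SQL strings and does not execute them, and it covers only the per-table statement.

```python
import re

# Only anchor_posts appears in this thread; the other names are placeholders.
TABLES = ["anchor_posts", "anchor_example_a", "anchor_example_b"]

# Accept plain identifiers only, so a table name cannot inject extra SQL.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def conversion_statement(table: str,
                         charset: str = "utf8mb4",
                         collation: str = "utf8mb4_general_ci") -> str:
    """Build the ALTER TABLE statement that converts one table."""
    for value in (table, charset, collation):
        if not _IDENTIFIER.fullmatch(value):
            raise ValueError(f"unexpected identifier: {value!r}")
    return (f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};")


statements = [conversion_statement(table) for table in TABLES]

if __name__ == "__main__":
    print("\n".join(statements))
```

In a real upgrade the table list would presumably be read from the database rather than hard-coded, and a backup beforehand would be prudent.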
Do you have a working test for this?
Running this for every table will convert it transparently on upgrade
According to Mathias’ article you need to run a little more than that. But yes, that is the main gist of it.
Do you have a working test for this?
I can only say that I haven’t seen any weird things happening with a database that was set to utf8mb4 from day one using my slightly modified Anchor installer. Probably also because utf8mb4 is backwards-compatible with utf8.
Big parts of Anchor are already UTF-8 aware, e.g. the slug function uses both mb_strtolower and htmlentities with their optional encoding parameter set to UTF-8. This issue is all about getting the storage database in line with that.