hashover-next
Add utf8mb4 charset hint to database documentation
In MySQL and MariaDB, utf8 is an alias for utf8mb3, which stores at most 3 bytes per character. Some emoji need 4 bytes in UTF-8, so the documentation should recommend utf8mb4.
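For anyone who wants to check the byte counts, here is a minimal Python sketch. It is not part of HashOver; the helper names `utf8_width` and `needs_utf8mb4` are made up for illustration. It only shows that characters outside the Basic Multilingual Plane, which includes most emoji, encode to 4 bytes in UTF-8 and therefore do not fit in a 3-byte-per-character charset.

```python
def utf8_width(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies above U+FFFF (4 bytes in UTF-8)."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Hypothetical samples: ASCII letter, accented letter, euro sign, emoji.
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X} takes {utf8_width(ch)} byte(s)")

print(needs_utf8mb4("Nice post \U0001F600"))
```

Only the last sample crosses the 3-byte limit, which is the case a utf8 (utf8mb3) column cannot store.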
Is there any reason not to also use utf8mb4 as the default in secrets.php? I would like to support all emoji by default, unless there's a good reason not to.
SQLite, PostgreSQL, and others handle the full 1–4 byte range of UTF-8, as per the Unicode standard. MySQL wanted to save RAM back in the day and standardized on utf8 meaning at most 3 bytes per character, which is why you need to specify utf8mb4 to get full Unicode support. MariaDB inherited this legacy from MySQL. The other database defaults in the secrets file are for SQLite.
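To illustrate the SQLite side of this, a small sketch using Python's built-in `sqlite3` module and an in-memory database. The `comments` table and the `roundtrip` helper are hypothetical and are not HashOver's actual schema or code; the point is only that a 4-byte emoji survives a store-and-fetch without any charset setting.

```python
import sqlite3


def roundtrip(text: str) -> str:
    """Store text in a throwaway in-memory SQLite table and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, used only for this demonstration.
        conn.execute("CREATE TABLE comments (body TEXT)")
        conn.execute("INSERT INTO comments (body) VALUES (?)", (text,))
        (stored,) = conn.execute("SELECT body FROM comments").fetchone()
        return stored
    finally:
        conn.close()


original = "Nice post \U0001F600"
print(roundtrip(original) == original)
```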
So … yeah. Do you want to default to the MySQL legacy workaround, or to the databases that have followed the Unicode standard without introducing issues for their users? That ambiguity is why I put it in the documentation. It’s a common issue, and with plain utf8 you might end up with broken 4-byte emoji. But that’s kind of what you get when choosing MySQL/MariaDB.