将MySQL的默认Latin1连接改为使用utf8
最近惊奇地发现MySQL的默认编码方式居然是Latin1!而不是utf8! 于是即使我用的不是Windows,还是碰上了久违的编码问题…… 错误是传入utf8字符(这里特指中文字符)后,MySQL报错说“奇怪的字符串,\xAC\x12...,不认识啊”。利用Python的decode,可以把这串报错字符还原成原本的utf8字符串。可见应该是MySQL的问题。 上网搜一下,得出是character变量设置的问题。这时候的character相关变量设定如下:
mysql> show variables like 'character%';
+--------------------------+---------------------+
| Variable_name | Value |
+--------------------------+---------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_result | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
看来得把那两项latin1也同化成utf8才行。接着继续查资料。
看了下官方的相关字符配置文档,依然不知所云。然后看到一种做法,就是去改动/etc/mysql/mysql.cnf配置文件。改完之后,发现mysql无法重新启动了……呃,查看下/var/log/mysql/error.log,tail输出最后几行,发现刚刚添加的某个变量(default-character-set)是不合理的变量,所以报错了。遇到无法启动的问题,也过来查看下error.log好了。
最后是根据这篇资料改好的:
http://stackoverflow.com/questions/3513773/change-mysql-default-character-set-to-utf-8-in-my-cnf
果然爆栈网上啥都有。
转述如下:
1. Remove that directive and you should be good.
2. Then your configuration file ('/etc/my.cnf' for example) should look like that:
[mysqld]
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
3. Restart MySQL.
4. For making sure, your MySQL is UTF-8, run the following queries in your MySQL prompt:
- First query:
mysql> show variables like 'char%';
The output should look like:
+--------------------------+---------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | utf8 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | utf8 |
| character_set_system | utf8 |
| character_sets_dir | /usr/local/mysql/share/charsets/|
+--------------------------+---------------------------------+
- Second query:
mysql> show variables like 'collation%';
And the query output is:
+----------------------+-----------------+
| Variable_name | Value |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database | utf8_unicode_ci |
| collation_server | utf8_unicode_ci |
+----------------------+-----------------+
不过试了之后,character_set_database还是latin1。当然这个回答中的评论还是提及一点,对于已经存在的数据库,你还需要ALTER TABLE Table CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;来更改它们的字符设置。
我试了下这个命令,终于把最后一个变量也改为utf8了。万岁!不过……MySQL还是会报错!还是同样的错!不知怎么会有这样的问题……因为我最后一怒之下把数据库全部重新建过,就没有这个问题了!(当然这些都是开发用的数据库,随便drop下也无所谓啦)
为什么MySQL居然使用latin1作为默认编码呢……在解决了问题后,我特意去查了下。看来这个貌似是个遗留历史问题:
http://stackoverflow.com/questions/3936059/why-does-mysql-use-latin1-swedish-ci-as-the-default
真是的,凡是稍微大一点的东西总会有那么几个历史遗留问题~