omniauth-runkeeper
Encoding seems to be either incorrect or malformed
I've been using omniauth-runkeeper for over a year now, and it's been great - thank you for creating it.
Just recently, I noticed some errors when a particular user was attempting to log in. It turned out that they had added a special character (í) to their name & Rails was crashing with this error when attempting to add that information to the database:
Mysql2::Error - Incorrect string value: '\xEDm Che...' for column 'name' at row 1
Some really quick background info...
- I send request.env['omniauth.auth'] as auth to a method that finds the user.
- Within that method, I grab the user's name: auth.extra.raw_info.name.
- If it's a new user, I create it with that name & if it's an existing user, I update their account with their name (I update elite status, name, and picture on each login).
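For reference, the flow described above can be sketched roughly like this. It is only an illustration: the Struct types, the USERS hash, and the find_or_update_user name are hypothetical stand-ins for the OmniAuth auth hash and an ActiveRecord model, not code from my app or from the gem.

```ruby
# Hypothetical stand-ins for the OmniAuth auth hash and the users table.
RawInfo = Struct.new(:name)
Extra   = Struct.new(:raw_info)
Auth    = Struct.new(:uid, :extra)
User    = Struct.new(:uid, :name)

USERS = {} # in-memory substitute for the database, keyed by uid

# Create the user on first login, and refresh the name on every later login.
def find_or_update_user(auth)
  name = auth.extra.raw_info.name
  user = (USERS[auth.uid] ||= User.new(auth.uid, nil))
  user.name = name # this is the write that fails when the name has bad bytes
  user
end

auth = Auth.new("123", Extra.new(RawInfo.new("Example Name")))
find_or_update_user(auth)
```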
So, Rails was choking on adding \xED to the database. That string should have been UTF8 at that point, though.
If I run .encoding on that name field, it reports back that it is #<Encoding:UTF-8>. I'm not sure whether it shouldn't be reporting that encoding, or whether something is being done to the string(s) that shouldn't be, if they are to stay UTF8.
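A small sketch of why .encoding alone can be misleading: Ruby only reports the label attached to the string, while valid_encoding? checks the bytes themselves. The repair step assumes the bytes are really ISO-8859-1 (where 0xED is í); that is a guess on my part, not something the API or the gem confirms, and reinterpret_as_latin1 is a hypothetical helper name.

```ruby
# A string labeled UTF-8 that holds the byte from the MySQL error message.
raw = "\xEDm".b.force_encoding(Encoding::UTF_8)

puts raw.encoding        # the label only
puts raw.valid_encoding? # whether the bytes actually form valid UTF-8

# Assumption: the bytes are ISO-8859-1. Relabel them, then transcode to UTF-8.
def reinterpret_as_latin1(str)
  return str if str.valid_encoding?
  str.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

fixed = reinterpret_as_latin1(raw)
puts fixed.valid_encoding?
```

If the guess about the source encoding is wrong, this will produce valid UTF-8 that contains the wrong characters, so it is worth checking the result against a known name first.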
I ran through a similar test of retrieving the user's name on RunKeeper through the healthgraph gem, and it did not exhibit this issue. The strings returned were UTF8 encoded and they contained the actual UTF8 í character.
While I'm a little over my head, I did try to poke around within the runkeeper.rb file. I removed MultiJson.decode from line 34, to see if that was the culprit - this was a breaking change, but it allowed me to dive in through debugger access where I was able to confirm that the string is already hex encoded at that point. I'm still over my head, but this seems like it indicates that the string is made this way outside of the scope of this gem - either by the API itself or somewhere else.
Is what I've found correct - is this issue outside of your control?
If it isn't, is it possible for you to fix the data in raw_info to be completely UTF8?
This is looking more & more like an issue with MySQL. I've since run into further issues with emoji characters and the MySQL utf8 character set.
I learned that emoji support requires use of the utf8mb4 character set, but that comes with a huge set of its own problems/issues. Three good references are "How to support full Unicode in MySQL databases", "active_record, MySQL, and emoji", and this issue.
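The reason emoji trip this up is that MySQL's utf8 character set stores at most 3 bytes per character, while emoji take 4 bytes in UTF-8. A quick way to see which strings would be affected (needs_utf8mb4? is a hypothetical helper name, not part of any gem):

```ruby
# True when the string contains a character that needs 4 bytes in UTF-8,
# which MySQL's 3-byte utf8 character set cannot store.
def needs_utf8mb4?(str)
  str.each_char.any? { |ch| ch.bytesize > 3 }
end

accented = "\u00EDm"       # í followed by m
emoji    = "hi \u{1F600}"  # text followed by an emoji

puts accented.bytesize
puts needs_utf8mb4?(accented)
puts needs_utf8mb4?(emoji)
```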
It looks like the two options are:
- Use demoji to strip emoji characters, and continue using the utf8 character set.
- Change the character set used on the database, the character set used on each table, update all your text fields (I don't know if you need to update other text fields like MEDIUMTEXT, etc), and change all of the string columns that you index on from VARCHAR(255) to VARCHAR(191).
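A rough sketch of both options in plain Ruby. The first function is a simple stand-in for stripping emoji and is not demoji's actual API. The second shows where 255 and 191 come from, assuming the 767-byte index key prefix limit that applies to InnoDB tables using the older COMPACT or REDUNDANT row formats; your setup may differ.

```ruby
# Option 1 (stand-in, not demoji): drop every character that needs 4 bytes.
def strip_four_byte_chars(str)
  str.each_char.reject { |ch| ch.bytesize > 3 }.join
end

# Option 2: how many characters of an indexed column fit under the limit.
INDEX_PREFIX_LIMIT_BYTES = 767 # assumed InnoDB limit, see note above

def max_indexable_chars(bytes_per_char)
  INDEX_PREFIX_LIMIT_BYTES / bytes_per_char # integer division rounds down
end

puts strip_four_byte_chars("hi \u{1F600}")
puts max_indexable_chars(3) # utf8: up to 3 bytes per character
puts max_indexable_chars(4) # utf8mb4: up to 4 bytes per character
```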
So, it's looking like an enormously huge mess that has little to nothing to do with omniauth-runkeeper.