omniauth-runkeeper

Encoding seems to be either incorrect or malformed

Open JamesChevalier opened this issue 10 years ago • 1 comments

I've been using omniauth-runkeeper for over a year now, and it's been great - thank you for creating it.

Just recently, I noticed some errors when a particular user was attempting to log in. It turned out that they had added a special character (í) to their name & Rails was crashing with this error when attempting to add that information to the database:

Mysql2::Error - Incorrect string value: '\xEDm Che...' for column 'name' at row 1

Some really quick background info...

  • I send request.env['omniauth.auth'] as auth to a method that finds the user.
  • Within that method, I grab the user's name: auth.extra.raw_info.name.
  • If it's a new user, I create it with that name & if it's an existing user, I update their account with their name (I update elite status, name, and picture on each login).

So, Rails was choking on adding \xED to the database. That string should have been UTF8 at that point, though.

If I run .encoding on that name field, it reports back #<Encoding:UTF-8>. I'm not sure whether it should be reporting that encoding at all, or whether some operation on the string(s) needs to be skipped in order to keep them as UTF8.
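For anyone hitting the same thing, here is a minimal sketch of how a string can report UTF-8 and still carry a byte like \xED. It assumes the name arrived as ISO-8859-1 (Latin-1) bytes that were merely labelled UTF-8, which I have not confirmed for the RunKeeper API. The sample name and the to_valid_utf8 helper are made up for illustration and are not part of omniauth-runkeeper.

```ruby
# A string can be tagged UTF-8 while holding bytes that are not valid UTF-8.
# "\xED" is the ISO-8859-1 (Latin-1) byte for "í"; the sample name is made up.
raw_name = "J\xEDm".dup.force_encoding(Encoding::UTF_8)

raw_name.encoding         # reports UTF-8, because that is only the label
raw_name.valid_encoding?  # false: the bytes do not form valid UTF-8

# Hypothetical helper (assumption: the invalid bytes are really Latin-1).
# Valid UTF-8 strings pass through untouched; anything else is
# reinterpreted as Latin-1 and transcoded to real UTF-8.
def to_valid_utf8(str)
  return str if str.encoding == Encoding::UTF_8 && str.valid_encoding?

  str.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

fixed_name = to_valid_utf8(raw_name)
```

Checking valid_encoding? before the database write would at least turn the Mysql2::Error into something that can be handled in Ruby. Guessing Latin-1 is only safe if that really is what the upstream source sends.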

I ran through a similar test of retrieving the user's name on RunKeeper through the healthgraph gem, and it did not exhibit this issue. The strings returned were UTF8 encoded and they contained the actual UTF8 í character.

While I'm a little over my head, I did try to poke around within the runkeeper.rb file. I removed MultiJson.decode from line 34 to see if that was the culprit. This was a breaking change, but it let me step in with a debugger, where I was able to confirm that the string already contains the \xED escape at that point. I'm still over my head, but this seems to indicate that the string is made this way outside of the scope of this gem - either by the API itself or somewhere else.

Is what I've found correct - is this issue outside of your control? If it isn't, is it possible for you to fix the data in raw_info to be completely UTF8?

JamesChevalier, Jun 03 '14 02:06

This is looking more & more like an issue with mysql. I've since run into further issues with emoji characters and the mysql utf8 character set.

I learned that emoji support requires use of the utf8mb4 character set, but that comes with a large set of its own problems. Three good references are "How to support full Unicode in MySQL databases", "active_record, MySQL, and emoji", and this issue.
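To make the utf8 vs. utf8mb4 difference concrete: MySQL's utf8 character set stores at most 3 bytes per character, while emoji need 4 bytes in UTF-8. The sketch below checks whether a string would fit; fits_mysql_utf8? and the sample names are hypothetical, not something from the gem or from Rails.

```ruby
# MySQL's utf8 character set stores at most 3 bytes per character;
# utf8mb4 stores up to 4.
MYSQL_UTF8_MAX_BYTES = 3

# Hypothetical helper: true when every character fits in MySQL's utf8 charset.
def fits_mysql_utf8?(str)
  str.each_char.all? { |char| char.bytesize <= MYSQL_UTF8_MAX_BYTES }
end

accented   = "J\u00EDm"        # í (U+00ED) is 2 bytes in UTF-8
with_emoji = "Jim \u{1F3C3}"   # U+1F3C3 (runner emoji) is 4 bytes in UTF-8

accented_fits = fits_mysql_utf8?(accented)    # true
emoji_fits    = fits_mysql_utf8?(with_emoji)  # false
```

So a correctly encoded í should be fine under the utf8 character set, and only characters above U+FFFF (emoji among them) need utf8mb4.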

It looks like the two options are:

  1. Use demoji to strip emoji characters, and continue using the utf8 character set.
  2. Change the character set used on the database and on each table, update all your text fields (I don't know if you need to update other text fields like MEDIUMTEXT, etc), and change all of the string columns that you index on from VARCHAR(255) to VARCHAR(191).
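A rough sketch of both options, with assumptions stated up front. For option 1, strip_four_byte_chars is a hypothetical stand-in that drops everything above U+FFFF; it is not demoji's actual implementation. For option 2, the arithmetic assumes InnoDB's 767-byte limit on an index key (the limit for the older row formats), which is where the 255 and 191 figures come from.

```ruby
# Option 1 sketch: drop characters outside the Basic Multilingual Plane
# (code points above U+FFFF), which are the ones that need 4 bytes in UTF-8.
# Hypothetical stand-in for illustration, not demoji's implementation.
def strip_four_byte_chars(str)
  str.gsub(/[\u{10000}-\u{10FFFF}]/, "")
end

# Keeps the í, drops the runner emoji (the space before it remains).
cleaned = strip_four_byte_chars("J\u00EDm \u{1F3C3}")

# Option 2 arithmetic: assuming a 767-byte index key limit, the widest
# fully indexable VARCHAR depends on the maximum bytes per character.
INDEX_KEY_LIMIT_BYTES = 767
utf8_max_chars    = INDEX_KEY_LIMIT_BYTES / 3  # 255 characters at 3 bytes each
utf8mb4_max_chars = INDEX_KEY_LIMIT_BYTES / 4  # 191 characters at 4 bytes each
```

Option 1 loses data silently, so it only makes sense for fields like display names where a dropped emoji is acceptable.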

So, it's looking like an enormous mess that has little to nothing to do with omniauth-runkeeper.

JamesChevalier, Jun 05 '14 02:06