redis-rb icon indicating copy to clipboard operation
redis-rb copied to clipboard

Incorrect string encoding on returned data

Open mpalmer opened this issue 10 years ago • 3 comments

Redis makes no assertions about the encoding of string stored within it. Therefore, IMO, all strings returned from Redis data retrieval methods should be .force_encoding("BINARY").

mpalmer avatar Jun 26 '15 03:06 mpalmer

Hey Matt,

This is definitely not what most users expect.

If you really need binary encoded strings, it's simple to do it yourself, and your code will better express intent.

If you want to have all your external data to be binary, you can set Encoding.default_external.

djanowski avatar Jul 31 '15 14:07 djanowski

I'm fairly sure that the vast majority of users don't expect anything in particular with regards to string encoding, because they never think about it. As far as "your code will better express intent", that could be said for marking strings as UTF-8 "if you really need UTF-8 encoded strings". It's an issue of correctness, at the end of the day -- you cannot safely assume that a string read from Redis is UTF-8, because Redis represents strings as simple sequences of octets, and doesn't store any encoding information.

Using Encoding.default_external isn't an option, because, being a default, it also mucks with reading text-mode files (that is, files that aren't opened with b in the mode) which isn't desired behaviour. Perhaps Redis.new could take a string_encoding option, to provide equivalent behaviour to File.open?

mpalmer avatar Aug 03 '15 07:08 mpalmer

Agree with @mpalmer, I don't want to change Encoding.default_external just because I want to read from redis. Would be nice to make it configurable, because changing Encoding.default_external may affect other parts.

Solved by:

class Redis::Connection::Ruby
  def encode(str)
    str
  end
end

Paxa avatar Aug 24 '17 19:08 Paxa

This is changed in 5.0, we rely on redis-client, it returns strings as Encoding::BINARY if respecting Encoding.default_external results in invalid encoding.

byroot avatar Aug 17 '22 18:08 byroot