redis-rb
redis-rb copied to clipboard
Incorrect string encoding on returned data
Redis makes no assertions about the encoding of string stored within it. Therefore, IMO, all strings returned from Redis data retrieval methods should be .force_encoding("BINARY").
Hey Matt,
This is definitely not what most users expect.
If you really need binary encoded strings, it's simple to do it yourself, and your code will better express intent.
If you want to have all your external data to be binary, you can set Encoding.default_external.
I'm fairly sure that the vast majority of users don't expect anything in particular with regards to string encoding, because they never think about it. As far as "your code will better express intent", that could be said for marking strings as UTF-8 "if you really need UTF-8 encoded strings". It's an issue of correctness, at the end of the day -- you cannot safely assume that a string read from Redis is UTF-8, because Redis represents strings as simple sequences of octets, and doesn't store any encoding information.
Using Encoding.default_external isn't an option, because, being a default, it also mucks with reading text-mode files (that is, files that aren't opened with b in the mode) which isn't desired behaviour. Perhaps Redis.new could take a string_encoding option, to provide equivalent behaviour to File.open?
Agree with @mpalmer, I don't want to change Encoding.default_external just because I want to read from redis. Would be nice to make it configurable, because changing Encoding.default_external may affect other parts.
Solved by:
class Redis::Connection::Ruby
def encode(str)
str
end
end
This is changed in 5.0, we rely on redis-client, it returns strings as Encoding::BINARY if respecting Encoding.default_external results in invalid encoding.