php-serialize
php-serialize copied to clipboard
Adds a test case for decoding UTF-8 strings and fixes this bug
Multibyte strings were not being encoded back into UTF-8, so I wrote a test case (and fixed the bug)
I'll need to look into this a bit more but this looks like a great change.
PHP serialization format doesn't do encodings - similar to Ruby 1.8-era, they're just vague blobs of bytes. UTF-8 is surely a good bet these days, but maybe it should be configurable?
Module instance variable I guess.
@jqr @Freaky Used an module variable to set encoding. You think this should be documented somewhere?
Well, the point would be to make it easy to override if necessary, a constant would throw a warning if you redefined it. I was thinking more along the lines of:
module PHP
class << self
attr_accessor :encoding
end
end
PHP.encoding ='UTF-8'
Then users can set it to whatever weirdness they're using in PHP. Might be mostly theoretical?
Today, I tried to merge this PR and #11 onto commit e23fcfd. I noticed that these two patches are incompatible with each other.
#11 resolves the issue using a different approach, as keichan34 mentioned. Thus, I think this PR can be closed in favor of #11.
Yeah that force encoding seems problematic, closing in favor of #11.