php-serialize

Adds a test case for decoding UTF-8 strings and fixes this bug

Open keichan34 opened this issue 12 years ago • 4 comments

Multibyte strings were not being encoded back into UTF-8 when decoded, so I wrote a test case (and fixed the bug).

keichan34 avatar Mar 15 '13 04:03 keichan34

I'll need to look into this a bit more but this looks like a great change.

jqr avatar Jun 29 '17 15:06 jqr

The PHP serialization format doesn't record encodings. As in Ruby 1.8, strings are just opaque blobs of bytes. UTF-8 is surely a good bet these days, but maybe it should be configurable?

Module instance variable I guess.
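
To make the "blobs of bytes" point concrete, here is a minimal sketch in plain Ruby. It is not code from php-serialize, and the variable names are made up for illustration. It assumes a serialized PHP string arrives as binary data, shows that the `s:N` length counts bytes rather than characters, and shows that `force_encoding` only retags those bytes without transcoding them.

```ruby
# Hypothetical illustration; not code from php-serialize.
payload = 's:6:"日本";'.b            # binary (ASCII-8BIT) bytes, e.g. as read from a file

# PHP's s:N length is a byte count: "日本" is 2 characters but 6 UTF-8 bytes.
raw = payload[5, 6]                  # the 6 payload bytes after the prefix s:6:"

# The bytes carry no encoding of their own; Ruby reports them as binary.
binary_tag = raw.encoding            # Encoding::ASCII_8BIT

# force_encoding retags the same bytes as UTF-8; nothing is converted.
text = raw.dup.force_encoding('UTF-8')
puts text.valid_encoding?            # true only if the bytes really are UTF-8
```

If the PHP side wrote the string in some other encoding, the same retagging would produce an invalid or misread string, which is the case for making the encoding configurable.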

Freaky avatar Sep 04 '18 00:09 Freaky

@jqr @Freaky I used a module variable to set the encoding. Do you think this should be documented somewhere?

keichan34 avatar Sep 04 '18 01:09 keichan34

Well, the point would be to make it easy to override if necessary; a constant would throw a warning if you redefined it. I was thinking more along the lines of:

module PHP
  class << self
    attr_accessor :encoding
  end
end

PHP.encoding = 'UTF-8'

Then users can set it to whatever weirdness they're using in PHP. Might be mostly theoretical?
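
A rough sketch of how such an accessor might be consumed when decoding, under the assumption that decoded strings are simply retagged with the configured encoding. `PHPSketch` and `decode_string` are hypothetical names chosen for this example and are not part of php-serialize.

```ruby
# Hypothetical sketch: PHPSketch and decode_string are illustrative names,
# not php-serialize's actual API.
module PHPSketch
  class << self
    attr_accessor :encoding
  end
  self.encoding = 'UTF-8' # default; reassigning it raises no warning

  # Retag a copy of the raw bytes with the configured encoding (no transcoding).
  def self.decode_string(bytes)
    bytes.dup.force_encoding(encoding)
  end
end

raw = "caf\xE9".b # "café" as ISO-8859-1 bytes, held as binary

# Under the UTF-8 default, these bytes are not a valid string.
valid_as_utf8 = PHPSketch.decode_string(raw).valid_encoding?

# A user whose PHP application writes Latin-1 overrides the setting once.
PHPSketch.encoding = 'ISO-8859-1'
latin = PHPSketch.decode_string(raw)
puts latin.encode('UTF-8')
```

A global setting like this is easy to override but applies to every caller in the process, which may be part of why it stays "mostly theoretical".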

Freaky avatar Sep 04 '18 02:09 Freaky

Today, I tried to merge this PR and #11 onto commit e23fcfd. I noticed that these two patches are incompatible with each other.

#11 resolves the issue using a different approach, as keichan34 mentioned. Thus, I think this PR can be closed in favor of #11.

kaorukobo avatar Feb 21 '24 06:02 kaorukobo

Yeah, that force encoding seems problematic. Closing in favor of #11.

jqr avatar Feb 24 '24 15:02 jqr