
"\xE2" to UTF-8 in conversion from ASCII-8BIT to UTF-8 to UTF-32LE

Open stiller-leser opened this issue 6 years ago • 3 comments

Expected behavior

It should encode the source correctly.

Actual behavior

It raises the error quoted in the title.

System configuration

  • Sprockets version: 4.0.0.beta5
  • Ruby version: 2.3.0

Additional information

This error occurs with a file that contains the character ï (in our case, ckeditor.js from the ckeditor gem). It can be worked around by wrapping line 109

(source = source.encode(Encoding::UTF_32LE) unless source.ascii_only?)

of lib/sprockets/utils.rb in a begin-rescue block. However, I don't think this is the right solution. Lacking experience with Sprockets, I'd rather report this and wait for a fix than open a PR.
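For illustration, a minimal sketch of what such a begin-rescue workaround could look like. The helper name `encode_for_scan` is hypothetical and not part of Sprockets, and falling back to the unconverted source is an assumption about what the calling code can tolerate:

```ruby
# Hypothetical helper, not Sprockets code: try the UTF-32LE conversion and
# fall back to the original string when the bytes cannot be transcoded.
def encode_for_scan(source)
  return source if source.ascii_only?

  begin
    source.encode(Encoding::UTF_32LE)
  rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
    # Assumption: the caller can cope with the unconverted source.
    source
  end
end

# UTF-8 bytes tagged as ASCII-8BIT, similar to what a binary read returns.
binary = "var s = '\u00EF';".b

# The conversion raises internally and is rescued, so the string comes back
# unchanged and still tagged ASCII-8BIT.
result = encode_for_scan(binary)
puts result.equal?(binary)
```

The rescue hides the error but also skips the conversion, so any logic that relies on fixed-width UTF-32LE characters would not apply to such files.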

Possible solution

In our case

source = source.encode(Encoding::UTF_32LE, "ISO-8859-1") unless source.ascii_only?

seems to do the trick. I'm not sure of the implications, though.

stiller-leser avatar Sep 07 '17 14:09 stiller-leser

This was added in #311 cc/ @bouk what are your thoughts here?

schneems avatar Nov 17 '17 17:11 schneems

source = source.encode(Encoding::UTF_32LE, "ISO-8859-1") unless source.ascii_only?

This is suspicious because it would mean that the original encoding is ISO-8859-1, which shouldn't really happen. @stiller-leser, could you create a repo that reproduces this issue?
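A small sketch of why the ISO-8859-1 source encoding deserves suspicion, assuming the file is actually UTF-8 on disk (an assumption; the thread does not confirm the file's real encoding). The conversion stops raising, but each UTF-8 byte is turned into a separate character:

```ruby
# "ï" (U+00EF) is the two bytes C3 AF in UTF-8; .b tags them as ASCII-8BIT.
raw = "\u00EF".b

# Declaring the source as ISO-8859-1 does not raise, because each byte maps
# to a character of its own, so the two bytes become two code points.
as_latin1 = raw.encode(Encoding::UTF_32LE, Encoding::ISO_8859_1)

# Declaring the bytes as UTF-8 first keeps the single intended character.
as_utf8 = raw.dup.force_encoding(Encoding::UTF_8).encode(Encoding::UTF_32LE)

puts as_latin1.length                   # 2
puts as_utf8.length                     # 1
puts as_latin1.encode(Encoding::UTF_8)  # Ã¯
```

If the converted string is only inspected and never written back out, the mis-decoding may be harmless, but that depends on how the result is used.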

bouk avatar Nov 19 '17 11:11 bouk

The issue is still occurring. After a bit of investigation, it seems that File.binread, which is used when constructing the Asset object, reads the asset's source with ASCII-8BIT encoding. I'm thinking about adding a preprocessor that forces the encoding to UTF-8, but I think that's overkill. Having a fallback, as @bouk suggested, seems like a good option to me.
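To illustrate the File.binread behavior, here is a minimal sketch with one possible fallback. The helper name `utf32le_from_binary` and the temporary file are assumptions made for the demo and are not Sprockets code. The helper relabels the bytes as UTF-8 only when they form valid UTF-8, and otherwise returns them untouched:

```ruby
require "tmpdir"

# Hypothetical helper, not Sprockets code: treat binary-tagged bytes as UTF-8
# when they form valid UTF-8, then convert; otherwise leave them untouched.
def utf32le_from_binary(source)
  candidate = source.dup.force_encoding(Encoding::UTF_8)
  return source unless candidate.valid_encoding?

  candidate.encode(Encoding::UTF_32LE)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "asset.js") # stand-in for a real asset file
  File.binwrite(path, "var s = '\u00EF';")

  # binread always tags the result ASCII-8BIT, whatever the file contains.
  source = File.binread(path)
  puts source.encoding == Encoding::ASCII_8BIT                     # true
  puts utf32le_from_binary(source).encoding == Encoding::UTF_32LE  # true
end
```

Unlike a blanket force_encoding in a preprocessor, this only changes the encoding label at the point of conversion and leaves files that are not valid UTF-8 as they were.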

AyoubEssrifi avatar Nov 13 '23 14:11 AyoubEssrifi