sprockets
"\xE2" to UTF-8 in conversion from ASCII-8BIT to UTF-8 to UTF-32LE
Expected behavior
It should encode the source correctly.
Actual behavior
It throws the above error.
System configuration
- Sprockets version: 4.0.0.beta5
- Ruby version: 2.3.0
Additional information
This error occurs with a file that contains the character ï (in our example it was the ckeditor.js from the ckeditor gem). It can be worked around by wrapping line 109 of lib/sprockets/utils.rb (`source = source.encode(Encoding::UTF_32LE) unless source.ascii_only?`) in a begin-rescue block. However, I don't think this is the perfect solution. Lacking experience with Sprockets, I'd rather report it and wait for a fix than open a PR.
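For reference, here is a minimal sketch that triggers the same kind of failure outside Sprockets. The sample string is made up for illustration; the only assumption is that the source arrives tagged as ASCII-8BIT (binary) and contains a non-ASCII byte such as the `\xE2` from the error message.

```ruby
# Hypothetical sample input: a binary (ASCII-8BIT) string containing the
# byte "\xE2" that appears in the error message.
source = "caf\xE2".b

error_message =
  begin
    # Same expression as the line in lib/sprockets/utils.rb quoted above.
    source.encode(Encoding::UTF_32LE) unless source.ascii_only?
    nil
  rescue Encoding::UndefinedConversionError => e
    e.message
  end

puts source.encoding
puts error_message
```

Ruby has no mapping from raw binary bytes above 0x7F to Unicode, so the conversion raises instead of guessing.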
Possible solution
In our case
source = source.encode(Encoding::UTF_32LE, "ISO-8859-1") unless source.ascii_only?
seems to do the trick. Not sure of the implications though.
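As for the implications, a small sketch of what passing "ISO-8859-1" as the source encoding does, assuming the file could be either Latin-1 or UTF-8 on disk. The variable names are illustrative only.

```ruby
# The same character, stored in two different on-disk encodings and then
# read as raw bytes (as a binary read would return them).
latin1_bytes = "\u00EF".encode("ISO-8859-1").b  # one byte: "\xEF"
utf8_bytes   = "\u00EF".b                       # two bytes: "\xC3\xAF"

# Treating the bytes as ISO-8859-1 never raises, because every byte value
# maps to a code point in that encoding.
from_latin1 = latin1_bytes.encode(Encoding::UTF_32LE, "ISO-8859-1")
from_utf8   = utf8_bytes.encode(Encoding::UTF_32LE, "ISO-8859-1")

puts from_latin1.encode("UTF-8")  # the original character
puts from_utf8.encode("UTF-8")    # two unrelated characters (mojibake)
```

So the change is correct only if the file really is ISO-8859-1. If it is UTF-8, the conversion succeeds silently but each multi-byte character turns into several wrong ones.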
This was added in #311. cc @bouk, what are your thoughts here?
source = source.encode(Encoding::UTF_32LE, "ISO-8859-1") unless source.ascii_only?
This is suspicious because it would mean that the original encoding is ISO-8859-1, which shouldn't really happen. @stiller-leser, could you create a repo that reproduces this issue?
The issue is still occurring. After a bit of investigation, it seems that File.binread, which is used when constructing the Asset object, reads the asset's source with ASCII-8BIT encoding. I'm thinking about adding a preprocessor that forces the encoding to UTF-8, but I think that's overkill. Having a fallback as @bouk suggested seems like a good option to me.
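A rough sketch of what such a fallback could look like. `to_utf32le` is a hypothetical helper, not anything in Sprockets, and the choice of ISO-8859-1 as the last resort is an assumption taken from the workaround earlier in this thread.

```ruby
# Hypothetical helper (not part of Sprockets): convert a source string to
# UTF-32LE, trying UTF-8 first and falling back to ISO-8859-1.
def to_utf32le(source)
  # Mirrors the existing guard: ASCII-only sources are left untouched.
  return source if source.ascii_only?

  # Re-tag a copy of the bytes as UTF-8 without changing them.
  utf8 = source.dup.force_encoding(Encoding::UTF_8)
  if utf8.valid_encoding?
    utf8.encode(Encoding::UTF_32LE)
  else
    # Assumed fallback: every byte is a valid ISO-8859-1 character,
    # so this branch cannot raise.
    source.encode(Encoding::UTF_32LE, Encoding::ISO_8859_1)
  end
end

puts to_utf32le("\u00EF".b).encode("UTF-8")  # UTF-8 bytes read in binary mode
puts to_utf32le("\xEF".b).encode("UTF-8")    # Latin-1 byte read in binary mode
```

Checking `valid_encoding?` first means well-formed UTF-8 files are decoded correctly, and only files that are not valid UTF-8 go through the Latin-1 guess.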