
Binary files are not correctly served via the phantomjs_harness

Open applmak opened this issue 7 years ago • 2 comments

There are two issues here, I guess. One is that the harness does not read the file in binary mode. A fix merely passes 'b' to fs.read when we are serving up a binary file.
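As a rough sketch of that first fix, the harness could decide per file whether to request binary mode. The helper name `isBinaryPath` and the extension list below are hypothetical, made up for illustration; they are not taken from the actual harness.

```javascript
// Hypothetical helper (not from the real harness): guess from the file
// extension whether a path should be read in binary mode.
// The extension list is an assumption and is certainly incomplete.
var BINARY_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif', '.ico'];

function isBinaryPath(path) {
  var dot = path.lastIndexOf('.');
  var ext = dot === -1 ? '' : path.slice(dot).toLowerCase();
  return BINARY_EXTENSIONS.indexOf(ext) !== -1;
}

// Inside the harness, the read could then look roughly like this
// (PhantomJS fs module, with 'b' as described above):
//   var body = isBinaryPath(path) ? fs.read(path, 'b') : fs.read(path);
```

A lookup by extension is only one option; keying off the Content-Type the harness is about to send would work just as well.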

The other issue is weirder. When serving a sample png file, the served copy differs from the original in both size and content. Specifically:

$ ls -l *.png
-rw-r--r--@ 1 applmak  x  2669 Apr  4 08:24 test1x1.bad.png
-rw-r--r--@ 1 applmak  x  2742 Apr  3 20:12 test1x1.png

Even weirder:

$ xxd test1x1.bad.png | head -n 3
00000000: efbf bd50 4e47 0d0a 1a0a 0000 000d 4948  ...PNG........IH
00000010: 4452 0000 00ef bfbd 0000 00ef bfbd 0803  DR..............
00000020: 0000 00ef bfbd efbf bdef bfbd efbf bd00  ................
$ xxd test1x1.png | head -n 3
00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .PNG........IHDR
00000010: 0000 0080 0000 0080 0803 0000 00f4 e091  ................
00000020: f900 0002 7350 4c54 45ff ffff 3333 3333  ....sPLTE...3333

which seems to indicate some kind of encoding issue? This problem might be an issue with phantomjs itself.

applmak, Apr 04 '17 13:04

https://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder tells me that the strange sequence 0xefbfbd is the UTF-8 encoding of U+FFFD. It shows up because the first byte of the png (0x89) is not valid UTF-8 on its own, so when the file is incorrectly interpreted as UTF-8 that byte is replaced with

U+FFFD	REPLACEMENT CHARACTER
	* used to replace an incoming character whose value is unknown or unrepresentable in Unicode
	* compare the use of 001A as a control character to indicate the substitute function
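The substitution can be reproduced outside of phantomjs. The snippet below is an illustration in Node.js (an assumption for convenience; it does not use the PhantomJS API): it decodes the first eight bytes of a PNG as UTF-8 and encodes the resulting string again, which yields the same leading bytes as test1x1.bad.png.

```javascript
// Node.js illustration, not PhantomJS code.
// The standard 8-byte PNG signature, as seen at the start of test1x1.png.
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decoding as UTF-8: 0x89 is a lone continuation byte, so the decoder
// substitutes U+FFFD (REPLACEMENT CHARACTER) for it.
const asText = original.toString('utf8');

// Encoding the string back to UTF-8 turns U+FFFD into ef bf bd.
const roundTripped = Buffer.from(asText, 'utf8');

console.log(roundTripped.toString('hex')); // efbfbd504e470d0a1a0a
```

Every other byte that is not valid UTF-8 gets the same treatment, which fits the repeated ef bf bd runs further into the bad file.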

The second issue is then a problem in phantomjs. I'll file an issue there and link it here.

applmak, Apr 04 '17 13:04

Filed ariya/phantomjs#14936 for the encoding issue. Still, the harness needs to be updated.

applmak, Apr 04 '17 14:04