
PHP json_decode throws error with well-formed json

Open agaripian opened this issue 5 years ago • 1 comment

This might be a PHP json_decode issue, but I wanted to share our production bug here after upgrading to Node 12.

var a = '𝘥𝘦𝘴𝘪𝘨𝘯𝘦𝘳 𝘢𝘯𝘥 𝘪𝘭𝘭𝘶𝘴𝘵𝘳𝘢𝘵𝘰𝘳';
JSON.stringify(a.slice(0, 15));

// Node 10 output:
'"𝘥𝘦𝘴𝘪𝘨𝘯𝘦�"';

// Node 12 output:
'"𝘥𝘦𝘴𝘪𝘨𝘯𝘦\\ud835"'

This response is then sent to a PHP server as JSON and decoded, which is where the error occurs. Node 10's output used to work fine with PHP's json_decode, but Node 12's output no longer does.
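
For context, here is a minimal sketch of why `slice(0, 15)` produces a lone surrogate and one possible way to avoid it. Every letter in the string lies outside the Basic Multilingual Plane, so it takes two UTF-16 code units, and an odd cut point lands in the middle of a pair. The helper `truncateByCodePoints` is a hypothetical name, not part of Node or of the reporter's code, and only the first eight letters of the string are used here.

```javascript
// Hypothetical helper: truncate by code points instead of UTF-16 code units,
// so a surrogate pair is never split.
function truncateByCodePoints(str, maxCodePoints) {
  // Array.from iterates a string by code point, keeping surrogate pairs together.
  return Array.from(str).slice(0, maxCodePoints).join('');
}

// First eight letters of the string from the report ("designer" in
// mathematical sans-serif italic), built from code points to avoid
// any encoding ambiguity in this snippet.
const text = String.fromCodePoint(
  0x1d625, 0x1d626, 0x1d634, 0x1d62a, 0x1d628, 0x1d62f, 0x1d626, 0x1d633
);

// slice() counts UTF-16 code units, so index 14 is the high surrogate
// of the eighth letter and the cut leaves it unpaired.
const cut = text.slice(0, 15);
console.log(cut.charCodeAt(14).toString(16)); // d835

// Truncating by code points keeps seven whole letters (14 code units).
const safe = truncateByCodePoints(text, 7);
console.log(safe.length); // 14
console.log(JSON.stringify(safe));
```

Truncating on code point boundaries sidesteps the problem on the Node side, regardless of how the receiving decoder treats an escaped lone surrogate.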

I simplified the Node-to-PHP example; see below.

<?php
$string = '{"string": "𝘥𝘦𝘴𝘪𝘨𝘯𝘦\\ud835"}';
var_dump(json_decode($string, false, 512, JSON_THROW_ON_ERROR | JSON_INVALID_UTF8_IGNORE | JSON_INVALID_UTF8_SUBSTITUTE));

// Output:
Fatal error: Uncaught JsonException: Single unpaired UTF-16 surrogate in unicode escape in phptest.php:36
Stack trace:
#0 phptest.php(36): json_decode('{"string": "\xF0\x9D\x98...', false, 512, 7340032)
#1 {main}
  thrown in phptest.php on line 36


// Node 10's output works fine:
object(stdClass)#1 (1) {
  ["string"]=>
  string(31) "𝘥𝘦𝘴𝘪𝘨𝘯𝘦�"
}

agaripian, Nov 03 '19 14:11

See #13.

As noted in that thread:

"It seems weird to guard against lone surrogates in escaped form but not in their unescaped form."

Those two representations should be equivalent. You might consider filing a bug against PHP.
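
The equivalence can be checked on the JavaScript side with a minimal sketch like the one below. It only shows how `JSON.parse` treats the two forms and says nothing about PHP's decoder. The variable names are illustrative.

```javascript
// A lone high surrogate inside a JSON string, in escaped and in raw form.
const escapedJson = '"\\ud835"';          // JSON text with the six characters \ud835
const rawJson = '"' + '\ud835' + '"';     // JSON text with the raw code unit itself

const fromEscaped = JSON.parse(escapedJson);
const fromRaw = JSON.parse(rawJson);

// Both parse to the same one-code-unit string.
console.log(fromEscaped === fromRaw); // true
console.log(fromEscaped.length); // 1
```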

bakkot, Nov 03 '19 17:11