simdjson_php

Create PHP RFC

Open • Renkas opened this issue 5 years ago • 10 comments

Could you create an RFC to maybe get this implemented in PHP itself? The speed improvement is impressive and would benefit PHP as a whole.

https://wiki.php.net/rfc

Jan 15 '20 17:01 Renkas

Hi @crazyxman,

I've just opened a PR against php-src to bundle your extension into core: https://github.com/php/php-src/pull/6551 and started a discussion about it: https://externals.io/message/112638

Dec 29 '20 17:12 kocsismate

In case it's not accepted into PHP, would you consider making it available via PECL?

Dec 31 '20 10:12 enumag

Yes, it most probably won't be accepted just yet. But in that case, I'll try my best to make it available via PECL, sure.

Dec 31 '20 10:12 kocsismate

@kocsismate thank you very much! It is a good idea to make it available through PECL.

Jan 02 '21 03:01 crazyxman

@kocsismate I'm looking forward to your support in making it available via PECL and improving this PHP extension.

Jan 02 '21 16:01 sandrokeil

In light of the mailing list discussion (https://externals.io/message/112638), I'm abandoning my proposal, as people would rather use libsimdjson as an optional JSON parsing backend.

Apr 03 '21 11:04 kocsismate

Do we have any news on progress toward making it available through PECL?

Oct 28 '21 10:10 nemanjajojic

People on the ML suggested making simdjson available as a parsing backend for the built-in JSON-related functions, so if I worked on this, I would follow that approach rather than create a new PECL extension.

Oct 28 '21 11:10 kocsismate

> People on the ML suggested making simdjson available as a parsing backend for the built-in JSON-related functions

That seems viable once it's implemented and the edge-case tests pass. If simdjson fails to parse the input, the exception should be cleared and json_decode() can fall back to the original JSON implementation, so that https://www.php.net/json_last_error and JSON_THROW_ON_ERROR work as expected and valid inputs are still accepted with flags such as JSON_BIGINT_AS_STRING:

php > var_export(json_decode('[111111111111111111111111111111111111]', true, 10, JSON_BIGINT_AS_STRING));
array (
  0 => '111111111111111111111111111111111111',
)
php > var_export(simdjson_decode('[111111111111111111111111111111111111]'));
Warning: Uncaught RuntimeException: Problem while parsing a number in php shell code:1
Stack trace:
#0 php shell code(1): simdjson_decode('[11111111111111...')
#1 {main}
  thrown in php shell code on line 1
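
The fallback described above could look roughly like this in userland. This is only a sketch: `decode_with_fallback()` is a hypothetical helper name, not something provided by simdjson_php or php-src, and the choice to take the fast path only when no flags are passed is an assumption made to keep the example conservative. A real implementation would live in C inside json_decode() itself.

```php
<?php
/**
 * Hypothetical userland helper (not part of simdjson_php or php-src).
 * Tries simdjson first and falls back to the built-in parser on failure.
 *
 * @return mixed
 */
function decode_with_fallback(string $json, bool $assoc = false, int $depth = 512, int $flags = 0)
{
    // Assumption: only use the fast path when no flags are requested,
    // so flag-dependent behaviour is always left to json_decode().
    if ($flags === 0 && function_exists('simdjson_decode')) {
        try {
            return simdjson_decode($json, $assoc, $depth);
        } catch (\RuntimeException $e) {
            // Discard the simdjson error. json_decode() below decides
            // whether the input is really invalid and sets json_last_error().
        }
    }

    // Original implementation: handles all flags and error reporting.
    return json_decode($json, $assoc, $depth, $flags);
}

// The big-integer input from the session above goes straight to json_decode()
// because a flag is set.
var_export(decode_with_fallback('[111111111111111111111111111111111111]', true, 10, JSON_BIGINT_AS_STRING));
```

Because the final call is always json_decode(), json_last_error() and JSON_THROW_ON_ERROR behave as they do today whenever simdjson is skipped or rejects the input.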

Dec 17 '21 13:12 TysonAndre

The PECL simdjson 3.0.0 release should be in good enough shape (including for ZTS builds) to propose including it as a configure option in PHP 8.3.

This will speed up the happy path, where the JSON string is valid JSON (see previous comment for error handling).

  • Calls with unsupported $flags can skip simdjson and use the original implementation. From the documentation, the On Demand API of the C++ simdjson library looks like it would be slower than the DOM API here, but I haven't benchmarked it yet.
  • Edge cases for floating-point numbers should be consistent with the patched simdjson code.
  • ZTS should work properly. Edge cases such as Unicode surrogate pairs are now also consistent for error handling.
  • Depth handling when arrays do/don't have properties is off by one and will need to be handled in the fork of the C++ simdjson library.
  • simdjson should be avoided for len > 4GB or depth > SOME_REASONABLE_C_STACK_DEPTH (e.g. have to account for smaller stacks in https://www.php.net/fiber - bison does not use the C stack)
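
As a rough illustration of the conditions in the list above, a pre-check deciding whether simdjson should be tried at all might look like the following. Everything here is an assumption for the sake of the sketch: `should_try_simdjson()` and both constants are hypothetical names, the depth value 1024 is an arbitrary placeholder (the real limit is left open above), and treating JSON_THROW_ON_ERROR as compatible is a guess based on it only changing how errors are reported.

```php
<?php
// Placeholder only: stands in for SOME_REASONABLE_C_STACK_DEPTH,
// whose actual value is not specified in this thread.
const ASSUMED_MAX_SIMDJSON_DEPTH = 1024;

// 4 GiB input-size ceiling from the list above.
const ASSUMED_MAX_SIMDJSON_BYTES = 4 * 1024 * 1024 * 1024;

/**
 * Hypothetical gate: true if the fast path may be attempted,
 * false if the call should go straight to the original parser.
 */
function should_try_simdjson(string $json, int $depth, int $flags): bool
{
    // Assumption: JSON_THROW_ON_ERROR only affects error reporting,
    // so it does not rule out the fast path. Any other flag does.
    if (($flags & ~JSON_THROW_ON_ERROR) !== 0) {
        return false;
    }

    // Very large inputs are left to the original implementation.
    if (strlen($json) > ASSUMED_MAX_SIMDJSON_BYTES) {
        return false;
    }

    // Deep nesting could exhaust a small C stack (e.g. inside a fiber).
    if ($depth > ASSUMED_MAX_SIMDJSON_DEPTH) {
        return false;
    }

    return true;
}

var_dump(should_try_simdjson('[1,2,3]', 512, 0));                     // no flags, small input
var_dump(should_try_simdjson('[1,2,3]', 512, JSON_BIGINT_AS_STRING)); // unsupported flag
```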

I'd still need to fix https://github.com/crazyxman/simdjson_php/pull/81

Oct 18 '22 05:10 TysonAndre