node icon indicating copy to clipboard operation
node copied to clipboard

node-api: add support for UTF-8 and Latin-1 property keys

Open mertcanaltin opened this issue 1 year ago • 38 comments

This commit addresses issue #52979. It introduces new functions for optimized property key creation:

node_api_create_property_key_utf8 node_api_create_property_key_latin1 These functions help in creating property keys that allow more efficient property access when reused multiple times.

This commit also includes:

Updates to documentation to reflect the addition of the new functions. New tests to ensure the correctness of the implementation. Additionally, this work is related to the previous PR node-api: refactor napi_set_property function for improved performance #50282, which covered the utf16 case.

mertcanaltin avatar May 14 '24 18:05 mertcanaltin

@mertcanaltin we discussed this PR at today's Node-API team meeting. We have come to the conclusion that, although this new API adds convenience for developers, it actually encourages a pattern of runtime inefficiency by making it easy to create properties with keys for which the engine is unaware that they are not just regular strings but property keys, and thus it doesn't optimize them.

We believe it would be better to pivot this PR to creating additional variants of node_api_create_property_key_utf16 for the other two string encodings because it would encourage add-on maintainers to create property keys that are treated efficiently by the engine, thus increasing the performance of code that has to create large numbers of objects.

In addition, it would be advisable for add-on maintainers to create a property key once, and then reuse the resulting napi_value in subsequent calls to napi_set_property and/or napi_define_properties in order to avoid repeatedly creating the same string.

This is certainly less convenient for those writing the code, however, it increases the performance of the resulting add-ons.

gabrielschulhof avatar May 17 '24 16:05 gabrielschulhof

@mertcanaltin we discussed this PR at today's Node-API team meeting. We have come to the conclusion that, although this new API adds convenience for developers, it actually encourages a pattern of runtime inefficiency by making it easy to create properties with keys for which the engine is unaware that they are not just regular strings but property keys, and thus it doesn't optimize them.

We believe it would be better to pivot this PR to creating additional variants of node_api_create_property_key_utf16 for the other two string encodings because it would encourage add-on maintainers to create property keys that are treated efficiently by the engine, thus increasing the performance of code that has to create large numbers of objects.

In addition, it would be advisable for add-on maintainers to create a property key once, and then reuse the resulting napi_value in subsequent calls to napi_set_property and/or napi_define_properties in order to avoid repeatedly creating the same string.

This is certainly less convenient for those writing the code, however, it increases the performance of the resulting add-ons.

First of all, thank you very much for your review and support, do you think I can continue to develop this pr I would be very happy if you have any suggestions ❤️🙏 @gabrielschulhof

mertcanaltin avatar May 17 '24 16:05 mertcanaltin

I created a node_api_create_property_key_utf8 based on what you said

node_api_create_proper- ty_key_utf8 node_api _create_proper- ty_key_ascil

I wonder if we can get a performance as you say by making different string encodings?

mertcanaltin avatar May 17 '24 18:05 mertcanaltin

I wonder if we can get a performance as you say by making different string encodings?

@mertcanaltin , there are a few Node-API benchmarks here: https://github.com/nodejs/node/tree/main/benchmark/napi. A new one can be created to benchmark different ways to set and get property values. Note that the true advantage from using node_api_create_property_key_utf8 and node_api_create_property_key_utf16 must be seen only when the same property key is reused for setting/getting a property multiple times.

vmoroz avatar May 24 '24 14:05 vmoroz

I am hesitant of adding these APIs, as the functionality to do what you want already exists using both node_api_create_property_key_utf16() and napi_set_property(). I do not follow the argument of "an extra napi call" as outlined in https://github.com/nodejs/node/issues/52979#issuecomment-2110430131, as the team discussed it in a Node-API meeting and decided the additional call would be minimal overhead.

Adding new APIs has more overreaching effects than just Node.js itself, as other JavaScript engines must implement the new APIs as well.

Since this functionality already exists and can be performed in a trivial manner, I 👎 this addition.

KevinEady avatar May 30 '24 10:05 KevinEady

Hi @KevinEady, Some might argue that the existing node_api_create_property_key_utf16 and napi_set_property functions are sufficient. However, these functions come with additional performance costs and increased code complexity. The new function presents a more efficient and effective solution. I would love to hear any suggestions you might have.

mertcanaltin avatar Jun 03 '24 12:06 mertcanaltin

thank you very much for reviewing me I was at a conference this week so I could not give a quick answer but I will give an update today thank you all very much

mertcanaltin avatar Jun 03 '24 13:06 mertcanaltin

We discussed in the Node-api team meeting again today, and consensus seems to be to implement the suggestion in https://github.com/nodejs/node/pull/52984#issuecomment-2117935616, and only add the new property key creation functions, and no others.

mhdawson avatar Jun 07 '24 15:06 mhdawson

greetings I'm very sorry I'm late getting back here, I sent a few edits today

mertcanaltin avatar Jun 17 '24 05:06 mertcanaltin

@mertcanaltin , thank you for making the progress with the PR. Based on the discussions in Node-API meetings and the comments above:

  • Could you please remove the new napi_set_named_property_len function. The team sees this function as redundant. Developers can implement a simple helper function in their code today by using napi_create_string_utf8 and napi_set_property functions to allow using \0 in the name. Since Node-API is implemented by other JS runtimes such as Deno and Bun, and on top of other JS engines such as Hermes, adding simple helper functions unnecessary increases the API surface.
  • As an alternative to the original purpose of this PR, it would be great to retarget it to add the new node_api_create_property_key_latin1 and node_api_create_property_key_utf8 functions. These functions enable creation of optimized strings that can be reused for the efficient property access. Their implementation can follow the same set of changes as in your previous PR #50282.
  • It would be great to add a simple benchmark test that shows that reusing a property name created by node_api_create_property_key_utf8 is faster than the one created by napi_create_string_utf8 function.

Please let me know if I can help with the implementation.

vmoroz avatar Jul 01 '24 15:07 vmoroz

I did these steps @vmoroz

  • Removed the napi_set_named_property_len function based on discussions in Node-API meetings.
  • Added node_api_create_property_key_latin1 and node_api_create_property_key_utf8 functions for optimized property key creation.
  • Updated the test file to include new tests for node_api_create_property_key_latin1 and node_api_create_property_key_utf8.
  • Implemented a benchmark test (BenchmarkPropertyKeyUtf8) to compare the performance of node_api_create_property_key_utf8 with napi_create_string_utf8.
  • The new benchmark test shows that reusing a property name created by node_api_create_property_key_utf8 is faster than using napi_create_string_utf8.

mertcanaltin avatar Jul 06 '24 11:07 mertcanaltin

I wonder if you have a suggestion about test failures @vmoroz

mertcanaltin avatar Jul 08 '24 11:07 mertcanaltin

I wonder if you have a suggestion about test failures @vmoroz

The error says that the module is not found. It is possible that it is failed to build. I will check it out and see how it can be fixed.

vmoroz avatar Jul 16 '24 01:07 vmoroz

I did these steps @vmoroz

  • Removed the napi_set_named_property_len function based on discussions in Node-API meetings.
  • Added node_api_create_property_key_latin1 and node_api_create_property_key_utf8 functions for optimized property key creation.
  • Updated the test file to include new tests for node_api_create_property_key_latin1 and node_api_create_property_key_utf8.
  • Implemented a benchmark test (BenchmarkPropertyKeyUtf8) to compare the performance of node_api_create_property_key_utf8 with napi_create_string_utf8.
  • The new benchmark test shows that reusing a property name created by node_api_create_property_key_utf8 is faster than using napi_create_string_utf8.

@mertcanaltin , are you sure that the latest commit in this PR has all the latest code?

  • The code is missing the implementation for node_api_create_property_key_latin1 and there are no tests or docs for it.
  • Docs for napi_set_named_property_len are not removed.
  • The benchmark test must be in the benchmark/napi folder. E.g. see the string test there.

vmoroz avatar Jul 16 '24 01:07 vmoroz

greetings, I tried to verify the structure, but I got a Cannot find package error @vmoroz

➜  node git:(dev-52979) ✗ make test
ninja -C out/Release 
ninja: Entering directory `out/Release'
ninja: no work to do.
if [ ! -r node ] || [ ! -L node ]; then ln -fs out/Release/node node; fi
/Applications/Xcode.app/Contents/Developer/usr/bin/make -s tooltest
.....
----------------------------------------------------------------------
Ran 5 tests in 0.002s

OK
/Applications/Xcode.app/Contents/Developer/usr/bin/make -s test-doc
node:internal/modules/esm/resolve:210
  const resolvedOption = FSLegacyMainResolve(packageJsonUrlString, packageConfig.main, baseStringified);
                         ^

Error: Cannot find package '/Users/mert/Desktop/openSource/node/tools/doc/node_modules/rehype-raw/package.json' imported from /Users/mert/Desktop/openSource/node/tools/doc/generate.mjs
    at legacyMainResolve (node:internal/modules/esm/resolve:210:26)
    at packageResolve (node:internal/modules/esm/resolve:829:14)
    at moduleResolve (node:internal/modules/esm/resolve:915:18)
    at defaultResolve (node:internal/modules/esm/resolve:1123:11)
    at ModuleLoader.defaultResolve (node:internal/modules/esm/loader:557:12)
    at ModuleLoader.resolve (node:internal/modules/esm/loader:526:25)
    at ModuleLoader.getModuleJob (node:internal/modules/esm/loader:249:38)
    at ModuleJob._link (node:internal/modules/esm/module_job:126:49) {
  code: 'ERR_MODULE_NOT_FOUND'
}

Node.js v23.0.0-pre
make[2]: *** [out/doc/api/addons.html] Error 1
make[1]: *** [doc-only] Error 2
make: *** [test] Error 2
➜  node git:(dev-52979) ✗ 

mertcanaltin avatar Jul 18 '24 08:07 mertcanaltin

in addition, markdown formatter was getting error, I corrected it by using make format-md

mertcanaltin avatar Jul 18 '24 09:07 mertcanaltin

thank you very much for your review I tried to apply what you said @vmoroz

mertcanaltin avatar Jul 19 '24 16:07 mertcanaltin

ı added this test key, but get undefined error

TypeError: test_string.TestPropertyKeyUtf8 is not a function

mertcanaltin avatar Jul 20 '24 08:07 mertcanaltin

There are a couple of issues:

  • The new unit tests are failing because not all strings are valid Latin1. In this case we should remove the Latin1 tests for the str6. Note that the normal string creation also does test the Latin1 there.
  • The Benchmark test fails because it has no native code. Please follow the same pattern as the existing https://github.com/nodejs/node/tree/main/benchmark/napi/string test where they create a native module, and then test the exported functions from it in JS. We cannot simply call C code from JS like it is written currently. We must always export functions from a native module.

vmoroz avatar Jul 20 '24 14:07 vmoroz

this is great information thank you very much I will apply it

mertcanaltin avatar Jul 20 '24 17:07 mertcanaltin

applied many thanks @vmoroz 🚀

mertcanaltin avatar Jul 20 '24 19:07 mertcanaltin

many thanks for helping me solve the errors here @legendecas

mertcanaltin avatar Jul 24 '24 10:07 mertcanaltin

CI: https://ci.nodejs.org/job/node-test-pull-request/60590/

nodejs-github-bot avatar Jul 24 '24 13:07 nodejs-github-bot

@nodejs/node-api please take a look again :)

legendecas avatar Jul 24 '24 16:07 legendecas

@mertcanaltin, thank you for driving this PR to a completion! There are only a couple tiny typos in the docs. Just a few notes about the PR description:

  • Consider to add a link to the previous related PR #50282. Just to indicate that the utf16 case was covered previously in that PR.
  • Please remove "A benchmark test to compare the performance of node_api_create_property_key_utf8 with napi_create_string_utf8." I understand that it is going to be covered in different PR.
  • Consider to rephrase "These functions help in creating property keys more efficiently by reusing the same property key multiple times." to something like "These functions help in creating property keys that allow more efficient property access when are reused multiple times."

vmoroz avatar Jul 24 '24 17:07 vmoroz

made the latest updates, thank you very much for your support I learnt great things while doing this 🚀

mertcanaltin avatar Jul 24 '24 17:07 mertcanaltin

CI: https://ci.nodejs.org/job/node-test-pull-request/60606/

nodejs-github-bot avatar Jul 24 '24 22:07 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/60621/

nodejs-github-bot avatar Jul 25 '24 13:07 nodejs-github-bot

I agree that the key point of the new APIs is introducing internalized strings that can be used as optimized property names.

legendecas avatar Aug 02 '24 08:08 legendecas

Does V8 not use a reference-counted string type?

For JavaScriptCore, the performant way to do this would be to not make it a napi_value (or a JSValue) and instead be a pointer with a different type declaration that uses the existing reference-counted mechanism (WTF::AtomString and JSC::Identifier). It's a bit complicated to expose though, because it would mean a ref function, an unref function, and separate functions for putting/getting properties which accept that type. On the other hand, these strings work across global objects within the same thread (napi_env)

Jarred-Sumner avatar Aug 03 '24 06:08 Jarred-Sumner