node-foundationdb
node-foundationdb copied to clipboard
Arguments error
Hello.
We catch Invalid arguments exception on
- 6.2.28 fdb client for ubuntu
- [email protected]
- Docker image based on node:16.13.1
Stack trace:
Error:
Invalid arguments
---------
Invalid argument
---------
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z16create_backtracev+0x27) [0x7f6f787dd6b1]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z14create_tracingP10napi_env__PKc+0x26) [0x7f6f787dd866]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z15throw_if_not_okP10napi_env__11napi_status+0x58) [0x7f6f787dd983]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z10getWrappedP10napi_env__P20napi_callback_info__+0x87) [0x7f6f787ddc1d]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(+0x56c3) [0x7f6f787d86c3]
node() [0xaabfcd]
node() [0xd4a35e]
node(_ZN2v88internal21Builtin_HandleApiCallEiPmPNS0_7IsolateE+0xaf) [0xd4b77f]
node() [0x15e78f9]
at Database.rawCreateTransaction (node_modules/foundationdb/dist/lib/database.js:55:51)
at Database.doTn (node_modules/foundationdb/dist/lib/database.js:39:21)
at Database.doTransaction (node_modules/foundationdb/dist/lib/database.js:43:21)
at Object.doTransaction (node_modules/@deeplay/foundationdb-event-store-core/lib/utils/doTransaction.js:13:10)
at Object.doIdempotentTransaction (node_modules/@deeplay/foundationdb-event-store-core/lib/index.js:68:94)
at writeTransition (build/table/writeTransition.js:26:37)
at writeResult.baseMs (build/table/doTask.js:30:119)
at retry (node_modules/abort-controller-x/lib/retry.js:13:26)
at doTask (build/table/doTask.js:30:66)
at build/table/index.js:129:62
When calling napi_get_cb_info we obtain a pointer to thisObj from native code using the callback’s descriptor. After that we try to napi_unwrap the pointer, but the arguments error appear:
https://github.com/josephg/node-foundationdb/blob/master/src/utils.cpp#L79
I am not sure how to stably reproduce that behaviour. I was able to check the backtrace when I remove the pod, and the service forcedly stopped. But we catch same error sometimes while service is working as usual, and in this case all of the new transactions after the error will be completed with the same error.
Could you please help me to dig deeper to find the root of the issue? Can we somehow get an enriched logging to understand what is wrong and whether the bug is on our side or the native code one?
Interesting! Does it always happen when the service is forcibly stopped? I’d love to get a repro case if we can figure one out.
I can add some print statements or something, but I’m not sure how much that’d help..
Does it always happen when the service is forcibly stopped?
Not sure. I try to create a setup for a stable reproduction, although I'm not sure of success - seems we catch that error only in k8s environment.
Maybe you can give me recommendations about nodegyp-build setup to collect more information in backtrace?
seems we catch that error only in k8s environment.
That’s weird!
Also with node-gyp, based on that stack trace it looks like you’re already doing a build for the docker container. That stack trace comes from a local debug build. (Did you opt in to that?). I wonder if the fact you’re using a debug build in k8s has something to do with the error showing up.
When you do an npm install, the fdb npm package first tries to load from prebuilds/ first and if any error happens it runs a build. To build with gyp, you’ll need a local build environment (apt install build-essential) and then run npx node-gyp build —debug.
Hah - I was just looking at where the string "Invalid arguments" is coming from. Its a napi error, and my code in utils.cpp emits that string. Apparently I've had that problem before, because I left some commented out code for printing a full stack trace at the location where the error is coming from.
Restore the code like this and see if that gives you more information:
// ...
#include <execinfo.h>
#include <stdlib.h>
// Useful for figuring out where an error is coming from.
// From GCC docs
void print_trace ()
{
void *array[10];
size_t size;
char **strings;
size_t i;
size = backtrace (array, 10);
strings = backtrace_symbols (array, size);
printf ("Obtained %zd stack frames.\n", size);
for (i = 0; i < size; i++)
printf ("%s\n", strings[i]);
free (strings);
}
napi_status throw_if_not_ok(napi_env env, napi_status status) {
switch (status) {
case napi_ok: case napi_pending_exception:
return status;
case napi_invalid_arg:
print_trace(); // <-- uncomment this line
napi_throw_error(env, NULL, "Invalid arguments");
return napi_pending_exception;
case napi_number_expected:
napi_throw_type_error(env, NULL, "Expected number");
return napi_pending_exception;
case napi_string_expected:
napi_throw_type_error(env, NULL, "Expected string");
return napi_pending_exception;
case napi_generic_failure:
napi_throw_type_error(env, NULL, "Generic failure");
return napi_pending_exception;
default:
fprintf(stderr, "throw_if_not_ok %d\n", status);
assert(0);
return napi_invalid_arg; // Unreachable but silences a vcc warning.
}
}
Yes, I already use this snippet :)
But I send stack trace with the string Invalid arguments and a result of napi_get_last_error_info into js, in the same way, I rebuild the nodejs-foundationdb library with debug symbols npm run install --build-from-source --debug.
All of that manipulations were be for investigation and getting stack traces from native code.
The Invalid arguments error appeared while we used the original package.
That’s weird!
Agree.
We didn't find any way for stable reproduction and tried to get more information from the production environment. But I work on this problem. Maybe I can do it.
The sequence of events here is:
- in
database.ts, doTn calls rawCreateTransaction:
rawCreateTransaction(opts?: TransactionOptions) {
return new Transaction<KeyIn, KeyOut, ValIn, ValOut>(this._db.createTransaction(), false, this.subspace, opts)
}
- Which calls
_db.createTransaction(), which immediately tries to figure out the_dbargument via this:
static napi_value createTransaction(napi_env env, napi_callback_info info) {
FDBDatabase *db = (FDBDatabase *)getWrapped(env, info);
// ..
And getWrapped (why is the function name spelled inconsistently?) looks like this:
void *getWrapped(napi_env env, napi_callback_info info) {
void *data;
napi_value obj;
NAPI_OK_OR_RETURN_NULL(env, napi_get_cb_info(env, info, 0, NULL, &obj, NULL));
NAPI_OK_OR_RETURN_NULL(env, napi_unwrap(env, obj, &data));
return data;
}
Now, what I'm not sure about is which of those calls (napi_get_cb_info or napi_unwrap) is generating this napi_invalid_arg error. Maybe try replacing getWrapped with this:
void *unwrap(napi_env env, napi_value obj) {
void *data;
NAPI_OK_OR_RETURN_NULL(env, napi_unwrap(env, obj, &data));
return data;
}
void *getWrapped(napi_env env, napi_callback_info info) {
napi_value obj;
NAPI_OK_OR_RETURN_NULL(env, napi_get_cb_info(env, info, 0, NULL, &obj, NULL));
return unwrap(env, obj);
}
This should not change the function's behaviour, but we will see if unwrap() shows up in the stack trace to find out which napi_ method is triggering the napi_invalid_arg error.
I'm also very suspicious of a non-deterministic error here. I wouldn't be surprised if its a bug in nodejs. If you're trying to reproduce it outside k8s, check you're using the exact same version of nodejs in both places.
I followed your recommendations and divide getWrapped into two functions.
The stack trace below is taken from a new build:
Error:
Invalid arguments
---------
Invalid argument
---------
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z16create_backtracev+0x27) [0x7f0d002b46c1]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z14create_tracingP10napi_env__PKc+0x26) [0x7f0d002b4876]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z15throw_if_not_okP10napi_env__11napi_status+0x58) [0x7f0d002b4993]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z6unwrapP10napi_env__P12napi_value__+0x37) [0x7f0d002b4bdd]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(_Z10getWrappedP10napi_env__P20napi_callback_info__+0x73) [0x7f0d002b4c6f]
/opt/app/node_modules/foundationdb/build/Debug/fdblib.node(+0x56d3) [0x7f0d002af6d3]
node() [0xaabfcd]
node() [0xd4a35e]
node(_ZN2v88internal21Builtin_HandleApiCallEiPmPNS0_7IsolateE+0xaf) [0xd4b77f]
node() [0x15e78f9]
at Database.rawCreateTransaction (node_modules/foundationdb/dist/lib/database.js:55:51)
at Database.doTn (node_modules/foundationdb/dist/lib/database.js:39:21)
at Database.doTransaction (node_modules/foundationdb/dist/lib/database.js:43:21)
at Object.doTransaction (node_modules/@deeplay/foundationdb-event-store-core/lib/utils/doTransaction.js:13:10)
at Object.doIdempotentTransaction (node_modules/@deeplay/foundationdb-event-store-core/lib/index.js:68:94)
at writeTransition (build/table/writeTransition.js:26:37)
at writeResult.baseMs (build/table/doTask.js:30:119)
at retry (node_modules/abort-controller-x/lib/retry.js:13:26)
at doTask (build/table/doTask.js:30:66)
at build/table/index.js:129:62
This stack trace is taken in a similar way.
I'm still in the progress of creating a setup for stable reproduction