bigint-buffer icon indicating copy to clipboard operation
bigint-buffer copied to clipboard

mocha failure

Open krasic opened this issue 6 years ago • 7 comments

Hi. I am trying to change parquetjs to support 64 bit integers using bigint-buffer. The change seems straightforward and works in client code I have. However, I want to update parquetjs tests to reflect my change, and when I try to run their test I see this failure:

`$ npm test

[email protected] test /home/***/parquetjs mocha

ParquetCodec::PLAIN ✓ should encode BOOLEAN values ✓ should decode BOOLEAN values ✓ should encode INT32 values ✓ should decode INT32 values node: ../src/bigint-buffer.c:132: fromBigInt: Assertion status == napi_ok' failed. Aborted (core dumped) I gather this is something related to native modules, but I'm a n00b in that regard. Any ideas?

krasic avatar Jan 06 '20 19:01 krasic

Thanks for reporting this. Do you have a fork with the changes you've made? I could take a look into it.

no2chem avatar Jan 06 '20 19:01 no2chem

PR here: Use bigint for int64

krasic avatar Jan 06 '20 19:01 krasic

Hm, I see the issue.

Your call to toBufferLE passes a number instead of a bigint: https://github.com/ZJONSSON/parquetjs/pull/37/files#diff-b67720d607909e3cae702d746c5bf6f0R53

bigint-buffer doesn't know how to handle numbers so it errors out. moreover, you're passing an array of numbers, so it really doesn't know how to handle this.

There's two fixes needed for this:

one is that we need to handle numbers as well as bigints. second, is that we need to be able to write into a buffer you've allocated. This would eliminate the need for copy and allow you to iterate over the entire array, resulting in multiple n-api transitions.

I'd be happy to look into this, though I'm not sure if I can give you an ETA.

no2chem avatar Jan 07 '20 04:01 no2chem

@krasic - also, I'm interested in what you're trying to do with parquet. if you're looking for performance, it would seem that implementing native bindings would be useful, though I don't know if that's compatible with parquetjs's goals.

no2chem avatar Jan 07 '20 04:01 no2chem

@krasic sorry for the spam, but one more thing: it looks like node 12 has now implemented Buffer.writeBigInt64BE|LE, which probably implements all the functionality you need. https://nodejs.org/api/buffer.html#buffer_buf_writebigint64be_value_offset

It does seem like the underlying implementation is in javascript though, https://github.com/nodejs/node/blob/e71a0f4d5faa4ad77887fbb3fff0ddb7bca6942e/lib/internal/buffer.js#L590

So it'll be interesting to benchmark to see what the performance difference is.

no2chem avatar Jan 07 '20 04:01 no2chem

On Mon, Jan 6, 2020 at 8:17 PM Michael Wei [email protected] wrote:

@krasic https://github.com/krasic - also, I'm interested in what you're trying to do with parquet. if you're looking for performance, it would seem that implementing native bindings would be useful, though I don't know if that's compatible with parquet.

The issue I am trying to address is that parquetjs fails reading some files I have that are generated by an apache pig based the job, because those files contain 64 bit integer values. I believe incomplete 64 bit integer support is a known issue with parquetjs.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/no2chem/bigint-buffer/issues/21?email_source=notifications&email_token=ABRABBU3R6KMXJYNREW7I53Q4P66ZA5CNFSM4KDI3FT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIHUKTQ#issuecomment-571426126, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABRABBT3UCWX5GKOLF7LE2DQ4P66ZANCNFSM4KDI3FTQ .

krasic avatar Jan 07 '20 17:01 krasic

On Mon, Jan 6, 2020 at 8:23 PM Michael Wei [email protected] wrote:

@krasic https://github.com/krasic sorry for the spam, but one more thing: it looks like node 12 has now implemented Buffer.writeBigInt64BE|LE, which probably implements all the functionality you need. https://nodejs.org/api/buffer.html#buffer_buf_writebigint64be_value_offset

It does seem like the underlying implementation is in javascript though,

https://github.com/nodejs/node/blob/e71a0f4d5faa4ad77887fbb3fff0ddb7bca6942e/lib/internal/buffer.js#L590

So it'll be interesting to benchmark to see what the performance difference is.

Ah. I think that code may be the most promising solution for me. My use case isn't performance critical.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/no2chem/bigint-buffer/issues/21?email_source=notifications&email_token=ABRABBWAQG4ZXYYYETC5OKTQ4P7SPA5CNFSM4KDI3FT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIHUS4A#issuecomment-571427184, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABRABBS6QOCA66LFPB6FAILQ4P7SPANCNFSM4KDI3FTQ .

krasic avatar Jan 07 '20 17:01 krasic