binary-parser
RangeError: Invalid array length
I am trying to parse a float array. Normally this code works, but I now have one huge file (400 MB) that I want to start reading.
const dataParser = new Parser()
.array("data", {
type: "floatle",
length: dataLength // 82,272,642
})
.saveOffset('dataLength');
const data = dataParser.parse(buffer);
As you can see, I am trying to parse an array with about 82 million entries, which is well below JavaScript's maximum array length of 4294967295 (2^32 - 1). However, I am getting the following error:
Uncaught (in promise) RangeError: Invalid array length
    at Array.push (<anonymous>)
    at Parser.eval [as compiled]
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Errors/Invalid_array_length
Additional information
Manual experimentation finds that the limit is somewhere between 50,100,000 and 50,500,000.
(related to https://github.com/sign/translate/issues/44)
Thanks for reporting. I've never really created such huge arrays in JS.
Could you test whether you can create an array of the same size (82 million elements) without using binary-parser? Your parser compiles to the following code, and I don't see anything suspicious that would bloat the memory footprint beyond what is expected.
var dataView = new DataView(buffer.buffer, buffer.byteOffset, buffer.length);
var offset = 0;
var vars = {};
vars.data = [];
for (var $tmp0 = 82272642; $tmp0 > 0; $tmp0--) {
var $tmp1 = dataView.getFloat32(offset, true);
offset += 4;
vars.data.push($tmp1);
}
vars.dataLength = offset
return vars;
This too fails, on the vars.data.push line.
If I add a log before that push, I get that the current vars.data.length is 50139473
Code that works:
data.data = new Float32Array(82272642);
for (var $tmp0 = 0; $tmp0 < 82272642; $tmp0++) {
var $tmp1 = dataView.getFloat32(offset, true);
offset += 4;
data.data[$tmp0] = $tmp1
}
data.dataLength = offset
If I preallocate the Float32Array, it works fine. It also never needs to reallocate.
By the way, for an array of 27,424,214 elements, this method is 8 times faster than the regular parsing.
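The preallocation idea above can be pulled out into a small standalone function. This is only a sketch: `readFloat32LE` is a hypothetical helper name, not part of binary-parser, and it assumes the input is a `Uint8Array` (or a Node.js `Buffer`) holding little-endian 32-bit floats.

```javascript
// Hypothetical helper (not a binary-parser API): reads `count` little-endian
// float32 values from `bytes`, starting `start` bytes into it.
function readFloat32LE(bytes, start, count) {
  // The DataView is bounded to exactly the bytes we intend to read.
  const view = new DataView(bytes.buffer, bytes.byteOffset + start, count * 4);
  const out = new Float32Array(count); // allocated once, never grows
  for (let i = 0; i < count; i++) {
    out[i] = view.getFloat32(i * 4, true); // true = little-endian
  }
  return out;
}

// Usage: three floats preceded by one byte of unrelated header data.
const bytes = new Uint8Array(1 + 3 * 4);
const writer = new DataView(bytes.buffer);
[1.5, -2.25, 3].forEach((v, i) => writer.setFloat32(1 + i * 4, v, true));
const floats = readFloat32LE(bytes, 1, 3);
console.log(Array.from(floats));
```

Because the output is sized up front, the loop writes by index instead of calling `push`, so the array never has to grow while parsing.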
Ok, that makes sense. Reallocs are definitely an overhead, and I guess typed arrays are more compact than normal arrays. But this approach would only work for fixed-length arrays of primitive types. Is that what you are parsing?
Yes, the largest arrays that I parse are indeed of fixed sizes (as in, I specify length to be parsed).
The normal behavior is still good for short arrays, I'd imagine.
It turns out you can directly create a Float32Array from an ArrayBuffer (zero copy). https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Float32Array/Float32Array
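To illustrate what "zero copy" means here, the following sketch builds a `Float32Array` view over part of an `ArrayBuffer` and shows that both views share the same memory. The variable names are illustrative only. Note that the constructor requires the byte offset to be a multiple of 4 and throws a `RangeError` otherwise.

```javascript
const backing = new ArrayBuffer(16); // room for four float32 values

// View of two elements starting at byte 4. No data is copied.
const window32 = new Float32Array(backing, 4, 2);
window32[0] = 0.5;

// A second view over the whole buffer sees the same write at index 1.
const whole = new Float32Array(backing);
const sharedValue = whole[1];

// A byte offset that is not a multiple of 4 is rejected.
let misalignedError = null;
try {
  new Float32Array(backing, 1, 2);
} catch (err) {
  misalignedError = err;
}
console.log(sharedValue, misalignedError instanceof RangeError);
```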
Can you try if the following works?
const dataParser = new Parser()
.buffer("data", {
length: 82272642 * 4, // length in bytes
formatter: (buf) => new Float32Array(buf.buffer) // buf is a DataView
})
.saveOffset('dataLength');
Sorry, I missed your previous message.
If I do what you wrote and add a console.log:
formatter: (buf) => {
console.log(buf);
return new Float32Array(buf.buffer)
}
In the console I see that buf is a Uint8Array, and an error:
Uncaught (in promise) RangeError: byte length of Float32Array should be a multiple of 4
because buf.buffer has an odd number of bytes

(For completeness' sake: this is not the original large file, just a small-scale test using length: 26578 * 4 and a seek of 1925 to get to the right place in the file.)
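One possible way around this error, sketched under a few assumptions: the formatter receives a `Uint8Array` (as the log above shows), the data is little-endian, and the platform is little-endian too (a `Float32Array` always uses the platform byte order). `toFloat32Array` is a hypothetical helper, not a binary-parser API. It uses the view's own `byteOffset` and `byteLength` instead of the whole underlying `buf.buffer`, and falls back to a copy when the offset is not 4-byte aligned, as with the seek of 1925.

```javascript
// Hypothetical helper: converts a byte view into a Float32Array.
// Assumes little-endian data on a little-endian platform.
function toFloat32Array(bytes) {
  if (bytes.byteLength % 4 !== 0) {
    throw new RangeError("byte length must be a multiple of 4");
  }
  if (bytes.byteOffset % 4 === 0) {
    // Aligned: reuse the same memory, restricted to this view's window.
    return new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4);
  }
  // Misaligned: copy the bytes into a fresh, aligned ArrayBuffer.
  return new Float32Array(new Uint8Array(bytes).buffer);
}

// Usage: two floats stored after a 1-byte header, so the view is misaligned.
const source = new ArrayBuffer(1 + 2 * 4);
const dv = new DataView(source);
dv.setFloat32(1, 1.5, true);
dv.setFloat32(5, -2.25, true);
const misaligned = new Uint8Array(source, 1, 8);
const parsed = toFloat32Array(misaligned);
console.log(Array.from(parsed));
```

In the parser above, such a function could presumably be passed as `formatter: toFloat32Array`. The copy in the misaligned case costs one extra allocation of the array's size, but it still avoids growing a regular array element by element.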
Hi @keichi, is there any plan to support this in the library? If not, I'll use the custom solution, but it would be nice to at least catch this type of error and point people to this issue or to a fix.
I tried to write a test for it, but the test passes, so I think contributing here is out of my league.
describe('Large arrays', () => {
it('should parse large array without error', () => {
const length = 80_000_000;
const array = Buffer.from(new Float32Array(length).fill(0).buffer);
const parser = new Parser()
.array("data", {
type: "floatle",
length
});
const buffer = factory(array);
doesNotThrow(() => parser.parse(buffer));
})
})