binary-parser

RangeError: Invalid array length

Open AmitMY opened this issue 3 years ago • 7 comments

I am trying to parse a float array. Normally this code works, but I now have one huge file (400MB) that I want to start reading.

    const dataParser = new Parser()
        .array("data", {
            type: "floatle",
            length: dataLength // 82,272,642
        })
        .saveOffset('dataLength');

    const data = dataParser.parse(buffer);

As you can see, I am trying to parse an array with 82 million entries, which is less than the 2147483647 limit in JavaScript. However, I am getting the following error:

Uncaught (in promise) RangeError: Invalid array length at Array.push () at Parser.eval [as compiled]

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Errors/Invalid_array_length

Additional information

Manual experimentation finds that the limit is somewhere between 50,100,000 and 50,500,000.

(related to https://github.com/sign/translate/issues/44)

AmitMY, Jun 14 '22 14:06

Thanks for reporting. I've never really created such huge arrays in JS.

Could you test whether you can create an array of the same size (82 million elements) without using binary-parser? Your parser compiles to the following code, and I don't see anything suspicious that would bloat the memory footprint beyond what is expected.

    var dataView = new DataView(buffer.buffer, buffer.byteOffset, buffer.length);
    var offset = 0;
    var vars = {};

    vars.data = [];
    for (var $tmp0 = 82272642; $tmp0 > 0; $tmp0--) {
        var $tmp1 = dataView.getFloat32(offset, true);
        offset += 4;
        vars.data.push($tmp1);
    }
    vars.dataLength = offset;

    return vars;

keichi, Jun 14 '22 14:06

This fails too, on the vars.data.push line. If I add a log before that push, it shows that vars.data.length is 50139473 at the point of failure.

Code that works:

    data.data = new Float32Array(82272642); // preallocate once, so no reallocs
    for (var $tmp0 = 0; $tmp0 < 82272642; $tmp0++) {
        var $tmp1 = dataView.getFloat32(offset, true);
        offset += 4;
        data.data[$tmp0] = $tmp1;
    }
    data.dataLength = offset;

If I preallocate the Float32Array, it works fine. It also never needs to realloc.

By the way, this method is 8 times faster than the regular parsing for an array of size 27,424,214.
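
The difference between the two loops can be reproduced at a small scale. The sketch below is illustrative only: readFloatsWithPush and readFloatsPreallocated are hypothetical helpers, not binary-parser APIs, and the 1000-element size is arbitrary.

```javascript
// Hypothetical helper mirroring the generated code: grows a plain Array with push().
function readFloatsWithPush(dataView, count) {
  const out = [];
  let offset = 0;
  for (let i = 0; i < count; i++) {
    out.push(dataView.getFloat32(offset, true)); // true = little-endian
    offset += 4;
  }
  return out;
}

// Hypothetical helper mirroring the workaround: allocates the result once, up front.
function readFloatsPreallocated(dataView, count) {
  const out = new Float32Array(count);
  let offset = 0;
  for (let i = 0; i < count; i++) {
    out[i] = dataView.getFloat32(offset, true);
    offset += 4;
  }
  return out;
}

// Small demo input: 1000 little-endian floats (0, 0.5, 1, ...).
const demoCount = 1000;
const demoView = new DataView(new ArrayBuffer(demoCount * 4));
for (let i = 0; i < demoCount; i++) {
  demoView.setFloat32(i * 4, i * 0.5, true);
}

const pushed = readFloatsWithPush(demoView, demoCount);
const preallocated = readFloatsPreallocated(demoView, demoCount);
```

Both helpers yield the same values; only the preallocated version avoids growing the result while it is being filled, which is the part that matters for arrays with tens of millions of elements.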

AmitMY, Jun 14 '22 14:06

Ok, that makes sense. Reallocs are definitely an overhead, and I guess typed arrays are more compact than normal arrays. But this approach would only work for fixed-length arrays of primitive types. Is that what you are parsing?

keichi, Jun 14 '22 14:06

Yes, the largest arrays that I parse are indeed of fixed sizes (as in, I specify length to be parsed). The normal behavior is still good for short arrays, I'd imagine.

AmitMY, Jun 14 '22 15:06

It turns out you can directly create a Float32Array from an ArrayBuffer (zero copy). https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Float32Array/Float32Array

Could you check whether the following works?

    const dataParser = new Parser()
        .buffer("data", {
            length: 82272642 * 4, // length in bytes
            formatter: (buf) => new Float32Array(buf.buffer) // buf is a DataView
        })
        .saveOffset('dataLength');
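
To illustrate what "zero copy" means here, a minimal sketch using only the standard typed-array constructors (binary-parser is not involved, and the variable names are made up for the example). A Float32Array constructed over an existing ArrayBuffer is a view onto the same memory, so no elements are copied:

```javascript
// 16 bytes of raw storage, enough for four 32-bit floats.
const raw = new ArrayBuffer(16);

// View the whole buffer as floats; no element data is copied.
const floats = new Float32Array(raw);

// A second view over the last two floats: byte offset 8, length 2 elements.
const tail = new Float32Array(raw, 8, 2);

// Writes through one view are visible through the other,
// because both share the same underlying ArrayBuffer.
floats[2] = 1.5;
floats[3] = -4;
```

The constructor requires the byte offset to be a multiple of 4 and, when no length is given, the remaining byte length to be a multiple of 4 as well; otherwise it throws a RangeError.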

keichi, Jun 14 '22 15:06

Sorry, I missed your previous message.

If I do what you wrote and add a console.log:

            formatter: (buf) => {
                console.log(buf);
                return new Float32Array(buf.buffer)
            } 

In the console I see that buf is a Uint8Array, and an error:

Uncaught (in promise) RangeError: byte length of Float32Array should be a multiple of 4

because buf.buffer has an odd number of bytes.

(For completeness' sake, this is not the original large file, just a small-scale test using length: 26578 * 4 and a seek of 1925 to get to the right place in this file.)
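
A likely explanation for the error is that the .buffer property of a Uint8Array is always the entire underlying ArrayBuffer, not just the slice that the view covers, so its byte length here is that of the whole file. The sketch below shows one possible way around it. toFloat32Array is a hypothetical helper, not part of binary-parser. It reuses the memory when the view starts on a 4-byte boundary and otherwise falls back to a single copy, which would be needed for an offset such as 1925. It also assumes the host byte order matches the file (little-endian on most current hardware), since Float32Array always uses the platform byte order.

```javascript
// Hypothetical helper: reinterpret the bytes covered by a Uint8Array as 32-bit floats.
function toFloat32Array(bytes) {
  if (bytes.byteLength % 4 !== 0) {
    throw new RangeError("byte length must be a multiple of 4");
  }
  const count = bytes.byteLength / 4;

  if (bytes.byteOffset % 4 === 0) {
    // Aligned: create a view on the same memory, restricted to this slice.
    return new Float32Array(bytes.buffer, bytes.byteOffset, count);
  }

  // Unaligned: new Uint8Array(typedArray) copies the bytes into a fresh
  // buffer that starts at offset 0, which is always aligned.
  const copy = new Uint8Array(bytes);
  return new Float32Array(copy.buffer, 0, count);
}

// Demo: three floats stored at the unaligned byte offset 1925 of a larger buffer.
const fileBytes = new Uint8Array(1925 + 12 + 1); // odd total length, like the report
const sample = new Float32Array([1.5, -2, 3.25]);
fileBytes.set(new Uint8Array(sample.buffer), 1925);

const unalignedSlice = new Uint8Array(fileBytes.buffer, 1925, 12);
const decoded = toFloat32Array(unalignedSlice);
```

With binary-parser, such a helper could presumably be passed as the formatter in place of `(buf) => new Float32Array(buf.buffer)`, though that has not been verified against the library here.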

AmitMY, Jun 16 '22 19:06

Hi @keichi, is there any plan to support this in the library? If not, I'll use the custom solution, but it would be nice to at least catch this type of error and point people to this issue or to some fix.
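
As a rough idea of what catching this error could look like from the caller's side, here is a minimal sketch. parseWithLargeArrayHint is a hypothetical wrapper, not an existing binary-parser feature; it takes any parse function, and the hint text is only a suggestion.

```javascript
// Hypothetical wrapper: rethrows "Invalid array length" with a more helpful message.
function parseWithLargeArrayHint(parse, input) {
  try {
    return parse(input);
  } catch (err) {
    // The message casing differs between engines, hence the case-insensitive match.
    if (err instanceof RangeError && /invalid array length/i.test(err.message)) {
      throw new RangeError(
        err.message +
          " (the parsed array may be too large for a plain Array;" +
          " consider reading it into a preallocated typed array instead)",
        { cause: err }
      );
    }
    throw err; // unrelated errors pass through unchanged
  }
}
```

A caller would use it as `parseWithLargeArrayHint((buf) => dataParser.parse(buf), buffer)`.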


I tried to write a test for it, but it passes, so I think contributing a fix here is out of my league.

    describe('Large arrays', () => {
      it('should parse large array without error', () => {
        const length = 80_000_000;
        const array = Buffer.from(new Float32Array(length).fill(0).buffer);

        const parser = new Parser()
          .array("data", {
            type: "floatle",
            length
          });

        const buffer = factory(array);
        doesNotThrow(() => parser.parse(buffer));
      });
    });

AmitMY, Jun 24 '22 12:06