
RangeError: Invalid array length

Open • sneko opened this issue 1 year ago • 2 comments

Hi @uhop ,

I'm trying to write a file from a big nested object; here is what I'm using:

import * as fsSync from 'node:fs';
import { Readable } from 'node:stream';
import { chain } from 'stream-chain';
import { disassembler } from 'stream-json/Disassembler';
import { stringer } from 'stream-json/Stringer';

export async function writeBigJsonFile(filePath: string, jsonObject: object): Promise<void> {
  return await new Promise((resolve, reject) => {
    // Single-shot readable: the whole object is pushed below, so read() is a no-op.
    const jsonStream = new Readable({ objectMode: true, read() {} });

    const pipeline = chain([
      jsonStream,
      disassembler(),
      stringer(),
      fsSync.createWriteStream(filePath, {
        encoding: 'utf-8',
      }),
    ]);

    pipeline.once('finish', () => {
      resolve();
    });

    pipeline.once('error', (error) => {
      reject(error);
    });

    jsonStream.push(jsonObject);
    jsonStream.push(null);
  });
}

Unfortunately I get the following error:

node:internal/streams/readable:570
      state.buffer.push(chunk);
                   ^

RangeError: Invalid array length
    at Array.push (<anonymous>)
    at addChunk (node:internal/streams/readable:570:20)
    at readableAddChunkPushObjectMode (node:internal/streams/readable:536:3)
    at Readable.push (node:internal/streams/readable:391:5)
    at Disassembler._transform (/Users/sneko/Documents/beta.gouv.fr/repos/figpot/node_modules/stream-json/Disassembler.js:62:40)
    at Transform._write (node:internal/streams/transform:171:8)
    at writeOrBuffer (node:internal/streams/writable:564:12)
    at _write (node:internal/streams/writable:493:10)
    at Writable.write (node:internal/streams/writable:502:10)
    at Readable.ondata (node:internal/streams/readable:1007:22)

Node.js v20.15.0

Are you aware of a way to get past this?

Note:

  • it works perfectly for a small object
  • the current object is around ~600 MB once written to a file

EDIT: I guess this is due to the array limit being hit because of the depth of the object... Using https://www.npmjs.com/package/json-stream-stringify to write the file works. Since it does not allow transformers while streaming, I guess that's why it does not keep something like a path array that would break.
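
For reference, roughly what that workaround looks like (the JsonStreamStringify class and its one-argument constructor are taken from that package's README; treat the exact import shape as an assumption):

import * as fsSync from 'node:fs';
import { pipeline } from 'node:stream/promises';
// Assumed export name, per the json-stream-stringify README.
import { JsonStreamStringify } from 'json-stream-stringify';

export async function writeBigJsonFile(filePath: string, jsonObject: object): Promise<void> {
  // JsonStreamStringify is a Readable that emits the serialized JSON in small chunks,
  // so the file stream's backpressure is respected end to end.
  await pipeline(
    new JsonStreamStringify(jsonObject),
    fsSync.createWriteStream(filePath, { encoding: 'utf-8' })
  );
}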

sneko • Jul 22 '24 09:07

By default, disassembler() and stringer() can work with both streamed and packed values. See Disassembler and Stringer for more details. For big data it is better to stick to one rather than using both. It looks like you hit this restriction.

I am not sure why you need jsonStream — you can feed the pipeline directly. Excluding it will reduce the delay.
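
A minimal sketch of that suggestion, assuming the same imports as the original snippet (chain() returns a Duplex whose writable side is its first element, so the object can be written to the pipeline directly):

import * as fsSync from 'node:fs';
import { chain } from 'stream-chain';
import { disassembler } from 'stream-json/Disassembler';
import { stringer } from 'stream-json/Stringer';

export async function writeBigJsonFile(filePath: string, jsonObject: object): Promise<void> {
  return await new Promise((resolve, reject) => {
    const fileStream = fsSync.createWriteStream(filePath, { encoding: 'utf-8' });

    // No intermediate Readable: writes to the chain go straight to disassembler().
    const pipeline = chain([disassembler(), stringer(), fileStream]);

    fileStream.once('finish', () => resolve()); // resolve once the file is fully flushed
    pipeline.once('error', (error) => reject(error));

    pipeline.write(jsonObject);
    pipeline.end();
  });
}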

Having said that, I never expected disassembler() to be used for such huge objects. The problem appears in the guts of Node streams, and it is likely because disassembler() overflows that stream without ever checking whether it needs to flush its content before continuing. I'll try to come up with a solution for that.
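
For context, the generic pattern being described (an illustration only, not the actual fix referenced in the next comment) is an object-mode Transform that produces many output chunks per input, stops pushing as soon as push() reports a full readable buffer, and resumes when Node calls _read() because the consumer has drained it:

import { Transform, TransformCallback } from 'node:stream';

// Illustrative only: flattens one input array into many output chunks
// while respecting the readable side's backpressure.
class BackpressuredFlattener extends Transform {
  private queue: unknown[] = [];
  private pending: TransformCallback | null = null;

  constructor() {
    super({ objectMode: true });
  }

  _transform(chunk: unknown[], _enc: BufferEncoding, callback: TransformCallback): void {
    this.queue = chunk.slice(); // many outputs produced from a single input
    this.pending = callback;
    this.flushQueue();
  }

  _read(size: number): void {
    this.flushQueue(); // the consumer wants more data: keep emitting
    super._read(size);
  }

  private flushQueue(): void {
    while (this.queue.length > 0) {
      if (!this.push(this.queue.shift())) return; // buffer full: wait for _read()
    }
    if (this.pending) {
      const done = this.pending; // only now ask for the next input chunk
      this.pending = null;
      done();
    }
  }
}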

uhop • Aug 26 '24 06:08

This is the new implementation: 6165e1e739126d4bc316d45885c26fc035e10e33

Please give it a try to see if it works for you before the release.

uhop • Aug 26 '24 18:08

Published as 1.9.0.

uhop • Nov 12 '24 16:11