One-line large CSV file is not chunked
The input file consists of a single very long line/row of the form abc1,abc2,abc3,...,abc10000000. The file is about 200 MB.
Here is the parsing code:
Papa.LocalChunkSize = 1024 * 1024; // 1 MB

const onUploadCsv = (event: React.ChangeEvent<HTMLInputElement>): void => {
    const file = event.target.files?.[0];
    if (!file) {
        return;
    }
    Papa.parse(file, {
        worker: false,
        header: false,
        beforeFirstChunk: chunk => {
            console.log('Before first chunk callback chunk: ' + chunk);
            return chunk;
        },
        chunk: (results, parser) => {
            console.log(
                'Chunk callback results data length: ' + results.data.length,
            );
            console.log('Chunk callback results data: ' + results.data);
        },
        complete: () => {
            console.log('Complete!');
        },
    });
    // Reset the value so the same file can be uploaded again.
    event.target.value = '';
};
The result is that beforeFirstChunk executes as intended, but the chunk callback returns no data on each iteration and then returns the whole line in the final iteration, which defeats the purpose of chunking/streaming. For multi-line/multi-row files everything works as intended.
Could someone please explain this behaviour: is this format unsupported, or is there a bug in the library?
Interesting. Chunking is not designed to work on a single line, and I think supporting it would require too many breaking API changes. It also seems like a fairly unreasonable thing to support. I think a good tradeoff for this library, in the service of being performant and accessible, is to require that no individual row exceeds available memory.
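For anyone hitting this in practice: since the parser only emits rows once it sees a row delimiter, a one-line file is buffered until EOF. If your data really is a single line of comma-separated values, one workaround is to slice the Blob yourself and split on commas, carrying the trailing partial field over to the next slice. This is a sketch outside of PapaParse, not library API; `parseSingleLineCsv` is a hypothetical helper, and the naive `split(',')` does not handle quoted fields containing commas.

```typescript
// Hypothetical workaround: stream a single-line CSV by slicing the Blob
// and carrying the possibly-incomplete last field into the next slice.
async function* parseSingleLineCsv(
    blob: Blob,
    chunkSize = 1024 * 1024, // 1 MB slices, mirroring Papa.LocalChunkSize
): AsyncGenerator<string[]> {
    let carry = '';
    for (let offset = 0; offset < blob.size; offset += chunkSize) {
        const text = await blob.slice(offset, offset + chunkSize).text();
        const parts = (carry + text).split(',');
        // The last piece may be cut mid-field, so hold it back for now.
        carry = parts.pop() ?? '';
        if (parts.length > 0) {
            yield parts;
        }
    }
    if (carry.length > 0) {
        yield [carry]; // flush the final field at EOF
    }
}
```

Usage with the uploaded file object would look like `for await (const fields of parseSingleLineCsv(file)) { ... }`, so memory stays bounded by the slice size plus one field rather than the whole 200 MB line.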