sheetjs

How to read a huge Excel file without loading the entire file into memory, and append new rows?

Open aislanmaia opened this issue 3 years ago • 6 comments

I need to append new rows to a huge Excel file. How can I do it without reading the entire file into memory, which can potentially slow down the process and cause a heap out-of-memory error?

aislanmaia avatar Jun 09 '22 00:06 aislanmaia

I am also struggling with the same issue, but in my case it is a CSV file.

farideliyev avatar Jun 10 '22 21:06 farideliyev

I stream the Excel file to CSV and handle the CSV file using a pipeline. This is my way. It can append data items anywhere.
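The implementation behind this comment is not shown in the thread. As a minimal sketch of the CSV side only, assuming Node.js and plain rows of values, new rows can be appended to the end of a CSV file without reading the existing contents. The helper names `csvField`, `needsLeadingNewline` and `appendCsvRows` are hypothetical and exist only for this illustration.

```javascript
const fs = require("fs");

// Quote a field when it contains a comma, a double quote, or a line break.
function csvField(value) {
  const s = value == null ? "" : String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// True when the file is non-empty and does not end with "\n".
// Only the final byte is read, never the whole file.
function needsLeadingNewline(path) {
  if (!fs.existsSync(path)) return false;
  const size = fs.statSync(path).size;
  if (size === 0) return false;
  const fd = fs.openSync(path, "r");
  try {
    const last = Buffer.alloc(1);
    fs.readSync(fd, last, 0, 1, size - 1);
    return last[0] !== 0x0a;
  } finally {
    fs.closeSync(fd);
  }
}

// Append rows (arrays of values) to the end of a CSV file.
function appendCsvRows(path, rows) {
  if (rows.length === 0) return;
  const body = rows.map((row) => row.map(csvField).join(",")).join("\n") + "\n";
  fs.appendFileSync(path, (needsLeadingNewline(path) ? "\n" : "") + body);
}
```

Memory use depends on the size of the appended rows, not on the size of the existing file. This only covers appending at the end; inserting rows in the middle of the file would need a streaming rewrite instead.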

HappyFerry avatar Jun 16 '22 04:06 HappyFerry

@HappyFerry could you please share your implementation?

farideliyev avatar Jun 16 '22 10:06 farideliyev

Yeah, @HappyFerry I'd be highly interested in how you stream Excel to CSV, especially with large Excel files. Would be very grateful if you shared some details.

flaushi avatar Sep 03 '22 06:09 flaushi

This library, and every other JS library, is limited to the browser APIs in the frontend and limited to platform APIs on the backend.

XLSX is a ZIP-based file format and cannot be incrementally processed with a streaming data source for structural reasons as described in https://docs.sheetjs.com/docs/solutions/input#example-readable-streams . ZIP was designed for streaming write and assumes readers have random access.
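To illustrate the random-access point: a ZIP archive's index (the central directory) is located through an End of Central Directory record stored at the very end of the file, so a reader typically seeks to the tail before it can find any entry. The following is a rough sketch, assuming Node.js and a classic (non-ZIP64) archive; `parseEocd` and `readEocd` are hypothetical helper names and not part of SheetJS.

```javascript
const fs = require("fs");

const EOCD_SIGNATURE = 0x06054b50; // bytes "PK\x05\x06" read as a little-endian uint32
const EOCD_MIN_SIZE = 22;          // fixed-size part of the record
const MAX_COMMENT = 0xffff;        // the trailing comment length is a 16-bit field

// Scan backwards through the tail of a ZIP file for the End of Central Directory record.
function parseEocd(tail) {
  for (let i = tail.length - EOCD_MIN_SIZE; i >= 0; i--) {
    if (tail.readUInt32LE(i) !== EOCD_SIGNATURE) continue;
    return {
      totalEntries: tail.readUInt16LE(i + 10),
      centralDirectorySize: tail.readUInt32LE(i + 12),
      centralDirectoryOffset: tail.readUInt32LE(i + 16),
    };
  }
  return null; // no record found: not a ZIP file, or truncated
}

// Read only the last bytes of the file. This positioned read is the
// step a forward-only stream cannot perform.
function readEocd(path) {
  const size = fs.statSync(path).size;
  const length = Math.min(size, EOCD_MIN_SIZE + MAX_COMMENT);
  const tail = Buffer.alloc(length);
  const fd = fs.openSync(path, "r");
  try {
    fs.readSync(fd, tail, 0, length, size - length);
  } finally {
    fs.closeSync(fd);
  }
  return parseEocd(tail);
}
```

The sketch does not validate the comment length or handle ZIP64 archives; it is only meant to show why the reader needs a seekable source.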

That means, to first order, the browser must retain a seekable copy of the original file. (In NodeJS there is a way to avoid this using child_process to extract the ZIP file to the filesystem using the unzip command-line tool.)
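A minimal sketch of the NodeJS workaround mentioned above, assuming the `unzip` command-line tool is installed and on the PATH. The function names `extractWorkbook` and `listWorksheetFiles` are hypothetical, and the `xl/worksheets` location is the layout of a typical XLSX file rather than a guarantee.

```javascript
const fs = require("fs");
const path = require("path");
const { execFileSync } = require("child_process");

// Extract every ZIP entry of the workbook into outDir using the system `unzip` tool.
// -o overwrites existing files without prompting, -q keeps the output quiet.
function extractWorkbook(xlsxPath, outDir) {
  fs.mkdirSync(outDir, { recursive: true });
  execFileSync("unzip", ["-o", "-q", xlsxPath, "-d", outDir]);
}

// List the extracted worksheet XML files (assumed to live under xl/worksheets).
function listWorksheetFiles(outDir) {
  const dir = path.join(outDir, "xl", "worksheets");
  if (!fs.existsSync(dir)) return [];
  return fs
    .readdirSync(dir)
    .filter((name) => name.endsWith(".xml"))
    .sort()
    .map((name) => path.join(dir, name));
}
```

After extraction, each worksheet file can be read in chunks with `fs.createReadStream`, so the Node process does not have to hold the original ZIP in memory.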

Assuming retaining the original file in memory is not an issue, that specific workflow (reading a file and appending rows) can be optimized at the expense of being able to extract data in the process. Our Pro Edit build can skip the worksheet generation and surgically edit the raw XML.

If there is interest in a NodeJS-specific large file processor, please let us know.

SheetJSDev avatar Sep 03 '22 19:09 SheetJSDev

Save the file in XLSX or CSV format, then convert it to a binary sheet format so that the huge data file can be accessed more easily.

RameshDhanapal2022 avatar Oct 13 '22 10:10 RameshDhanapal2022