xlsx-stream-reader
Question about the workSheetReader.abort() and workSheetReader.skip() functions
Is my understanding of workSheetReader.abort() and workSheetReader.skip() correct?
Processing of a worksheet should stop as soon as workSheetReader.abort() is called, and the reader should then move on to the next sheet. If all remaining sheets are skipped by calling workSheetReader.skip(), processing of the Excel file should stop right there and the workBookReader.on('end') event should be triggered.
If that understanding is correct, then workSheetReader.abort() and workSheetReader.skip() are not working as expected.
I have a very large file (300 MB) and I am reading only the first sheet, like this:
if (workSheetReader.id > 1) {
  workSheetReader.skip();
  return;
}
And I am trying to read only the header row with this code:
workSheetReader.on('row', function (row) {
  if (row.attributes.r == 1) {
    // do something with row 1, like saving it as the column names
  } else {
    workSheetReader.abort();
  }
});
The code stops only after a good 10-15 minutes, so I assume it is processing all the other rows, and the other sheets as well, before it stops.
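The assumption above can be illustrated with a small model that does not depend on xlsx-stream-reader at all. This is only a sketch: `RowSource`, `readHeaderIgnoringRest` and `readHeaderAndStop` are hypothetical names, and the model makes no claim about how the library implements abort() internally. It only shows the difference between a consumer that ignores rows after the header and a source that actually stops producing them.

```javascript
const { EventEmitter } = require('events');

// Hypothetical row producer (not part of xlsx-stream-reader).
// It checks the `stopped` flag before producing each row.
class RowSource extends EventEmitter {
  constructor(total) {
    super();
    this.total = total;
    this.produced = 0;
    this.stopped = false;
  }

  stop() {
    this.stopped = true;
  }

  run() {
    for (let r = 1; r <= this.total && !this.stopped; r++) {
      this.produced += 1;
      this.emit('row', { r });
    }
    this.emit('end');
  }
}

// Variant A: the consumer keeps only row 1, but the source still produces every row.
function readHeaderIgnoringRest(total) {
  const source = new RowSource(total);
  let header = null;
  source.on('row', (row) => {
    if (header === null) header = row;
  });
  source.run();
  return { header, produced: source.produced };
}

// Variant B: the consumer tells the source to stop as soon as row 1 arrives.
function readHeaderAndStop(total) {
  const source = new RowSource(total);
  let header = null;
  source.on('row', (row) => {
    header = row;
    source.stop();
  });
  source.run();
  return { header, produced: source.produced };
}

console.log(readHeaderIgnoringRest(1000).produced); // 1000
console.log(readHeaderAndStop(1000).produced);      // 1
```

If abort() behaved like variant A on a 300 MB file, a long delay before the 'end' event would be the expected result, even though only the header row is used.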
After a quick look, I do not think those functions are very efficient or work as intended, and I don't actually use them in production. I suspect they could be optimized, but I would need time and test data.
What would you suggest as a better way to read only the header of the first sheet and then stop reading the Excel file?
Hi, I ran into the same situation. Here is my workaround:
// skip the other sheets
if (Number(workSheetReader.id) > 1) {
  workSheetReader.skip();
  return;
}
// handle the title row
workSheetReader.on('row', row => {
  const i = Number(row.attributes.r); // assuming that the first row is the title
  if (i === 1) {
    // do something with the title
    // if you don't want to handle the sheet any more:
    workSheetReader.removeAllListeners('row');
    workSheetReader.abort();
    workSheetReader.skip();
  }
}).prependListener('end', () => {
  // Prepend our own 'end' callback, which removes the remaining 'end' callbacks to avoid the error
  // "TypeError: Cannot read property 'path' of undefined
  //   at Immediate.processBooks (/xxxxx/node_modules/xlsx-stream-reader/lib/workbook.js:188:65)".
  // workSheetReader.skip() emits its own 'end' event while the real stream is still being piped.
  // When the worksheet stream reaches its end, it emits workSheetReader's 'end' event again, and
  // the repeated 'end' events push the `currentBook` variable past the length of `waitingWorkSheets`.
  workSheetReader.removeAllListeners('end');
}).process();
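The double 'end' problem described in the comments above can be reproduced with a small model. This is a sketch under stated assumptions, not the library's code: `processSheets`, `makeSheet` and `run` are hypothetical helpers, and `current` only plays the role that the comments attribute to `currentBook`. It shows how a second 'end' event from the same sheet pushes the index past the end of the queue, and how a per-sheet guard avoids that.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the workbook's sheet queue (not the library's actual code).
function processSheets(sheets, guardDuplicateEnd) {
  let current = 0;      // index of the sheet being processed
  const visited = [];   // the "next sheet" name seen by each 'end' handler

  for (const sheet of sheets) {
    let ended = false;
    sheet.on('end', () => {
      if (guardDuplicateEnd && ended) return; // ignore a repeated 'end'
      ended = true;
      current += 1;
      visited.push(sheets[current] ? sheets[current].name : undefined);
    });
  }
  return { visited, getCurrent: () => current };
}

function makeSheet(name) {
  const sheet = new EventEmitter();
  sheet.name = name;
  return sheet;
}

function run(guardDuplicateEnd) {
  const sheets = [makeSheet('sheet1'), makeSheet('sheet2')];
  const state = processSheets(sheets, guardDuplicateEnd);
  sheets[0].emit('end'); // models the 'end' emitted by skip()
  sheets[0].emit('end'); // models the 'end' emitted when the underlying stream finishes
  sheets[1].emit('end');
  return { current: state.getCurrent(), visited: state.visited };
}

console.log(run(false).current); // 3, one past the two sheets in the queue
console.log(run(true).current);  // 2
```

In the unguarded run, the index ends at 3 for a queue of two sheets, and one handler looks up a sheet that does not exist, which is the kind of lookup that would fail with "Cannot read property 'path' of undefined". Removing the 'end' listeners, as the workaround does, is another way to get the same effect as the guard.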