excel-streaming-reader

Reading first row from first sheet before full file available?

Open david5w opened this issue 4 years ago • 5 comments

Thank you for this library. Is it reasonable to expect this library to enable access to the first row in the first sheet, when the file is not yet completely available (still sequentially arriving)? Based on the nature of the error, it seems like the underlying ZIP might disallow this.

When I try the following, I get the resulting exception.

List<String> list = new LinkedList<String>();
InputStream in = <...>
try (Workbook workbook = StreamingReader.builder().open(in)) {
	Sheet sheet = workbook.getSheetAt(0);
	Row r = sheet.getRow(0);
	for (Cell c : r) {
		list.add(c.getStringCellValue());
	}
}

org.apache.poi.openxml4j.exceptions.InvalidOperationException: Can't open the specified file: '<...>.xlsx'
	at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:137)
	at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:252)
	at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:201)
	at com.github.pjfanning.xlsx.impl.StreamingWorkbookReader.init(StreamingWorkbookReader.java:117)
	at com.github.pjfanning.xlsx.impl.StreamingWorkbookReader.init(StreamingWorkbookReader.java:93)
	at com.github.pjfanning.xlsx.StreamingReader$Builder.open(StreamingReader.java:247)
	<...>
Caused by: java.util.zip.ZipException: zip END header not found
	at java.base/java.util.zip.ZipFile$Source.zerror(ZipFile.java:1585)
	at java.base/java.util.zip.ZipFile$Source.findEND(ZipFile.java:1439)
	at java.base/java.util.zip.ZipFile$Source.initCEN(ZipFile.java:1448)
	at java.base/java.util.zip.ZipFile$Source.<init>(ZipFile.java:1249)
	at java.base/java.util.zip.ZipFile$Source.get(ZipFile.java:1211)
	at java.base/java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:701)
	at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:240)
	at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:171)
	at java.base/java.util.zip.ZipFile.<init>(ZipFile.java:185)
	at org.apache.poi.openxml4j.util.ZipSecureFile.<init>(ZipSecureFile.java:105)
	at org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:158)
	at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:135)
	<...>

david5w avatar Mar 09 '21 19:03 david5w

This is not an issue in this library. The exception happens in java.util.zip.ZipFile
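
For readers wondering why java.util.zip.ZipFile rejects a partially arrived file: a ZIP archive keeps its index (the central directory and the END record) at the very end, and ZipFile looks for that END record before it reads anything else. The sketch below is only an illustration of that JDK behaviour, not code from this library. The class name, method names and entry names are made up for the demo, and dropping the last 22 bytes is just one assumed way to simulate a file whose tail has not arrived yet. It also shows that java.util.zip.ZipInputStream, which reads entries sequentially from the front, can still return the first entry of the same truncated data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; none of these names come from excel-streaming-reader.
class TruncatedZipDemo {

    // Builds a small two-entry ZIP in memory (entry names are invented).
    static byte[] buildZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("first.txt"));
            zip.write("header row".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("second.txt"));
            zip.write("more data".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Simulates a file that is still arriving by cutting off the final
    // 22 bytes, which is the size of the END record when there is no comment.
    static byte[] dropEndRecord(byte[] zip) {
        return Arrays.copyOf(zip, zip.length - 22);
    }

    // Returns true if ZipFile accepts the data, false if it throws ZipException.
    static boolean zipFileCanOpen(byte[] data) throws IOException {
        Path tmp = Files.createTempFile("partial", ".zip");
        try {
            Files.write(tmp, data);
            try (ZipFile ignored = new ZipFile(tmp.toFile())) {
                return true;
            } catch (ZipException e) {
                return false;
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    // Reads only the first entry, front to back, without using the central directory.
    static String readFirstEntry(byte[] data) throws IOException {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(data))) {
            ZipEntry entry = in.getNextEntry();
            String content = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            return entry.getName() + "=" + content;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] full = buildZip();
        byte[] partial = dropEndRecord(full);
        System.out.println("ZipFile opens complete archive:  " + zipFileCanOpen(full));
        System.out.println("ZipFile opens truncated archive: " + zipFileCanOpen(partial));
        System.out.println("ZipInputStream first entry:      " + readFirstEntry(partial));
    }
}
```

Whether sequential reading would help with a real .xlsx file is a separate question. The order of parts inside the archive is not guaranteed, so data needed to interpret the first row (shared strings, for example) may arrive after the sheet itself.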

pjfanning avatar Mar 10 '21 21:03 pjfanning

Understood. But it might be worthwhile to clarify to your audience that your streaming API cannot control whether any of its dependencies do non-streaming things (case in point), and that the memory and speed benefits of streaming may therefore be adversely impacted. For our use case, it meant we couldn't use this library. Nevertheless, thank you for making this library available. Kind Regards.


david5w avatar Mar 10 '21 21:03 david5w

A similar use case came up in https://github.com/pjfanning/excel-streaming-reader/issues/103#issuecomment-1068119418, and my answer there applies here too.

pjfanning avatar Mar 15 '22 16:03 pjfanning

@david5w did you find a way to fit your use case?

oviniciuslara avatar Mar 15 '22 18:03 oviniciuslara

We have not.


david5w avatar Oct 11 '22 07:10 david5w