libarchivejs

Streaming API

Open Wolvan opened this issue 4 years ago • 8 comments

I am considering using this library with big files (read: archives >4 GB). Would it be possible to stream the output of a file extraction without storing it all in memory? Otherwise I'll probably end up with multiple GB of RAM usage just to hold the data that the library extracted.

Wolvan avatar Feb 10 '20 10:02 Wolvan

Do you have any specific API in mind? Should we just return chunks of typed arrays?

nika-begiashvili avatar Feb 14 '20 08:02 nika-begiashvili

Chunked typed arrays would work perfectly. I don't think that browsers have a standardized streaming interface, so just continuously returning the chunks (in order, of course) in a callback is a decent implementation.

Wolvan avatar Feb 14 '20 11:02 Wolvan
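
The callback contract described above (typed-array chunks delivered in order) could look roughly like the sketch below. `emitChunks` is a hypothetical helper, not part of libarchivejs, and a `Uint8Array` already held in memory stands in for the decompressor output purely to show the shape of the callback; a real implementation would produce the chunks incrementally.

```javascript
// Hypothetical sketch, not a libarchivejs API.
// Delivers `data` to `onChunk` in order, at most `chunkSize` bytes per call.
// Returns the number of chunks delivered.
function emitChunks(data, chunkSize, onChunk) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  let index = 0;
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray() returns a view on the same buffer, so no bytes are copied.
    onChunk(data.subarray(offset, offset + chunkSize), index);
    index += 1;
  }
  return index;
}

// Usage: collect the chunk sizes to see the ordering and the short final chunk.
const sizes = [];
emitChunks(new Uint8Array(5), 2, (chunk) => sizes.push(chunk.length));
```

Passing the chunk index alongside the data lets a consumer verify ordering without tracking state itself.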

I would also be very happy to have this feature enhancement!

AndreiRinea avatar May 25 '20 17:05 AndreiRinea

I'm just wondering if there are any plans to have it.

amykhailovskyi avatar Aug 24 '20 14:08 amykhailovskyi

Unfortunately, I do not have this planned yet due to lack of time.

nika-begiashvili avatar Aug 31 '20 10:08 nika-begiashvili

Revisiting this, I have a question about the use case: if there's a single large file, wouldn't it end up in RAM anyway, even if it's streamed in chunks? Unless it's streamed to the network right away, in which case it would make more sense to decompress on the server.

nika-begiashvili avatar Jan 10 '24 17:01 nika-begiashvili

Hi @nika-begiashvili we are interested in a streaming API.

We sometimes need to process 10GB+ files in the browser. We are only interested in a subset of the files in these archives, selected by a pattern (this subset is less than 1% of the overall size). Our use case would be to scan the archive to get a list of file paths, then selectively unarchive the files that match the pattern.

Is this something that is theoretically possible with the way libarchive is designed? We'd be willing to sponsor an improvement.

venkatd avatar Jan 13 '24 12:01 venkatd
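
The scan-then-extract flow described above can be sketched as follows. This is only an illustration under assumptions: `extractMatching` and the entry shape `{ path, extract }` are hypothetical names chosen for the example, not a confirmed libarchivejs interface. The point is that entries whose path does not match are never decompressed.

```javascript
// Hypothetical sketch. Assumed entry shape (not a confirmed library API):
//   { path: string, extract: () => Promise<Uint8Array> }
// `pattern` should be a RegExp without the `g` flag, because test() on a
// global regex keeps state between calls.
async function extractMatching(entries, pattern) {
  const results = new Map();
  for (const entry of entries) {
    if (!pattern.test(entry.path)) {
      continue; // skipped entries are never decompressed
    }
    // Extract sequentially so only one file is being decompressed at a time.
    results.set(entry.path, await entry.extract());
  }
  return results;
}
```

Extracting one entry at a time keeps peak memory close to the size of the largest matching file rather than the sum of all matches in flight.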

Yes, I think that should be possible, since a JavaScript File object can be read in chunks and libarchive does provide custom read callbacks, although it will need to call JavaScript functions from C.

nika-begiashvili avatar Jan 13 '24 17:01 nika-begiashvili
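
The JavaScript side of that idea, reading a `File` in chunks, can be sketched with the standard `Blob.slice()` and `Blob.arrayBuffer()` methods (`File` inherits both from `Blob`). `readBlobInChunks` is a hypothetical helper name, and wiring the chunks into libarchive's C read callbacks is not shown here.

```javascript
// Minimal sketch: reads `blob` sequentially, at most `chunkSize` bytes at a time.
// Only the current slice is materialized in memory.
async function* readBlobInChunks(blob, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    // slice() clamps the end offset to blob.size, so the last chunk may be shorter.
    const slice = blob.slice(offset, offset + chunkSize);
    yield new Uint8Array(await slice.arrayBuffer());
  }
}

// Usage (inside an async function), e.g. with a File from an <input type="file">:
//   for await (const chunk of readBlobInChunks(file, 1024 * 1024)) { /* feed the reader */ }
```

An async generator fits here because the consumer pulls the next chunk only when it is ready for it, which matches the pull-style read callback mentioned above.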