sharpcompress
Add hints to random access availability to archives
Following #707
I've noticed that you can't really know whether you need to read sequentially or can access entries randomly until you have opened the archive... correct me if I'm wrong:
Plain .tar files can actually be read randomly, but tar archives inside a gzip archive can't.
So, if that's the case, the SupportsRandomAccess property should go into the IArchive, not in the IArchiveFactory interface, right?
Also, wouldn't the IsSolid property be enough for this? (In that case, IsSolid should be true for .tar.gz files; I've checked and currently it's false.)
If IsSolid serves a different purpose, then IArchive definitely needs a SupportsRandomAccess property.
Any thoughts?
IsSolid means a specific thing for RAR.
On Streams there's IsSeekable which basically means what we want but we need it on IArchive. An alternative is just not to have IArchive interfaces for non-seekable situations and tell people to use IReader
Got it... I'll do a new PR with that knowledge
Hmm.. I'm having a hard time exposing IsSeekable on IArchive.... opening a tar.gz reports IsSeekable as true in all the streams I can see around.
Anyway, the problem seems to be trickier to handle:
On one side, most archives, including plain TAR, can be opened as IArchive and accessed randomly.
TAR.XX is a special case and needs to be opened using an IReader, because if you try to go through the IArchive path, it's just a Gzip with a single entry.
To explain the problem: what I am trying to build is a general archive reader that relies on IArchiveFactory and does not know about the specifics of each archive format. Whenever possible, it should use random access, and only fall back to IReader when that's not possible.... but I would need a way to know which archives support random access and which don't.
The alternative is to also provide public static IReader Open(Stream stream, ReaderOptions? options = null) to IArchiveFactory
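The fallback described above could be sketched roughly as follows. This is only an illustration of the decision logic, not SharpCompress code: FormatHandler, ListRandom, ListSequential and GenericScanner are hypothetical names, and the two delegates stand in for whatever IArchive-based and IReader-based listing a real implementation would use.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-format descriptor (not a SharpCompress type).
// ListRandom stands in for an IArchive-style path, ListSequential for an IReader-style path.
public sealed record FormatHandler(
    bool SupportsRandomAccess,
    Func<string, IEnumerable<string>> ListRandom,
    Func<string, IEnumerable<string>> ListSequential);

public static class GenericScanner
{
    // Prefer random access when the format advertises it; otherwise read forward-only.
    public static IEnumerable<string> ListEntries(FormatHandler handler, string path) =>
        handler.SupportsRandomAccess
            ? handler.ListRandom(path)
            : handler.ListSequential(path);
}
```

The caller never needs to know the format; it only needs the hint to be available before the archive is opened, which is the missing piece discussed here.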
IReaderFactory has that Open I think.
You don't want to put stream's IsSeekable on IArchive. You want to return true/false based on the archive/compression format. File streams are always seekable but decompressing files is usually not. Zip/Rar has individual files compressed so can seek. TarGz/TarBz are one continuous compression so they're not seekable.
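As a rough sketch of that idea, the hint can be derived from the container/compression format alone, ignoring whether the underlying stream is seekable. ContainerKind and RandomAccessHint are hypothetical names made up for this example, and the mapping only encodes what is stated in this thread.

```csharp
using System;

// Hypothetical enum, for illustration only; not part of SharpCompress.
public enum ContainerKind { Zip, Rar, Tar, TarGz, TarBz2 }

public static class RandomAccessHint
{
    // The answer depends on the format, not on the file stream:
    // a tar.gz on disk is a seekable stream, but its content is one continuous compression.
    public static bool SupportsRandomAccess(ContainerKind kind) => kind switch
    {
        ContainerKind.TarGz or ContainerKind.TarBz2 => false, // whole-stream compression
        _ => true, // Zip/Rar compress entries individually; plain Tar is uncompressed
    };
}
```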
Yes, ReaderFactory is the one that has that Open... so I think an IReaderFactory interface is needed, in the same way that I introduced IArchiveFactory... so the spaghetti code can be removed and new readers can be registered.
If I follow that path, what I would like to avoid is having to register factories at both Archive* and Reader*, so it could be good to have a single factory class for each archive type, implementing both IArchiveFactory and IReaderFactory.
Maybe a Factory folder, moving all the factories like ZipArchiveFactory into it and renaming them to ZipFactory, etc.?
That's not a bad idea, to just have singular factory classes or something to consolidate things.
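A minimal sketch of the "one factory class per format" idea might look like this. The interfaces below are simplified stand-ins invented for the example (they are not the real IArchiveFactory or the proposed IReaderFactory, and the Open methods return placeholder strings instead of archives or readers); the point is only that a single registration serves both lookups.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Simplified stand-ins, NOT the real SharpCompress interfaces.
public interface IArchiveFactorySketch { string Name { get; } string OpenArchive(Stream stream); }
public interface IReaderFactorySketch { string Name { get; } string OpenReader(Stream stream); }

// One class per format implements both roles, so it is registered only once.
public sealed class ZipFactorySketch : IArchiveFactorySketch, IReaderFactorySketch
{
    public string Name => "Zip";
    public string OpenArchive(Stream stream) => "zip archive (random access)";
    public string OpenReader(Stream stream) => "zip reader (forward only)";
}

public static class FactoryRegistry
{
    private static readonly List<object> Factories = new() { new ZipFactorySketch() };

    public static void Register(object factory) => Factories.Add(factory);

    // Each view filters the single list by the interface the factory implements.
    public static IEnumerable<IArchiveFactorySketch> ArchiveFactories =>
        Factories.OfType<IArchiveFactorySketch>();

    public static IEnumerable<IReaderFactorySketch> ReaderFactories =>
        Factories.OfType<IReaderFactorySketch>();
}
```

A format that only supports one mode would implement just one of the two interfaces and would simply not show up in the other view.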
So, SevenZip doesn't have an implementation in the readers factory? is there a reason for it?
Added a PR: #709
So, SevenZip doesn't have an implementation in the readers factory? is there a reason for it?
As far as I remember, this is because 7Zip requires random access to the file. The streams need to seek around to properly find headers and decompress the streams in the format. Readers only work for non-seekable streams.
@adamhathcock expanding on this topic a bit further: what would be the recommended way to open archives in a generic way?
What I'm trying to achieve is to traverse a number of directories containing all sorts of archives (zip, rar, 7z, etc.), open them, and scan their content.
You'll have to implement that yourself if you can't guarantee Reader. It's beyond the scope of the library.
I've been away for personal reasons.