SharpZipLib

feature request: multiple Zip Entries for the same content

Open jspraul opened this issue 7 years ago • 5 comments

What is the minimum effort required to point multiple zip entries at the same content, rather than duplicating the content? Basically, I want folder1/file1 and folder2/file2 to point to the same content with different timestamps and other metadata (same size, though if that's not important, it would be cool to know whether a separate entry can reference just the beginning of a file).

Has anyone who has experimented with this type of file already given up because it triggers errors in verification or with default OS unzip implementations?

I am looking at this instead of using Git because people don't have the tooling by default, and they would probably be confused by Git anyway. I just don't want to have thousands of the same file around in the .ZIP!

jspraul avatar Jan 22 '18 20:01 jspraul

Officially no, not as far as I can tell. The zip format needs one local file header per block of file data. It might be possible to change the offset in the central directory header to point at another file's data and have its "real" file data be 0 length. The problem is that the file name in the local header will then differ from the one in the central directory header, and that might make the extracting software use the wrong file name, decide that the archive is corrupt, or just plainly crash.

It would be a cool experiment tho!
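A minimal sketch of that experiment, in Python rather than SharpZipLib, writing the archive bytes by hand. It stores the content once behind a single local header and emits several central directory entries that all carry the same local header offset but different names and timestamps. The helper names `dos_datetime` and `build_aliased_zip` are made up for illustration, and the result is a non-standard archive: readers that compare the local and central names (Python's own `zipfile` does so when opening an entry) are expected to reject the aliased entries.

```python
import io
import struct
import zipfile
import zlib

LOCAL_SIG, CENTRAL_SIG, EOCD_SIG = 0x04034B50, 0x02014B50, 0x06054B50


def dos_datetime(year, month, day, hour=0, minute=0, second=0):
    """Pack a timestamp into the 16-bit MS-DOS (time, date) pair used by zip."""
    dos_time = (hour << 11) | (minute << 5) | (second // 2)
    dos_date = ((year - 1980) << 9) | (month << 5) | day
    return dos_time, dos_date


def build_aliased_zip(data, entries):
    """Build a zip where every (name, (time, date)) entry shares one stored copy of data."""
    crc, size = zlib.crc32(data), len(data)
    first_name = entries[0][0].encode("ascii")
    first_time, first_date = entries[0][1]
    out = io.BytesIO()

    # The only local header in the archive, at offset 0, named after the first entry.
    out.write(struct.pack("<IHHHHHIIIHH", LOCAL_SIG, 20, 0, 0, first_time,
                          first_date, crc, size, size, len(first_name), 0))
    out.write(first_name)
    out.write(data)  # compression method 0: stored, no compression

    # One central directory entry per name, each pointing back at offset 0.
    cd_offset = out.tell()
    for name, (mod_time, mod_date) in entries:
        raw = name.encode("ascii")
        out.write(struct.pack("<IHHHHHHIIIHHHHHII", CENTRAL_SIG, 20, 20, 0, 0,
                              mod_time, mod_date, crc, size, size,
                              len(raw), 0, 0, 0, 0, 0, 0))
        out.write(raw)
    cd_size = out.tell() - cd_offset

    # End of central directory record, no archive comment.
    out.write(struct.pack("<IHHHHIIH", EOCD_SIG, 0, 0, len(entries),
                          len(entries), cd_size, cd_offset, 0))
    return out.getvalue()


archive = build_aliased_zip(b"x" * 1000, [
    ("folder1/file1", dos_datetime(2018, 1, 22, 20, 1)),
    ("folder2/file2", dos_datetime(2020, 12, 3, 21, 12)),
])
# Listing only reads the central directory, so both names show up here.
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    for info in zf.infolist():
        print(info.filename, info.date_time, info.header_offset)
```

Listing the archive works because it only touches the central directory; whether extraction works depends entirely on how strictly the tool validates the local header.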

piksel avatar May 12 '18 15:05 piksel

Only tangentially related, but the article about zip bombs at https://www.bamsoftware.com/hacks/zipbomb/ describes a bunch of clever and/or evil things that can be done with zip files, including having multiple central directory entries point at the same file data (though it does point out that some extractors won't like the file, and that it won't work with streaming readers).

Numpsy avatar Apr 08 '20 22:04 Numpsy

I tried this in a custom implementation (so not SharpZipLib, I just wrote the stream "manually"). Windows 10 extracted it without problems. 7-Zip could open the archive and display the individual files, but "extract all" failed: it did not extract all the files. In case it was the mismatching names that tripped it up, maybe masking the local header name would work, but I did not spend a lot of time on this as it was not my main goal (the main goal was multithreaded zipping, and that was luckily rather trivial to do).

I was sure I had read that the central directory always wins (to allow renaming a file), but I am no longer sure where I read that, and I can't find it in the zip specification. It might just have been someone concluding that it worked for the tools they used.
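To see which name a given tool actually honors, it helps to first confirm where an archive's two names disagree. The sketch below is one way to do that in Python: it takes the central directory entries from the standard `zipfile` module, reads the name from each local header by hand, and reports the differences. The function name `name_mismatches` is hypothetical, and the code assumes the archive has no data prepended before the first local header.

```python
import io
import struct
import zipfile


def name_mismatches(archive_bytes):
    """List (central_name, local_name, header_offset) where the two names differ."""
    mismatches = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            start = info.header_offset
            # Local header layout: 4-byte signature, 22 bytes we skip,
            # 2-byte name length, 2-byte extra field length (30 bytes total).
            signature, name_len = struct.unpack(
                "<I22xH2x", archive_bytes[start:start + 30])
            if signature != 0x04034B50:
                raise ValueError(f"no local file header at offset {start}")
            raw_name = archive_bytes[start + 30:start + 30 + name_len]
            # General purpose flag bit 11 marks UTF-8 names; otherwise cp437.
            encoding = "utf-8" if info.flag_bits & 0x800 else "cp437"
            local_name = raw_name.decode(encoding)
            if local_name != info.orig_filename:
                mismatches.append((info.orig_filename, local_name, start))
    return mismatches
```

Running it on an archive with aliased or renamed entries and then extracting that archive with the tool in question shows whether the tool follows the central directory name, the local name, or refuses the entry.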

lmoelleb avatar Dec 03 '20 21:12 lmoelleb

Interesting! For my archive diagnostic tool, I extracted some parts of SharpZipLib to use for doing non-standard things with zip files. I might turn that into a separate library for experiments like this.

piksel avatar Dec 07 '20 09:12 piksel

Also, maybe we should close this, since there is no way to achieve the feature without creating invalid zip files? What we might want to support instead is letting consumers do more advanced things with the format if they don't care about the files being readable by other software...

piksel avatar Dec 07 '20 09:12 piksel