ZIP Container error

Open thorsted opened this issue 4 years ago • 28 comments

Attempting to make a signature for a file format in a ZIP container. Created signature and tested in DROID and file is only identified as ZIP signature. Checked the log and can see an error: "WARN Could not process the potential container format (ZIP): file:/Users/thorsted/Documents/file.olm ZIP file spanning/splitting is not supported!"

Checked multiple samples for file format and getting the same error. Is this a bug?

thorsted avatar Aug 21 '19 23:08 thorsted
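
The "spanning/splitting" warning in the report above suggests the reader objects to something in the archive's trailing records. The following is a minimal Python sketch for inspecting a sample by hand, assuming the standard 22-byte End of Central Directory (EOCD) layout from the ZIP specification. The helper names `read_eocd` and `looks_spanned` are hypothetical, and the heuristic is only a guess at the kind of condition a strict reader might apply. It is not DROID's or its ZIP library's actual check.

```python
import struct

EOCD_SIG = b"PK\x05\x06"
# signature, disk number, disk holding the central directory, entries on this
# disk, total entries, central directory size, central directory offset,
# comment length (22 bytes in total)
EOCD = struct.Struct("<4s4H2IH")


def read_eocd(data):
    """Return the EOCD fields as a dict, or None if no record is found."""
    # The record sits at the end of the file (before an optional comment),
    # so search backwards for its signature.
    pos = data.rfind(EOCD_SIG)
    if pos < 0 or pos + EOCD.size > len(data):
        return None
    (_sig, disk_no, cd_disk, entries_here, entries_total,
     cd_size, cd_offset, comment_len) = EOCD.unpack_from(data, pos)
    return {
        "disk_no": disk_no,
        "cd_disk": cd_disk,
        "entries_here": entries_here,
        "entries_total": entries_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment_len": comment_len,
    }


def looks_spanned(eocd):
    """Assumed heuristic: non-zero disk numbers or a per-disk/total mismatch."""
    return (eocd["disk_no"] != 0
            or eocd["cd_disk"] != 0
            or eocd["entries_here"] != eocd["entries_total"])
```

Reading a sample with `open(path, "rb").read()` and printing `read_eocd(...)` shows which of these fields, if any, is unusual in the affected files.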

I found this potentially related issue: https://github.com/digital-preservation/droid/issues/71

Are you running the latest version of DROID?

anjackson avatar Aug 22 '19 09:08 anjackson

Hi Tyler,

Are you able to share an example file that encounters this issue, either here or privately?

David

Dclipsham avatar Aug 22 '19 10:08 Dclipsham

Running version 6.4. It does seem connected to issue #71. Samples sent to David.

thorsted avatar Aug 22 '19 13:08 thorsted

Thanks Tyler, easy to reproduce here, and confirming as bug. My instinct is that it is similar to #71 and probably has a similar resolution (updating/changing zip handler libraries) - I've revisited the problematic files from #100 and they work with DROID 6.4 but should also be tested as part of any fix for this issue.

Dclipsham avatar Aug 23 '19 09:08 Dclipsham

Also worth noting: these zips unpack happily in 7zip and Windows Explorer. However, when browsing the contents before unpacking, both tools have trouble gathering the properties of the contents, as shown in the images below.

[image: incompleteProperties7Zip]

[image: incompleteProperties]

Dclipsham avatar Aug 23 '19 09:08 Dclipsham

David, I was also able to confirm that after recompressing the contents with zip and renaming the result to .olm, my container signature identifies them correctly.

-Tyler

thorsted avatar Aug 23 '19 14:08 thorsted
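
The recompression workaround described above can be sketched with Python's standard `zipfile` module. This is an illustration only: the function name `repack` is hypothetical, and whether `zipfile` can open the affected OLM files in the first place has not been tested here.

```python
import io
import zipfile


def repack(src_bytes):
    """Rewrite every entry of an archive into a freshly built ZIP."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            # Copy name and content only; the new archive gets clean headers.
            dst.writestr(info.filename, src.read(info))
    return out.getvalue()
```

Writing the returned bytes to a file with an .olm extension gives a sample whose structure comes entirely from the rewriting tool, which helps separate signature problems from container-parsing problems.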

There's been a conversation about this issue on Google Groups (https://groups.google.com/forum/#!topic/droid-list/N1tE3ZuEDbo). It seems the zip handler needs either updating or replacing: the check that raises the warning appears to reject what should be a valid Zip64 structure (https://groups.google.com/d/msg/droid-list/N1tE3ZuEDbo/wWDP_FjYAwAJ).

Dclipsham avatar Aug 27 '19 09:08 Dclipsham
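
To see whether a sample carries Zip64 records at all, one option is to look for the Zip64 End of Central Directory locator, which the ZIP specification places immediately before the ordinary EOCD record. A minimal Python sketch under that assumption follows. The function name `zip64_locator` is hypothetical, and the sketch does not show how DROID or its ZIP library reads these fields.

```python
import struct

EOCD_SIG = b"PK\x05\x06"
LOCATOR_SIG = b"PK\x06\x07"
# signature, disk holding the Zip64 EOCD record, offset of that record,
# total number of disks (20 bytes in total)
LOCATOR = struct.Struct("<4sIQI")


def zip64_locator(data):
    """Return the Zip64 locator fields as a dict, or None if absent."""
    eocd_pos = data.rfind(EOCD_SIG)
    loc_pos = eocd_pos - LOCATOR.size
    if eocd_pos < 0 or loc_pos < 0:
        return None
    sig, eocd_disk, eocd_offset, total_disks = LOCATOR.unpack_from(data, loc_pos)
    if sig != LOCATOR_SIG:
        return None  # no locator directly before the EOCD: not Zip64
    return {
        "zip64_eocd_disk": eocd_disk,
        "zip64_eocd_offset": eocd_offset,
        "total_disks": total_disks,
    }
```

A `None` result means the archive has no Zip64 locator. Otherwise the disk-related fields are worth comparing between files that identify and files that do not.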

Hi @thorsted, my solution is not really satisfactory: I forced DROID to identify OLM files as archives and scan their contents, when they should simply be identified as a container format.

Would you have a signature file to submit for olm files?

jcharlet avatar Dec 03 '19 18:12 jcharlet

Here is my initial signature attempt. Yes, I agree, these should be identified as containers. OLM-Sig.zip

thorsted avatar Dec 03 '19 18:12 thorsted

Does it work on your side when you run DROID, @thorsted? All my OLM samples are still identified as ZIP (on DROID 6.4 and on the master branch).

[image: Screenshot from 2019-12-06 14-03-15] olm-sample.zip

jcharlet avatar Dec 06 '19 14:12 jcharlet

I have another ZIP Container error. Different message. "Could not process the archival format(ZIP): file:///Flash5.5-S01v5.fla Expected 25 more entries in the Central Directory!"

7ZIP shows a header error when testing. FLA-error.zip

thorsted avatar Jul 12 '22 23:07 thorsted
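
The "Expected 25 more entries in the Central Directory!" message implies a mismatch between the entry count declared in the End of Central Directory record and the headers actually present. A minimal Python sketch for measuring that gap, assuming the standard 46-byte central directory header and 22-byte EOCD layouts, is shown below. The function name `count_cd_entries` is hypothetical, and the sketch ignores Zip64 and any data prepended to the archive.

```python
import struct

EOCD_SIG = b"PK\x05\x06"
CD_SIG = b"PK\x01\x02"
# Fixed 46-byte part of a central directory file header.
CD_HEADER = struct.Struct("<4s6H3I5H2I")


def count_cd_entries(data):
    """Return (declared, found) central directory entry counts."""
    eocd_pos = data.rfind(EOCD_SIG)
    if eocd_pos < 0 or eocd_pos + 22 > len(data):
        raise ValueError("no End of Central Directory record found")
    # Total entries, central directory size and offset start 10 bytes in.
    declared, _cd_size, cd_offset = struct.unpack_from("<H2I", data, eocd_pos + 10)

    pos, found = cd_offset, 0
    while pos + CD_HEADER.size <= len(data) and data[pos:pos + 4] == CD_SIG:
        fields = CD_HEADER.unpack_from(data, pos)
        name_len, extra_len, comment_len = fields[10:13]
        # Skip the fixed header plus its three variable-length fields.
        pos += CD_HEADER.size + name_len + extra_len + comment_len
        found += 1
    return declared, found
```

If `declared - found` matches the number in the warning for a given sample, the problem is in the directory bookkeeping rather than in the entry data itself.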

@thorsted isn't that new one a problem with the file itself, rather than something DROID should compensate for?

I was interested to have a look. lsar and unar seem to work well on Linux, but 7z and zip report the following:

$ zip -T Flash5.5-S01v5.fla
error [Flash5.5-S01v5.fla]:  missing 54 bytes in zipfile
  (attempting to process anyway)
error [Flash5.5-S01v5.fla]:  reported length of central directory is
  54 bytes too long (Atari STZip zipfile?  J.H.Holm ZIPSPLIT 1.1
  zipfile?).  Compensating...
error: invalid zip file with overlapped components (possible zip bomb)
test of Flash5.5-S01v5.fla FAILED
$ 7z t Flash5.5-S01v5.fla

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz (50650),ASM,AES-NI)

Scanning the drive for archives:
1 file, 216581 bytes (212 KiB)

Testing archive: Flash5.5-S01v5.fla

ERRORS:
Headers Error

--
Path = Flash5.5-S01v5.fla
Type = zip
ERRORS:
Headers Error
Physical Size = 216581
Embedded Stub Size = 63



Archives with Errors: 1

Open Errors: 1

ross-spencer avatar Jul 13 '22 06:07 ross-spencer

Just an aside: does this format need a container signature if it's creating non-standard zips? In this case there's an apparent binary ID hook at offset 0x1E: 'mimetypeapplication/vnd.adobe.xfl'. I haven't got a pool of samples myself to check consistency, so this is just an observation from this one file...

Dclipsham avatar Jul 13 '22 09:07 Dclipsham
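
The offset 0x1E observation follows from the ZIP local file header being 30 bytes long: when the first entry is an uncompressed file named `mimetype` with no extra field, the name and its content appear back to back right after the header. A minimal Python sketch of that byte check is below. The function name `has_xfl_hook` is hypothetical, and the check is based on the single sample discussed above, so it should not be read as a tested signature.

```python
LOCAL_SIG = b"PK\x03\x04"
# Entry name "mimetype" immediately followed by its stored content.
XFL_HOOK = b"mimetypeapplication/vnd.adobe.xfl"
HOOK_OFFSET = 0x1E  # length of the fixed local file header


def has_xfl_hook(data):
    """True if the file starts like a ZIP whose first entry is the XFL mimetype."""
    return (data[:4] == LOCAL_SIG
            and data[HOOK_OFFSET:HOOK_OFFSET + len(XFL_HOOK)] == XFL_HOOK)
```

Only the first few dozen bytes are needed, so `has_xfl_hook(open(path, "rb").read(64))` is enough to try it on a sample.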

I should have mentioned that this happens on hundreds of my samples, from multiple versions, many of them taken directly from the software installation CD. None of them produce errors when opened in the software.

thorsted avatar Jul 13 '22 11:07 thorsted

for ref, specific issue with .fla is also described here: https://sourceforge.net/p/sevenzip/discussion/45798/thread/9e936d87/ with an effective won't-fix from 7z maintainer

Dclipsham avatar Jul 13 '22 11:07 Dclipsham

I could try a simple binary identification method, but not all FLA files have the mimetype file within the structure. I am basing my identification on DOMDocument.xml, as it has an xflVersion string, which will allow me to identify each version correctly. Adobe Flash was retired, and Adobe Animate continues to use the FLA format. All the files I have tested, even those from the most recent version, have this central directory error. But when I create FLA files with this tool, I don't see the issue.

You can see some Animate samples here.

So how much should Droid do to validate a file, versus do everything it can to identify a file even if it has to ignore some errors along the way? All the FLA files I have looked at will unzip with the right content, but sometimes will have a duplicate filename. Shouldn't Droid attempt to do the same?

thorsted avatar Jul 13 '22 15:07 thorsted
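
One way to "identify first, validate later" is to ignore the central directory and walk the local file headers from the start of the file, which is roughly what tolerant extractors do. The Python sketch below takes that approach and then reads an `xflVersion` value from DOMDocument.xml. Several things are assumptions made for illustration: the function names, the idea that `xflVersion` is an attribute on the XML root element, and the simplification of giving up on entries that use a trailing data descriptor. It is not a description of how DROID or any ZIP library works.

```python
import struct
import xml.etree.ElementTree as ET
import zlib

LOCAL_SIG = b"PK\x03\x04"
LOCAL_HEADER = struct.Struct("<4s5H3I2H")  # fixed 30-byte local file header


def iter_local_entries(data):
    """Yield (name, method, payload) by walking local headers in file order.

    The central directory is never consulted, so a damaged or inconsistent
    directory does not stop the walk. Duplicate names are yielded as found.
    """
    pos = 0
    while pos + LOCAL_HEADER.size <= len(data) and data[pos:pos + 4] == LOCAL_SIG:
        (_sig, _ver, flags, method, _time, _date, _crc,
         csize, _usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, pos)
        if flags & 0x08:
            break  # sizes are in a trailing data descriptor; not handled here
        name_start = pos + LOCAL_HEADER.size
        body_start = name_start + name_len + extra_len
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        yield name, method, data[body_start:body_start + csize]
        pos = body_start + csize


def read_payload(method, payload):
    """Decode a stored (0) or deflated (8) entry."""
    if method == 0:
        return payload
    if method == 8:
        return zlib.decompress(payload, -15)  # raw deflate stream
    raise ValueError(f"unsupported compression method {method}")


def xfl_version(data):
    """Return the xflVersion attribute of DOMDocument.xml, or None."""
    for name, method, payload in iter_local_entries(data):
        if name == "DOMDocument.xml":
            return ET.fromstring(read_payload(method, payload)).get("xflVersion")
    return None
```

Listing `[name for name, _, _ in iter_local_entries(data)]` also makes repeated entry names visible, since each local header is reported in the order it appears.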

Well, as it stands I think DROID is just using the zip handling library, TrueZip, to perform its zip-related tasks. If there's a compatible zip library that handles this elegantly without causing regressions elsewhere, then I would hope it would be relatively straightforward to update, but of course I'm no longer in a position to directly influence DROID's dev roadmap...

Since TrueZip's latest version is 7.7.10 (https://mvnrepository.com/artifact/de.schlichtherle.truezip/truezip/7.7.10), which is 6 years old and has various vulns, I would hope that a prioritisation case could be made. CC @sparkhi @OliverHannan

Dclipsham avatar Jul 13 '22 15:07 Dclipsham

TrueVFS is the successor project to TrueZip. see https://mvnrepository.com/artifact/net.java.truevfs/truevfs-driver-zip/0.14.0 and http://truevfs.net/ (although the latter link refers to version 0.12, which is behind the latest maven version). I'm not currently in a position to build something to test whether this would overcome either the OLM or FLA issue, but might be the path of least resistance if it does work...

Dclipsham avatar Jul 13 '22 15:07 Dclipsham

Sorry I missed this notification. Thanks for the original comment @thorsted and updates @ross-spencer, @Dclipsham. We'll take this into account once we have dedicated developer time. I'll get it into the backlog ASAP though.

OliverHannan avatar Jul 19 '22 08:07 OliverHannan

> TrueVFS is the successor project to TrueZip. see https://mvnrepository.com/artifact/net.java.truevfs/truevfs-driver-zip/0.14.0 and http://truevfs.net/ (although the latter link refers to version 0.12, which is behind the latest maven version). I'm not currently in a position to build something to test whether this would overcome either the OLM or FLA issue, but might be the path of least resistance if it does work...

We have updated to use TrueVFS and were hoping it would work but it has not solved the original issue mentioned in this ticket. As a result, I'm going to leave this ticket open.

sparkhi avatar Aug 02 '23 15:08 sparkhi

I've run a few more trials and am noting what I found here. There is no solution to this yet.

The Animate files that produce the error do not appear to be Zip64, so the error is unrelated to Zip64. I created a Zip64 file locally and it worked fine in DROID.

When I tried to dig deeper into the FLA files that produce the error using 7Zip, it gave a cryptic error (but did not stop): Errors: FIXME-MyLoadStringW-

One interesting thing I noticed when browsing the file in 7zip is that the same file appears twice at the same location. [image]

sparkhi avatar Aug 09 '23 13:08 sparkhi

From a brief Google on this point I don't think FLA files are truly valid ZIP files of any variant. They are close, and some applications will have a try at opening them, but it looks as if the creators of the FLA format have deviated from, or extended, the ZIP format to include other data or relax constraints (e.g. the multiple mimetype files, and corrupt Central Directory section). As @sparkhi says, we've updated DROID to use a modern ZIP library but it still won't allow FLA files to be used with container signatures.

steve-daly avatar Aug 09 '23 18:08 steve-daly

Not a dissimilar conclusion to the one the 7zip folk came to: https://sourceforge.net/p/sevenzip/discussion/45798/thread/9e936d87/

Out of interest does OLM at least ID now with the TrueVFS update?

Dclipsham avatar Aug 10 '23 09:08 Dclipsham

> Not a dissimilar conclusion to the one the 7zip folk came to: https://sourceforge.net/p/sevenzip/discussion/45798/thread/9e936d87/
>
> Out of interest does OLM at least ID now with the TrueVFS update?

No, the OLM format also does not ID.

Could not process the potential container format (ZIP): file:///Volumes/File%20Formats/OLM/OLM-samples/Outlook%20for%20Mac%202011%20Archive2.olm ZIP file spanning/splitting is not supported!

thorsted avatar Aug 14 '23 17:08 thorsted

> From a brief Google on this point I don't think FLA files are truly valid ZIP files of any variant. They are close, and some applications will have a try at opening them, but it looks as if the creators of the FLA format have deviated from, or extended, the ZIP format to include other data or relax constraints (e.g. the multiple mimetype files, and corrupt Central Directory section). As @sparkhi says, we've updated DROID to use a modern ZIP library but it still won't allow FLA files to be used with container signatures.

Should DROID only process valid ZIP files? Validity of a file format should come after identification in my opinion. Can the ZIP library be configured to ignore many of these errors and provide some access to the contents?

thorsted avatar Aug 14 '23 17:08 thorsted

@thorsted do you have a simple OLM file sample you could share (or do you know if we already have one of these from you?)

steve-daly avatar Aug 14 '23 17:08 steve-daly

> Should DROID only process valid ZIP files? Validity of a file format should come after identification in my opinion. Can the ZIP library be configured to ignore many of these errors and provide some access to the contents?

Sadly, these aren't just warnings from the ZIP library that could be suppressed. The files are missing some essential elements of the ZIP specification that this library (and others) need in order to extract them fully. Effectively, although the compressed FLA format has some elements in common with the ZIP format specification, it's not actually a ZIP file.

steve-daly avatar Aug 14 '23 17:08 steve-daly

> Effectively, although the compressed FLA format has some elements in common with the ZIP format specification, it's not actually a ZIP file.

7Zip identifies all my FLA and OLM samples as ZIP compressed and uncompresses them all successfully, although it reports an error.

thorsted avatar Aug 14 '23 18:08 thorsted