kaitai_struct

Question: How can I do this?

Open AIDDQD opened this issue 2 years ago • 2 comments

I want to implement the zip64 extra field of central_dir_entry, but the 64-bit fields appear in the file only when the equivalent 32-bit field overflows. [image] I added this, but obviously it's not working, because I can't determine which field has overflowed.

P.S. English is not my native language, so feel free to ask any questions if something is unclear.

AIDDQD avatar Jul 22 '22 12:07 AIDDQD

I want to implement zip64 extra field of central_dir_entry

Note that there is a draft pull request https://github.com/kaitai-io/kaitai_struct_formats/pull/602 that adds support for some ZIP64-related data structures.

I added this but obviously its not working, because i can't determine what field is overflowed

Sorry, I'm not really familiar with how the ZIP64 extension works (and I don't feel like I have that much spare time to study it), so this doesn't make much sense to me, but I noticed a comment on it in the aforementioned PR (https://github.com/kaitai-io/kaitai_struct_formats/pull/602#issue-1205617035):

The ZIP64 extra field has not been enabled yet, because with test files I am seeing different values. This needs some extra research.

So @armijnhemel perhaps may be interested and shed some light on this?

generalmimon avatar Aug 07 '22 14:08 generalmimon

ZIP is a fairly complex format, with many exceptions and implementations that do not adhere to the ZIP standard. A ZIP file is basically a list of entries, followed by a lookup table called the "central directory" that points to the right entries in the file. All unzip programs rely on the central directory.
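To make the role of the central directory concrete, here is a minimal Python sketch of how a reader might locate it: search backwards from the end of the file for the "end of central directory" record and read the directory's offset and size from it. The function name `find_central_directory` is made up for this illustration, and the sketch assumes a single-disk archive without a ZIP64 end-of-central-directory record; it also does not guard against the signature bytes appearing inside the archive comment.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"
# signature, disk number, disk with central directory, entries on this disk,
# total entries, central directory size, central directory offset, comment length
EOCD_FORMAT = "<4sHHHHIIH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # 22 bytes


def find_central_directory(data):
    """Return (offset, size, total_entries) of the central directory.

    Hypothetical helper: assumes a single-disk archive with no ZIP64
    end-of-central-directory record.
    """
    # The record can be followed by an archive comment of up to 65535 bytes,
    # so the signature has to be searched for from the end of the data.
    start = max(0, len(data) - EOCD_SIZE - 0xFFFF)
    pos = data.rfind(EOCD_SIGNATURE, start)
    if pos < 0 or pos + EOCD_SIZE > len(data):
        raise ValueError("end of central directory record not found")
    (_sig, _disk, _cd_disk, _entries_here, total_entries,
     cd_size, cd_offset, _comment_len) = struct.unpack_from(EOCD_FORMAT, data, pos)
    return cd_offset, cd_size, total_entries
```

Each central directory entry found this way carries the offset of the matching local header, which is why parsing from the end of the file tends to be more robust than parsing from the start.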

Parsing ZIP files from the start of the file will likely mean you will bump into all kinds of exceptions, especially with files from Android (because Google put all kinds of extra data in between the entries and the central directory even though that is not what the ZIP documentation allows).

Parsing the ZIP64 extra field with a fixed layout does not always work, because not all of its fields are always present (at least, that's the situation I have seen). The ZIP documentation section 4.5.3 says:

If one of the size or
offset fields in the Local or Central directory
record is too small to hold the required data,
a Zip64 extended information record is created.
The order of the fields in the zip64 extended 
information record is fixed, but the fields MUST
only appear if the corresponding Local or Central
directory record field is set to 0xFFFF or 0xFFFFFFFF.

This means that before you can see which fields are present you need to look at the local header but also at the information in the central directory. Looking at the local header is easy when parsing from the beginning of the file, but looking at both the local header and the one in the central directory is impossible when parsing from the beginning of the file.
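As an illustration of the rule quoted above, the following Python sketch resolves the real values for a central directory entry: each 64-bit field is read from the ZIP64 record only if the corresponding field in the entry holds the sentinel value. The function name `resolve_zip64` and its argument names are invented for this example; the field order (uncompressed size, compressed size, local header offset, disk start number) and the header ID 0x0001 follow section 4.5.3 of the ZIP documentation. The same idea could presumably be expressed in a .ksy file by passing the entry's values into the extra field type as parameters, but that is an assumption and not something taken from the pull request mentioned earlier.

```python
import struct

ZIP64_HEADER_ID = 0x0001
MAX_U4 = 0xFFFFFFFF
MAX_U2 = 0xFFFF


def resolve_zip64(uncompressed, compressed, header_offset, disk_start, extra):
    """Apply ZIP64 overrides to the values of one central directory entry.

    Hypothetical helper: the first four arguments are the 32-bit/16-bit values
    read from the entry, `extra` is the raw bytes of its extra field.
    """
    pos = 0
    while pos + 4 <= len(extra):
        # Every extra field record starts with a 2-byte ID and a 2-byte size.
        header_id, size = struct.unpack_from("<HH", extra, pos)
        body = extra[pos + 4:pos + 4 + size]
        pos += 4 + size
        if header_id != ZIP64_HEADER_ID:
            continue
        cur = 0
        # Fixed order; a field is present only if its 32-bit twin is saturated.
        if uncompressed == MAX_U4:
            (uncompressed,) = struct.unpack_from("<Q", body, cur)
            cur += 8
        if compressed == MAX_U4:
            (compressed,) = struct.unpack_from("<Q", body, cur)
            cur += 8
        if header_offset == MAX_U4:
            (header_offset,) = struct.unpack_from("<Q", body, cur)
            cur += 8
        if disk_start == MAX_U2:
            (disk_start,) = struct.unpack_from("<I", body, cur)
        break
    return uncompressed, compressed, header_offset, disk_start
```

This follows the strict reading of the quoted passage. An archive whose writer always emits every field would be misparsed by it, and a record shorter than expected makes `struct.unpack_from` raise `struct.error`.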

The documentation isn't very clear on whether all fields should be overridden in the ZIP64 header (I think most implementations take this approach) or only the fields that need to be overridden plus any fields that precede them.

What I have personally done is wrap a script around the Kaitai Struct parser and parse individual entries whenever I encounter them. This works well for almost all files I have encountered (apart from a few that I bumped into a few days ago).

armijnhemel avatar Aug 07 '22 18:08 armijnhemel