archive/zip: improve Zip64 compatibility with 7z
Go version
go1.23.1
Output of go env in your module/workspace:
GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/sangmin5.lee/.cache/go-build'
GOENV='/home/sangmin5.lee/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/sangmin5.lee/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/sangmin5.lee/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/sangmin5.lee/dev/go/goroot'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/sangmin5.lee/dev/go/goroot/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.1'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/sangmin5.lee/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4058537502=/tmp/go-build -gno-record-gcc-switches'
What did you do?
- How to reproduce this issue
** Prepare two files, one 5G and one 1M in size
   $ touch test_5G
   $ shred -n 1 -s 5G test_5G
   $ touch test_1M
   $ shred -n 1 -s 1M test_1M
** Create a zip file containing the 5G file (error handling omitted)
   zipfile, err := os.Create("s5G.zip")
   zipWriter := zip.NewWriter(zipfile)
   newfile, err := os.Open("test_5G")
   fileInfo, err := newfile.Stat()
   header, err := zip.FileInfoHeader(fileInfo)
   header.Name = "test_5G"
   header.Method = zip.Deflate
   writer, err := zipWriter.CreateHeader(header)
   _, err = io.Copy(writer, newfile)
** Get 7z 23.01 or higher from https://sourceforge.net/projects/sevenzip/files/7-Zip/23.01/ and try to add the 1M file to the created zip
$ 7zz a s5G.zip test_1M
What did you see happen?
7-Zip (z) 23.01 (x86) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
 32-bit ILP32 locale=en_US.utf8 Threads:96 OPEN_MAX:131072, ASM
Open archive: s5G.zip
WARNINGS: Headers Error
--
Path = s5G.zip
Type = zip
WARNINGS: Headers Error
Physical Size = 5370519708
64-bit = +
Characteristics = Zip64
Scanning the drive:
1 file, 1048576 bytes (1024 KiB)
Updating archive: s5G.zip
Keep old data in archive: 1 file, 5368709120 bytes (5120 MiB)
Add new data to archive: 1 file, 1048576 bytes (1024 KiB)
System ERROR: E_NOTIMPL : Not implemented
What did you expect to see?
7z should finish without errors, and listing the archive should show both files:
$ unzip -l 5G.zip
Archive:  5G.zip
    Length        Date    Time   Name
   1048576  2024-09-11   07:47   test_1M
5368709120  2024-09-11   07:50   test_5G
5369757696                       2 files
Related Issues and Documentation
- archive/zip: zip64 extra headers problems #33116
- zip.NewReader read some format zip file but write file has panic #43709 (closed)
- archive/zip: cannot parse file header with compressed size or local file header offset of 0xffffffff #31692
- archive/zip: CRC, Compressed Length, and Uncompressed Length fields are not filled in on local headers created by zip.Writer #54666 (closed)
- archive/zip: Writer drops fs.FileMode data #48166
- archive/zip: unzip with java fails on tip #25215 (closed)
- archive/zip: checksum error in readDataDescriptor when reading valid zip file #66157
- compress/flate: deflatefast produces corrupted output #41420 (closed)
- archive/zip: compression performance #20031 (closed)
- zip archive invalid but works with other utilities #45338 (closed)
I'd like to propose a potential fix for this issue.
https://go-review.googlesource.com/c/go/+/612595
CC @dsnet, @bradfitz, @ianlancetaylor.
Quoting https://go.dev/cl/612595:
Symptoms: An error occurs when 7z adds or updates files in a zip archive that includes files over 4GB.
Reasons:
- Header Inconsistency: The main header writes a 32-bit value even though a 64-bit value is written to the Zip64 header. The offset value should be set to uint32max (0xFFFFFFFF) to indicate that a 64-bit value is used.
- Zip64 Detection: 7z primarily uses the Extra Field in the Local File Header to detect Zip64 format. If this field is missing, 7z assumes 32bit Data Descriptor, leading to errors. The Extra Field should include the Zip64 information in the Local File Header, even if Data Descriptor is used.
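To check the second point against a given archive, the first local file header can be parsed by hand. The sketch below is only a diagnostic aid: the names inspectLocalHeader, localHeaderInfo and smallArchive are made up for illustration, and the field offsets follow the local file header layout in the ZIP application note. It only looks at the entry that starts at offset 0 of the byte slice.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	localHeaderSig = 0x04034b50 // "PK\x03\x04"
	localHeaderLen = 30         // fixed-size part of a local file header
	zip64ExtraID   = 0x0001     // Zip64 extended information extra field
	flagDataDesc   = 0x0008     // general purpose bit 3: data descriptor follows
)

// localHeaderInfo is a hypothetical summary type for this sketch.
type localHeaderInfo struct {
	Name               string
	UsesDataDescriptor bool
	HasZip64Extra      bool
}

// inspectLocalHeader parses the local file header at the start of b.
func inspectLocalHeader(b []byte) (localHeaderInfo, error) {
	var info localHeaderInfo
	le := binary.LittleEndian
	if len(b) < localHeaderLen || le.Uint32(b) != localHeaderSig {
		return info, errors.New("not a local file header")
	}
	flags := le.Uint16(b[6:])
	nameLen := int(le.Uint16(b[26:]))
	extraLen := int(le.Uint16(b[28:]))
	if len(b) < localHeaderLen+nameLen+extraLen {
		return info, errors.New("truncated local file header")
	}
	info.Name = string(b[localHeaderLen : localHeaderLen+nameLen])
	info.UsesDataDescriptor = flags&flagDataDesc != 0

	// Walk the extra field: each record is a 2-byte ID, 2-byte size, then data.
	extra := b[localHeaderLen+nameLen : localHeaderLen+nameLen+extraLen]
	for len(extra) >= 4 {
		id := le.Uint16(extra)
		size := int(le.Uint16(extra[2:]))
		if id == zip64ExtraID {
			info.HasZip64Extra = true
			break
		}
		if 4+size > len(extra) {
			break
		}
		extra = extra[4+size:]
	}
	return info, nil
}

// smallArchive builds a tiny in-memory archive with archive/zip for demonstration.
func smallArchive() []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create("a.txt")
	if err != nil {
		panic(err)
	}
	if _, err := w.Write([]byte("hello")); err != nil {
		panic(err)
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	info, err := inspectLocalHeader(smallArchive())
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", info)
}
```

For the 5G archive from the report, reading the first few kilobytes of s5G.zip and passing them to inspectLocalHeader would show whether the local header carries a Zip64 extra field alongside the data descriptor flag.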
Solution:
- Ensure that the offset value in the main header is set to 0xFFFFFFFF when writing 64-bit values in the Zip64 header.
- Include the Zip64 Extra Field in the Local File Header to align with 7z handling of Zip64 archives.
CC @dsnet
Note that changing this might invalidate checksums that are computed over the full bytes of a zip file, for archives created before this change.
Go modules containing files greater than 4 GB could be affected.
The new behavior could be made opt-in, or at least needs a Go experiment flag to opt out.
The go.sum checksums are well-specified directory hashes, not hashes of the full zip files, precisely because zip files should not be assumed to have a stable, canonical format.
Change https://go.dev/cl/725161 mentions this issue: archive/zip: fix Zip64 edge cases