
archive/zip: improve Zip64 compatibility with 7z

Open Sangmin-Simon-Lee opened this issue 1 year ago • 6 comments

Go version

go1.23.1

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/sangmin5.lee/.cache/go-build'
GOENV='/home/sangmin5.lee/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/sangmin5.lee/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/sangmin5.lee/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/sangmin5.lee/dev/go/goroot'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/sangmin5.lee/dev/go/goroot/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.1'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/sangmin5.lee/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build4058537502=/tmp/go-build -gno-record-gcc-switches'

What did you do?

  • How to reproduce this issue

** Prepare two files, one of size 5G and one of size 1M
$ touch test_5G
$ shred -n 1 -s 5G test_5G
$ touch test_1M
$ shred -n 1 -s 1M test_1M

** Create a zip file containing the 5G file
zipfile, err := os.Create("s5G.zip")
zipWriter := zip.NewWriter(zipfile)
newfile, err := os.Open("test_5G")

fileInfo, err := newfile.Stat()
header, err := zip.FileInfoHeader(fileInfo)

header.Name = "test_5G"
header.Method = zip.Deflate

writer, err := zipWriter.CreateHeader(header)
_, err = io.Copy(writer, newfile)

** Get 7z 23.01 or higher from https://sourceforge.net/projects/sevenzip/files/7-Zip/23.01/ and try to add the 1M file to the created zip

$ 7zz a s5G.zip test_1M

What did you see happen?

7-Zip (z) 23.01 (x86) : Copyright (c) 1999-2023 Igor Pavlov : 2023-06-20
 32-bit ILP32 locale=en_US.utf8 Threads:96 OPEN_MAX:131072, ASM

Open archive: s5G.zip

WARNINGS: Headers Error

--
Path = s5G.zip
Type = zip
WARNINGS:
Headers Error
Physical Size = 5370519708
64-bit = +
Characteristics = Zip64

Scanning the drive: 1 file, 1048576 bytes (1024 KiB)

Updating archive: s5G.zip

Keep old data in archive: 1 file, 5368709120 bytes (5120 MiB)
Add new data to archive: 1 file, 1048576 bytes (1024 KiB)

System ERROR: E_NOTIMPL : Not implemented

What did you expect to see?

7z should report "Everything is Ok" with no errors, and the archive contents should list both files:

$ unzip -l 5G.zip
Archive:  5G.zip
    Length      Date    Time    Name
----------  ---------- -----   ----
   1048576  2024-09-11 07:47   test_1M
5368709120  2024-09-11 07:50   test_5G
----------                     -------
5369757696                     2 files

Sangmin-Simon-Lee avatar Sep 12 '24 04:09 Sangmin-Simon-Lee

I'd like to propose a potential fix for this issue.

https://go-review.googlesource.com/c/go/+/612595

Sangmin-Simon-Lee avatar Sep 12 '24 04:09 Sangmin-Simon-Lee

CC @dsnet, @bradfitz, @ianlancetaylor.

timothy-king avatar Sep 12 '24 18:09 timothy-king

Quoting https://go.dev/cl/612595:

Symptoms: An error occurs when 7z adds or updates files in a zip archive that includes files over 4GB.

Reasons:

  1. Header Inconsistency: The main header writes a 32-bit value even though a 64-bit value is written to the Zip64 header. The offset value should be set to uint32max (0xFFFFFFFF) to indicate that a 64-bit value is used.

  2. Zip64 Detection: 7z primarily uses the Extra Field in the Local File Header to detect the Zip64 format. If this field is missing, 7z assumes a 32-bit Data Descriptor, which leads to errors. The Local File Header's Extra Field should include the Zip64 information even when a Data Descriptor is used.

Solution:

  1. Ensure that the offset value in the main header is set to 0xFFFFFFFF when writing 64-bit values in the Zip64 header.
  2. Include the Zip64 Extra Field in the Local File Header to align with 7z handling of Zip64 archives.

ianlancetaylor avatar Sep 12 '24 18:09 ianlancetaylor

CC @dsnet

ianlancetaylor avatar Sep 12 '24 18:09 ianlancetaylor

Note that changing this might invalidate checksums based on the full content of zip files for files created before this change.

Go modules containing files greater than 4 GB could be affected.

That behavior can be made opt-in or at least needs a Go experiment flag to opt out.

nightlyone avatar Sep 13 '24 04:09 nightlyone

> Note that changing this might invalidate checksums based on the full content of zip files for files created before this change.
>
> Go modules containing files greater than 4 GB could be affected.
>
> That behavior can be made opt-in or at least needs a Go experiment flag to opt out.

The go.sum checksums are well-specified directory hashes, not hashes of the full zip files, precisely because zip files should not be assumed to have a stable, canonical format.

FiloSottile avatar Nov 30 '25 23:11 FiloSottile

Change https://go.dev/cl/725161 mentions this issue: archive/zip: fix Zip64 edge cases

gopherbot avatar Nov 30 '25 23:11 gopherbot