Deterministic archive?
Is there a way to make archiver create identical archives each time it is run with the same files and options?
Currently the resulting archive always has the same file size, but its bytes differ slightly between runs, so checksums calculated over several archives give different results :/
What type of archive do you use?
- GZIP has a platform-dependent header
- ZIP has date information encoded
For GZIP we have this code; each value is the base64-encoded start of a GZIP archive's binary content on that platform:
const os = require('os');

// Expected base64 prefix of the gzip output on the current platform
const gzipHeader = {
  darwin: 'H4sIAAAAAAAAE2',
  win32: 'H4sIAAAAAAAACm',
  linux: 'H4sIAAAAAAAAA2',
}[os.platform()];
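An alternative to keeping one expected header per platform is to overwrite the platform-dependent byte before comparing. This is only a minimal sketch using Node's built-in `zlib`: `normalizeGzipHeader` is a hypothetical helper, not part of archiver, and it assumes a plain gzip stream without the optional header checksum.

```javascript
const zlib = require('zlib');

// Hypothetical helper (not part of archiver): returns a copy of a gzip
// buffer with the header fields that can vary between runs or platforms
// replaced by fixed values.
function normalizeGzipHeader(gzipBuffer) {
  const out = Buffer.from(gzipBuffer); // copy, the input stays untouched
  if (out.length < 10 || out[0] !== 0x1f || out[1] !== 0x8b) {
    throw new Error('not a gzip stream');
  }
  if (out[3] & 0x02) {
    // FHCRC flag set: the header carries its own CRC, so patching
    // header bytes would invalidate it.
    throw new Error('gzip header CRC present, refusing to patch');
  }
  out.writeUInt32LE(0, 4); // MTIME (bytes 4-7): 0 means "no timestamp"
  out[9] = 0xff;           // OS (byte 9): 255 means "unknown"
  return out;
}

const original = zlib.gzipSync(Buffer.from('id,name\n1,Alice\n'));
const normalized = normalizeGzipHeader(original);
```

The compressed payload and its trailing checksum are untouched, so the normalized buffer still decompresses to the same content.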
For ZIP we specify date:
zip.append(chain, {
name: `customers.csv`,
date: new Date('2000-07-18T20:18:24.441Z'),
}).finalize();
Note: the name of each file appended to the archive is also encoded, so it must stay the same in order to get exactly the same file content.
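To illustrate why pinning the date matters: the ZIP format stores each entry's modification time as a pair of 16-bit MS-DOS date and time fields, so an entry stamped with a different time on each run produces different bytes. The sketch below shows that encoding for the date used above. `toDosDateTime` is a hypothetical helper, and reading the fields in UTC is an assumption; a library may use local time instead.

```javascript
// Hypothetical helper: packs a Date into the 16-bit MS-DOS date and time
// fields used by ZIP entry headers. UTC is an assumption here.
function toDosDateTime(date) {
  const dosDate =
    ((date.getUTCFullYear() - 1980) << 9) | // years since 1980
    ((date.getUTCMonth() + 1) << 5) |       // month 1-12
    date.getUTCDate();                      // day 1-31
  const dosTime =
    (date.getUTCHours() << 11) |
    (date.getUTCMinutes() << 5) |
    (date.getUTCSeconds() >> 1);            // stored in 2-second steps
  return { dosDate, dosTime };
}

const fixed = toDosDateTime(new Date('2000-07-18T20:18:24.441Z'));
```

Since the fields only have 2-second resolution, the milliseconds in the fixed date do not affect the output.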
For additional information about GZIP header see https://www.forensicswiki.org/wiki/Gzip
I was indeed using ZIP, so I will try gzip and see if this already fixes my problem. That would be super awesome. Will report back.
~~For zip even with specifying the same date, the hash of the zip file differs (even though the contents are identical).~~
~~Does anyone have a way of achieving deterministic zip archives?~~
Edit: I retract that. With the date specified, it does seem to work!
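For anyone who wants to verify this themselves, one way is to build the archive twice and compare hashes. A minimal sketch with Node's built-in `crypto`; it uses `zlib.gzipSync` as a stand-in for the real archive build, and `sha256` is just an illustrative helper name.

```javascript
const crypto = require('crypto');
const zlib = require('zlib');

// Illustrative helper: hex SHA-256 digest of a buffer.
function sha256(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('hex');
}

// Stand-in for two separate archive builds from the same input and options.
const input = Buffer.from('id,name\n1,Alice\n');
const firstBuild = zlib.gzipSync(input);
const secondBuild = zlib.gzipSync(input);

// True when both builds are byte-for-byte identical.
const identical = sha256(firstBuild) === sha256(secondBuild);
```

With a real archive, the two buffers would come from two runs of the archiver with the same files, names, and fixed date.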