
Plain text attack on ZIP failed

Open sergeevabc opened this issue 1 month ago • 6 comments

Setup

  • Windows 7 x64, bkcrack 1.8.1
  • x.zip, encrypted archive with x.txt
  • plain.zip, non-encrypted archive with plain.txt which contains the beginning of x.txt

Goal

  • Get x.txt from x.zip

Analysis

$ 7z l -slt x.zip
7-Zip (a) 25.01 (x64) : Copyright (c) 1999-2025 Igor Pavlov : 2025-08-03
Scanning the drive for archives:
1 file, 340 bytes (1 KiB)
Listing archive: x.zip
--
Path = x.zip
Type = zip
Physical Size = 340
----------
Path = x.txt
Folder = -
Size = 214
Packed Size = 196
Modified = 2025-12-01 16:27:20.0000000
Created = 2025-12-01 17:15:22.4588250
Accessed = 2025-12-01 17:15:22.4588250
Attributes = A
Encrypted = +
Comment =
CRC = C44D13D5
Method = ZipCrypto Deflate:Maximum
Characteristics = NTFS : Encrypt
Host OS = FAT
Version = 20
Volume Index = 0
Offset = 0

$ 7z l -slt plain.zip
7-Zip (a) 25.01 (x64) : Copyright (c) 1999-2025 Igor Pavlov : 2025-08-03
Scanning the drive for archives:
1 file, 181 bytes (1 KiB)
Listing archive: plain.zip
--
Path = plain.zip
Type = zip
Physical Size = 181
----------
Path = plain.txt
Folder = -
Size = 27
Packed Size = 29
Modified = 2025-12-02 15:10:49.8030051
Created = 2025-12-02 15:10:09.1097236
Accessed = 2025-12-02 15:10:49.8030051
Attributes = A
Encrypted = -
Comment =
CRC = A558474B
Method = Deflate:Maximum
Characteristics = NTFS
Host OS = FAT
Version = 20
Volume Index = 0
Offset = 0

Attack

$ bkcrack -C x.zip -c x.txt -P plain.zip -p plain.txt
bkcrack 1.8.1 - 2025-10-25
[15:18:24] Z reduction using 22 bytes of known plaintext
100.0 % (22 / 22)
[15:18:25] Attack on 347691 Z values at index 6
100.0 % (347691 / 347691)
[16:05:46] Could not find the keys.

Dear @kimci86 and community members, why were no keys found? What else can be done?

sergeevabc avatar Dec 02 '25 16:12 sergeevabc

The reason is that the deflate compression applied to x.txt before encryption in x.zip produced different data than deflate compression of the smaller prefix plain.txt in plain.zip.

If we look at the compressed deflate stream structure in plain.zip with infgen:

$ cat plain.zip | tail -c +40 | head -c 29 | infgen -r -dd
! infgen 3.6 output
!
last			! 1
fixed			! 01
literal 'd		! 00101001
literal '2		! 01000110
literal 'V		! 01100001
literal 'i		! 10011001
literal 'd		! 00101001
literal 'H		! 00011110
literal 'V		! 01100001
literal 'u		! 10100101
literal 'b		! 01001001
literal 'm		! 10111001
literal 'V		! 01100001
literal 's		! 11000101
literal 'I		! 10011110
literal 'F		! 01101110
literal 's		! 11000101
literal 'y		! 10010101
literal 'M		! 10111110
literal 'D		! 00101110
literal 'A		! 10001110
literal 'x		! 00010101
literal 'O		! 11111110
literal 'm		! 10111001
literal 'R		! 01000001
literal 'i		! 10011001
literal 'O		! 11111110
literal 'D		! 00101110
literal 'p		! 00000101
end			! 0000000
			! 000000

We see it ends with an end code and padding bits that are most probably not there in the compressed x.txt.
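The `tail -c +40 | head -c 29` step in the infgen command hard-codes where the compressed data sits: a ZIP local file header is 30 fixed bytes plus the file name (9 bytes for plain.txt) plus an extra field (empty here), so the data starts at offset 39 and runs for the 29 bytes of Packed Size. As a minimal sketch, the same extraction can be done in Python without counting bytes by hand. The helper name `raw_member_data` and the in-memory demo archive are illustrative assumptions, not part of bkcrack.

```python
import io
import struct
import zipfile
import zlib


def raw_member_data(zip_bytes: bytes, name: str) -> bytes:
    """Return the bytes stored for `name`, exactly as they sit in the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(name)
    # Local file header: 30 fixed bytes, then the file name, then the extra field.
    header = zip_bytes[info.header_offset:info.header_offset + 30]
    signature, name_len, extra_len = struct.unpack("<4s22xHH", header)
    if signature != b"PK\x03\x04":
        raise ValueError("not a local file header")
    start = info.header_offset + 30 + name_len + extra_len
    return zip_bytes[start:start + info.compress_size]


# Demo on an archive built in memory (a stand-in for plain.zip).
sample = b"d2VidHVubmVsIFsyMDAxOmRiODp"  # the 27 literals shown by infgen
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("plain.txt", sample)

raw = raw_member_data(buffer.getvalue(), "plain.txt")
# wbits=-15 tells zlib to expect a raw deflate stream, which is what ZIP stores.
restored = zlib.decompress(raw, -15)
print(len(raw), restored == sample)
```

For an encrypted entry the same span would be ciphertext, beginning with the 12-byte ZipCrypto encryption header, so it cannot be inflated directly.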

We can ignore that ending by truncating the plaintext, using 27 instead of 29 bytes from the compressed plain.txt:

bkcrack -C x.zip -c x.txt -P plain.zip -p plain.txt -t 27

But that doesn't work.

I suspect this is because x.txt is big enough that it makes using so-called dynamic Huffman codes worth it instead of fixed Huffman codes. If that is true, it means the compressed x.txt before encryption was made of a representation of a Huffman tree followed by data encoded using that tree. On the other hand, compressing a small prefix uses a fixed Huffman tree. So the compressed plain.txt is probably very different from the compressed x.txt. Guessing compressed data from a small prefix is hard.
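To see the effect, here is a small sketch that compresses a text and its 27-byte prefix with Python's zlib and compares the resulting raw deflate streams. It uses zlib rather than the tool that produced x.zip, and the sample text is a hypothetical stand-in for x.txt, so the exact bytes will not match the archive in this issue. The helper names `raw_deflate` and `block_type` are made up for illustration.

```python
import zlib


def raw_deflate(data: bytes, level: int = 9) -> bytes:
    """Compress to a raw deflate stream (no zlib or gzip wrapper), as ZIP stores it."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def block_type(stream: bytes) -> str:
    """Name the type of the first deflate block (bits 1-2 of the first byte)."""
    return ("stored", "fixed", "dynamic", "reserved")[(stream[0] >> 1) & 3]


# Hypothetical sample text standing in for x.txt.
full_text = b"".join(
    b"line %03d: the quick brown fox jumps over the lazy dog\n" % i
    for i in range(40)
)
prefix_text = full_text[:27]

full_stream = raw_deflate(full_text)
prefix_stream = raw_deflate(prefix_text)

print("prefix:", block_type(prefix_stream), len(prefix_stream), "bytes")
print("full:  ", block_type(full_stream), len(full_stream), "bytes")
# Knowing a plaintext prefix does not give a prefix of the compressed stream.
print("compressed prefix is a prefix of compressed file:",
      full_stream.startswith(prefix_stream))
```

Even when both streams happen to use the same block type, the compressed prefix ends with an end-of-block code that the longer stream does not have at that position.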

I hope it helps!

kimci86 avatar Dec 02 '25 20:12 kimci86

To reduce mystery and bring clarity, I share x.txt as is so you can investigate further.

SHA256: 331cb26a5f4f1563a03042e8bd3fc89dc664f22e8952936d1b351f39604907a9

Here's what interests me: information security resources repeat that ZipCrypto is a broken data protection mechanism that must not be used under any circumstances today, but here I am, trying to crack a real ZIP archive with a Deflate twist, but it isn't happening. So maybe ZipCrypto can still be used instead of AES (supplied with 3rd party tools such as 7-zip) when backward compatibility (decompressing in legacy Windows) is important?

And if so, what conditions, besides using Deflate, should be met to resist a known plaintext attack? E.g. complex password, minimum file size, number of files…

sergeevabc avatar Dec 02 '25 20:12 sergeevabc

I don't know which conditions are enough to make a ZipCrypto archive absolutely safe, but I can tell you it is easily breakable in some cases.

If the archive contains at least one stored file (ZipCrypto Store) of known format (png, jpg, zip, 7z, exe, ...), using magic numbers or structure from that format specification can provide enough data for a known plaintext attack and decrypt the entire archive. If a stored file is known fully, it is even easier and faster, using the entire file as known plaintext.
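As an illustration of the magic-number case, the sketch below assembles the bytes every PNG file starts with: the 8-byte signature followed by the length and type of the mandatory IHDR chunk. That gives 16 contiguous known bytes, which is above bkcrack's documented minimum of 12 known bytes with at least 8 contiguous. The function name `enough_for_attack` is a hypothetical helper for this example only.

```python
# First 8 bytes of every PNG file.
PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")
# The IHDR chunk always comes first: 4-byte big-endian length (13), then the type.
IHDR_PREFIX = (13).to_bytes(4, "big") + b"IHDR"

known_plaintext = PNG_SIGNATURE + IHDR_PREFIX


def enough_for_attack(known: bytes) -> bool:
    """Check a contiguous run against the 12-byte minimum (8 of them contiguous)."""
    return len(known) >= 12


print(known_plaintext.hex(), enough_for_attack(known_plaintext))
```

Written to a file (say png_header.bin, a made-up name), these bytes could be passed with `-p` when the encrypted entry is a stored PNG, for example `bkcrack -C archive.zip -c image.png -p png_header.bin`, where the archive and entry names are placeholders.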

If the archive contains only compressed files (e.g. ZipCrypto Deflate) it is harder but sometimes still possible.

One condition that makes it not so hard is having one of the files known in full. Then it is only a matter of compressing that known file using the same compression tool and settings as the encrypted version, or at least a combination of options that produces the same compressed data. The compressed file can then be used for a known plaintext attack, which allows decrypting the entire archive on success.
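One cheap way to narrow down the settings is to compare sizes: the Packed Size reported for a ZipCrypto entry includes a 12-byte encryption header, so the deflate stream itself should be 12 bytes shorter. The sketch below tries zlib's level and memLevel combinations and keeps those whose output has the expected length. It only covers zlib, and `matching_settings` is a hypothetical helper. A size match is a hint, not proof that the bytes are identical, and an archiver using another deflate implementation will produce no match at all.

```python
import zlib

ENCRYPTION_HEADER_SIZE = 12  # ZipCrypto prepends 12 bytes to each entry's data


def matching_settings(data: bytes, packed_size: int) -> list[tuple[int, int]]:
    """List (level, memLevel) pairs whose raw deflate output has the expected size."""
    target = packed_size - ENCRYPTION_HEADER_SIZE
    hits = []
    for level in range(1, 10):
        for mem_level in range(1, 10):
            comp = zlib.compressobj(level, zlib.DEFLATED, -15, mem_level)
            stream = comp.compress(data) + comp.flush()
            if len(stream) == target:
                hits.append((level, mem_level))
    return hits


# Demo with made-up data: pretend the archive was made with level 6, memLevel 8.
known_file = b"known file contents, line after line\n" * 50
reference = zlib.compressobj(6, zlib.DEFLATED, -15, 8)
reference_stream = reference.compress(known_file) + reference.flush()
reported_packed_size = len(reference_stream) + ENCRYPTION_HEADER_SIZE

print(matching_settings(known_file, reported_packed_size))
```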

When no file is known in full, it may still be possible with some luck and knowledge of the deflate algorithm. Knowing a big prefix (several kilobytes) or a large proportion of a file can be enough to recreate some compressed data. I can't say for sure what would make it impossible to guess though. There might be some tricks I am not aware of.

Note the password is of no importance at all for a known plaintext attack.
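The reason is visible in the cipher itself: the password is only used to initialise three 32-bit keys, and everything after that depends on those keys alone. The following is a minimal sketch of the traditional PKWARE cipher as described in the ZIP specification, written for illustration. It leaves out the 12-byte encryption header that real entries carry, and the class and method names are my own.

```python
def _make_crc_table() -> list[int]:
    """Standard CRC-32 lookup table (reflected polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


CRC_TABLE = _make_crc_table()


class ZipCrypto:
    def __init__(self, password: bytes = b""):
        # Fixed initial state; the password only mixes into these three keys.
        self.keys = [0x12345678, 0x23456789, 0x34567890]
        for byte in password:
            self._update(byte)

    @classmethod
    def from_keys(cls, keys: list[int]) -> "ZipCrypto":
        """Start from a known internal state, with no password involved."""
        cipher = cls()
        cipher.keys = list(keys)
        return cipher

    def _update(self, byte: int) -> None:
        k0, k1, k2 = self.keys
        k0 = CRC_TABLE[(k0 ^ byte) & 0xFF] ^ (k0 >> 8)
        k1 = ((k1 + (k0 & 0xFF)) * 134775813 + 1) & 0xFFFFFFFF
        k2 = CRC_TABLE[(k2 ^ (k1 >> 24)) & 0xFF] ^ (k2 >> 8)
        self.keys = [k0, k1, k2]

    def _keystream_byte(self) -> int:
        temp = (self.keys[2] | 2) & 0xFFFF
        return ((temp * (temp ^ 1)) >> 8) & 0xFF

    def encrypt(self, plaintext: bytes) -> bytes:
        out = bytearray()
        for p in plaintext:
            out.append(p ^ self._keystream_byte())
            self._update(p)  # the state is always updated with the plaintext byte
        return bytes(out)

    def decrypt(self, ciphertext: bytes) -> bytes:
        out = bytearray()
        for c in ciphertext:
            p = c ^ self._keystream_byte()
            out.append(p)
            self._update(p)
        return bytes(out)


message = b"known plaintext attack demo"
ciphertext = ZipCrypto(b"a very long and complex password").encrypt(message)

# Keys as they stand right after the password has been processed.
keys_after_password = ZipCrypto(b"a very long and complex password").keys
# Decryption works from the keys alone; the password is never needed again.
recovered = ZipCrypto.from_keys(keys_after_password).decrypt(ciphertext)
# Known plaintext XOR ciphertext yields the keystream the attack works from.
keystream = bytes(p ^ c for p, c in zip(message, ciphertext))
print(recovered == message, keystream.hex())
```

Recovering the three keys is what the known plaintext attack does, so a longer or more complex password does not make that step any harder.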

I only consider Biham and Kocher's known plaintext attack above, implemented in bkcrack and based on A known plaintext attack on the PKZIP stream cipher, but ZipCrypto can also be broken in other ways with more computing resources. See for example Improved Forensic Recovery of PKZIP Stream Cipher Passwords and A Differential Meet-in-the-Middle Attack on the Zip cipher.

An attacker could also attempt more traditional password cracking with free open-source tools like hashcat or john the ripper.


By the way, what tool did you use to create x.zip? I was not able to get the correct compressed size for x.txt with either Info-ZIP or 7-Zip. I could try to guess by testing more tools (which is what an attacker would do), but I would rather not spend time on it and ask you instead, since this is a crafted example.

kimci86 avatar Dec 03 '25 06:12 kimci86

I don't know which conditions are enough to make a ZipCrypto archive absolutely safe, but I can tell you it is easily breakable in some cases.

Right. But in the case of archiving several text files, when they are likely to be compressed using Deflate, it seems that cracking such an archive will require more effort than an amateur (not a government agent or a hired professional) would like to spend.

What tool did you use to create x.zip?

bz, a console part of Bandizip. I prefer it because it allows me to add multilingual comments (among other things).

$ bz.exe a -l:9 -zopfli -p:"password" x.zip x.txt
bz 7.40(Beta,x64) - Bandizip console tool. Copyright (C) 2025 Bandisoft
Creating archive: C:\Test\x.zip
Compressing: x.txt

sergeevabc avatar Dec 03 '25 11:12 sergeevabc

But in the case of archiving several text files, when they are likely to be compressed using Deflate [...]

This should deter most amateurs, but do not underestimate how people can get obsessed with breaking it 😉 Being aware of ZipCrypto weaknesses can help you make an informed decision.


bz, a console part of Bandizip.

Interesting, so it is using the zopfli compression library under the hood. Out of curiosity, I checked whether compressing x.txt with zopfli allows cracking the archive, and it did. Good to know!

zopfli --deflate x.txt
bkcrack -C x.zip -c x.txt -p x.txt.deflate -D no_password.zip

kimci86 avatar Dec 03 '25 20:12 kimci86

bz, a console part of Bandizip.

Interesting, so it is using the zopfli compression library under the hood. Out of curiosity, I checked whether compressing x.txt with zopfli allows cracking the archive, and it did. Good to know!

Lol, bz is a poor name for an archiver. Sounds like an older version of bz2, which is completely different. Anyway yes, good to know!

An attacker could also attempt more traditional password cracking with free open-source tools like hashcat or john the ripper.

I've been meaning to implement GPU support in JtR but it's so fast on CPU I keep putting it off. Hashcat's pkzip modes are good (and of course faster than CPU), they nicked JimF's awesome early-reject from JtR for speed. BTW hashcat's "6 byte optimization" is awesome after cracking the key with bkcrack, not many passwords withstand guaranteed cracking up to length 12.

magnumripper avatar Dec 04 '25 08:12 magnumripper