ZIP bomb detection
Hi there.
I was wondering: Is there anything built-in that I could use for zip-bomb detection?
Based on this: http://stackoverflow.com/questions/1459080/how-can-i-protect-myself-from-a-zip-bomb I don't think that checking the file sizes will be enough.
Looking at the code, I don't see anything related, but I am not 100% sure.
Probably, using a custom Stream (inheriting from FileStream) for ZipEntry.Extract(Stream stream) that counts the extracted file bytes would do the trick but it might be good to include something like this in the library.
I will try to do this and I will post the resulting code here, if it works.
Why do you think checking the size of the uncompressed data is not enough?
I have thought about that, but I have the impression that this information is just an informational header that can be tampered. I have done so, and WinRar and other software don't even notify you when this info is wrong. DotNetZip does notify you with a CRC exception (and I am a bit confused by this, since I thought that the CRC was only protecting the zip-entry data and not the zip-entry headers), but the CRC itself can be tampered in order to match too, I guess. Anyway, I have created a read & write stream with write-quota, than tries to compress a sample of the last bytes written to the extracted file, after a quota limit is exceeded. If the compression is good enough to be suspicious, I throw an exception.
The CRC is calculated from the header and data together as far as I remember. If you tamper the crc and decompressed size, does DotNetZip keep decompressing even when it exceed the decompressed size?
If you tamper the crc and decompressed size, does DotNetZip keep decompressing even when it exceed the decompressed size?
I have not tested that, to tell you the truth. It required some more time than I had available. So, I don't know. :( Sorry.
I did what I did, in order to be 100% sure the resulting code was safe.
Sounds like you've done a bit of an investigation. That's good.
So what's actionable for this library from this thread?
I don't know if it would make sense (since this is a common vulnerability of ZIP extraction) to include a class like this and some documentation for it :
public class FileStreamWithWriteQuota : FileStream
{
private readonly long _quota = 0;
private readonly Func<Stream, bool> _shouldThrow = null;
private long _counter = 0;
public FileStreamWithWriteQuota(string path, FileMode mode, FileAccess access, FileShare share, long quota, Func<Stream, bool> shouldThrow)
: base(path, mode, access, share)
{
_quota = quota;
_shouldThrow = shouldThrow;
}
private void Increase(int count)
{
_counter += count;
if (_counter > _quota)
{
if (_shouldThrow != null)
{
if(_shouldThrow(this)) throw new StreamQuotaException();
else _counter = 0;
}
else
{
throw new StreamQuotaException();
}
}
}
public override void Write(byte[] array, int offset, int count)
{
Increase(count);
base.Write(array, offset, count);
}
public override IAsyncResult BeginWrite(byte[] array, int offset, int numBytes, AsyncCallback userCallback,
object stateObject)
{
Increase(numBytes);
return base.BeginWrite(array, offset, numBytes, userCallback, stateObject);
}
public override void WriteByte(byte value)
{
Increase(1);
base.WriteByte(value);
}
public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
{
Increase(count);
return base.WriteAsync(buffer, offset, count, cancellationToken);
}
}
When you give a proper access to the stream it allows the inspection of the extracted data when the quota is reached and allows you to decide whether to throw an exception (to stop the extraction) or not.
Note: I think it would be much easier to decide when to stop the extraction (e.g. inside _shouldThrow above) if there was a way to get the read-position within the compressed data of a zip-entry at each moment of the extraction process (so that you use the position on the compressed data and the position on the uncompressed data to estimate the compression ratio), but this might not be easy thing to do...
What do you think would be the use-case / example code of the class you posted? Note that the average user simply uses the Extract / ExtractAll methods.
I tested it now with a modified zip file. See my test code here: https://github.com/dosomder/DotNetZip.Semverd/commit/8a2eac9dec188b4d026e6aaa491c69a42dad9d09
As you mentioned before, a CRC exception was thrown. Personally I think this is already good but the exception could be renamed.
By the way, the CRC in the zip header is from the file only. It might be that DotNetZip calculates the CRC of the destination file using the length of "Uncompressed Length" in the zip header (and not the full file length) and therefore obviously results in a different CRC.
After modifying the CRC of my zipbomb test file, the file was extracted successfully. However, the resulting file size was what was set in "Uncompressed Length" and not the real full length.
Programs like WinRar just keep extracting the zip stream until it's finished. DotNetZip stops extracting when it's finished or reached "Uncompressed Length"
Good! :) I didn't know that. :)
I will have a look at the test code. The exception needs to be renamed, I guess. It confused me once.
What do you think would be the use-case / example code of the class you posted? Note that the average user simply uses the Extract / ExtractAll methods.
I have thought about this too... Perhaps there could be an ExtractSafe(path, quota, maxCompressionRatio) that uses it internally. But if it's not necessary, you can skip that. Just it's not obvious from the documentation that such security issues do not apply on this library.
Also, there is another exception when the compression algorithm is not supported, which is of the general Exception type (at least last time I checked). Making a NotSupportedCompressionAlgorithmException (or sth shorter than that) for it might be good. :)
Thanks. :)
Hi,
Would the CRC exception be thrown when using zipEntry.OpenReader? Or only when using Extract()?