InflaterInputStream doesn't reset position of underlying stream to end of deflated data
Steps to reproduce
- Obtain a file that contains a deflated blob with some other data following it. In my case, I have a binary file where there is a deflated blob, followed by some additional ints/floats/etc. It's important that there is trailing data after the deflated blob. It doesn't matter what the data is, it could be garbage. It just has to exist. The code below assumes there are at least 4 bytes of trailing data.
- Run this simple program against it, after filling in appropriate values for InFile, InFileCompressedDataStart, InFileCompressedLength, and InFileUncompressedLength:
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

internal class Program
{
    private static string InFile = @"";
    private static int InFileCompressedDataStart = 0;
    private static int InFileCompressedLength = 0;
    private static int InFileUncompressedLength = 0;

    public static void Main(string[] args)
    {
        using (var stream = File.OpenRead(InFile))
        {
            stream.Position = InFileCompressedDataStart;
            //Encounter deflated blob in stream, and inflate it
            using (var zStream = new InflaterInputStream(stream))
            {
                zStream.IsStreamOwner = false;
                var buffer = new byte[InFileUncompressedLength];
                var streamStartPos = stream.Position;
                Console.WriteLine($"Stream position before inflate: {streamStartPos}");
                //Request the full uncompressed size, which matches the buffer length
                var read = zStream.Read(buffer, 0, InFileUncompressedLength);
                Console.WriteLine($"InflaterInputStream read {read} bytes");
                var streamEndPos = stream.Position;
                Console.WriteLine($"Stream position after inflate: {streamEndPos}");
                Console.WriteLine($"Stream position moved {streamEndPos - streamStartPos} bytes during read");
            }
            //Continue on with data following deflated blob
            var testBuffer = new byte[4];
            stream.Read(testBuffer, 0, 4); //This reads from the wrong offset in the file!
        }
    }
}
- Note the console output
Expected behavior
stream.Position should have progressed by InFileCompressedLength, so that the trailing data can be read independently.
Actual behavior
stream.Position progressed by more than InFileCompressedLength, leaving it partway through completely unrelated data.
Version of SharpZipLib
1.3.1
Obtained from
- Package installed using NuGet
This is probably due to the internal buffering. It might be possible to just seek to the end of the deflated data here if the underlying stream supports it:
https://github.com/icsharpcode/SharpZipLib/blob/06ff713469fd6e1c1cdd2ad3b364248e457a1b96/src/ICSharpCode.SharpZipLib/Zip/Compression/Streams/InflaterInputStream.cs#L665-L668
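Until something like that exists in the library, the same seek can probably be done from the calling side. This is only a rough sketch and it assumes a few things: the underlying stream is seekable, the blob has a zlib header (which is what the default Inflater expects), and Inflater.TotalIn reports the number of compressed bytes the inflater actually consumed, which is my understanding of it. InflateHelper and InflateAndRewind are made-up names for illustration, not part of SharpZipLib.

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.Zip.Compression;
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

public static class InflateHelper
{
    // Hypothetical helper: inflates one blob starting at the current position
    // of a seekable stream, then moves the stream to the first byte after it.
    public static byte[] InflateAndRewind(Stream stream)
    {
        long start = stream.Position;
        // Keep our own reference to the inflater so we can query it afterwards.
        var inflater = new Inflater();

        using (var output = new MemoryStream())
        using (var zStream = new InflaterInputStream(stream, inflater))
        {
            zStream.IsStreamOwner = false;
            // Reads until the inflater reports the end of the deflated data.
            zStream.CopyTo(output);
            // The buffer has read ahead; step back to where the blob really ended
            // (assumes TotalIn counts only the bytes the inflater consumed).
            stream.Position = start + inflater.TotalIn;
            return output.ToArray();
        }
    }
}
```

Reading to the end of the deflated data, instead of stopping once the expected number of bytes has arrived, is deliberate: it gives the inflater a chance to consume any trailer before TotalIn is queried.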
It might also be possible to do this:
using (var zStream = new InflaterInputStream(stream, new Inflater(), 1))
{
    // ...
}
That limits the buffer to a single byte, so the stream checks whether more input is needed before reading every single byte and only advances the underlying stream when necessary. It will probably be really slow, but as long as the data is byte-aligned it should work afaik.
Another possibility would be to add in SubStream. A sample implementation is written here: https://stackoverflow.com/questions/6949441/how-to-expose-a-sub-section-of-my-stream-to-a-user
So basically, the underlying Stream is first wrapped in a SubStream instance which has its length set to the value you already have. This SubStream is then given to the constructor of InflaterInputStream. SubStream basically limits how much can be read from the underlying stream before it signals end of stream by itself.
Your code above would change to something like this:
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

internal class Program
{
    private static string InFile = @"";
    private static int InFileCompressedDataStart = 0;
    private static int InFileCompressedLength = 0;
    private static int InFileUncompressedLength = 0;

    public static void Main(string[] args)
    {
        using (var stream = File.OpenRead(InFile))
        {
            stream.Position = InFileCompressedDataStart;
            //Encounter deflated blob in stream, and inflate it
            //SubStream caps reads at InFileCompressedLength bytes of the underlying stream
            using (var subStream = new SubStream(stream, InFileCompressedDataStart, InFileCompressedLength))
            {
                using (var zStream = new InflaterInputStream(subStream))
                {
                    zStream.IsStreamOwner = false;
                    var buffer = new byte[InFileUncompressedLength];
                    var streamStartPos = stream.Position;
                    Console.WriteLine($"Stream position before inflate: {streamStartPos}");
                    var read = zStream.Read(buffer, 0, InFileUncompressedLength);
                    Console.WriteLine($"InflaterInputStream read {read} bytes");
                    var streamEndPos = stream.Position;
                    Console.WriteLine($"Stream position after inflate: {streamEndPos}");
                    Console.WriteLine($"Stream position moved {streamEndPos - streamStartPos} bytes during read");
                }
            }
            //Continue on with data following deflated blob
            var testBuffer = new byte[4];
            stream.Read(testBuffer, 0, 4); //Should now read from the offset right after the deflated blob
        }
    }
}
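Note that SubStream is not part of SharpZipLib or the BCL, so you have to bring your own. A minimal read-only sketch could look like the following. It is not the implementation from the linked answer, just an illustration with the same (stream, offset, length) constructor shape, and it assumes a non-seekable base stream is already positioned at the start of the window.

```csharp
using System;
using System.IO;

// Hypothetical helper: a read-only window of `length` bytes over `baseStream`,
// starting at `offset`. Reads never go past the end of the window.
public sealed class SubStream : Stream
{
    private readonly Stream baseStream;
    private readonly long length;
    private long consumed; // bytes handed out so far

    public SubStream(Stream baseStream, long offset, long length)
    {
        this.baseStream = baseStream ?? throw new ArgumentNullException(nameof(baseStream));
        if (offset < 0) throw new ArgumentOutOfRangeException(nameof(offset));
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        this.length = length;
        // Seekable streams are moved to the window start; a non-seekable
        // stream is assumed to be positioned there already.
        if (baseStream.CanSeek) baseStream.Position = offset;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        long remaining = length - consumed;
        if (remaining <= 0) return 0; // signal end of stream at the window end
        int toRead = (int)Math.Min(count, remaining);
        int read = baseStream.Read(buffer, offset, toRead);
        consumed += read;
        return read;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => length;

    public override long Position
    {
        get => consumed;
        set => throw new NotSupportedException();
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    // Dispose is intentionally not forwarded, so the base stream stays open.
}
```

Because the window reports end of stream after exactly `length` bytes, the read-ahead buffer of whatever wraps it cannot pull in any of the trailing data.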
A simpler option, since you have all the values up front: before you continue reading, compare the calculated position with the actual position in the stream, and if they don't match, call stream.Seek or set stream.Position. If the underlying stream is in fact non-seekable, using something like SubStream is the only viable option that doesn't sacrifice a lot of performance.
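For the seekable case, the position check could look roughly like this. StreamPositioning and SeekPastBlob are made-up names for illustration, and the sketch assumes the start offset and compressed length you pass in are accurate.

```csharp
using System;
using System.IO;

public static class StreamPositioning
{
    // Hypothetical helper: after inflating, put `stream` at the first byte
    // following the deflated blob, given its known start and compressed length.
    public static void SeekPastBlob(Stream stream, long blobStart, long compressedLength)
    {
        if (!stream.CanSeek)
            throw new NotSupportedException("Stream is not seekable; wrap it in something like SubStream instead.");

        long expected = blobStart + compressedLength;
        // Only touch the stream if the inflater's buffering moved it elsewhere.
        if (stream.Position != expected)
            stream.Position = expected;
    }
}
```

In the program from the original report, this would be called as StreamPositioning.SeekPastBlob(stream, InFileCompressedDataStart, InFileCompressedLength) right after the using block that wraps the InflaterInputStream and before reading the trailing data.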