BZip2InputStream.Length reports an incorrect length
Steps to reproduce
- Load already compressed bzip2 payload into a
MemoryStream - Create
BZip2InputStreamfrom saidMemoryStream Assert.AreNotEqual(memorystream.Length, bzip2stream.Length);
Example code is included below.
Expected behavior
BZip2InputStream.Length should return how many bytes the developer can expect to read from the stream.
Actual behavior
BZip2InputStream.Length returns 48, but only 11 bytes are actually available in the stream. Uncompressed data is a string of 11 bytes, compressed data blob is 48 bytes. It seems to just return the length reported by underlying stream.
Version of SharpZipLib
Version 1.3.1
Obtained from
- Package installed using NuGet
Example code
using System.IO;
using System.Threading.Tasks;
using ICSharpCode.SharpZipLib.BZip2;
using NUnit.Framework;
namespace Tests {
[TestFixture]
public class StreamLengthTest {
private static readonly byte[] CompressedChunk = {
0x42, 0x5a, 0x68, 0x39, 0x31, 0x41, 0x59, 0x26, 0x53, 0x59, 0xd5, 0xb7, 0x4b, 0xf5, 0x00, 0x00,
0x03, 0xc1, 0x80, 0x00, 0x10, 0x23, 0x84, 0x44, 0x00, 0x20, 0x00, 0x31, 0x0c, 0x08, 0x20, 0xd1,
0xb4, 0x8b, 0xa4, 0x28, 0x05, 0x3c, 0x5d, 0xc9, 0x14, 0xe1, 0x42, 0x43, 0x56, 0xdd, 0x2f, 0xd4
};
[Test]
public async Task Test() {
// Read the bzip2 blob into a memory stream
await using var memorystream = new MemoryStream(CompressedChunk);
// Create bzip2 stream from memorystream
await using var bzip2stream = new BZip2InputStream(memorystream);
// Sanity checks
Assert.AreEqual(48, CompressedChunk.Length, "The constant binary blob should have length 48");
Assert.AreEqual(CompressedChunk.Length, memorystream.Length, "memorystream should be a 1:1 representation of CompressedChunk. Lengths should match.");
Assert.IsTrue(memorystream.CanRead);
Assert.IsTrue(bzip2stream.CanRead);
// Read data from the stream
var expectedLength = bzip2stream.Length;
var actualLength = 0;
byte[] buffer = {0};
try {
while (0 < bzip2stream.Read(buffer, 0, 1)) {
actualLength++;
}
} catch { /* ignored */ }
// Observe results
Assert.AreEqual(expectedLength, actualLength, "Actual length should equal reported length");
}
}
}
Exerpt from src/ICSharpCode.SharpZipLib/BZip2/BZip2InputStream.cs line 158 to 167:
/// <summary>
/// Gets the length in bytes of the stream.
/// </summary>
public override long Length
{
get
{
return baseStream.Length;
}
}
You cannot get the length without decompressing the stream first. The only alternative would be to throw a NotSupportedException instead, which is probably more correct (and is what System.IO.Compression.GZipStream does). The problem is that it might break code that rely on this old behavior.
We should add a deprecation warning (ObsoleteAttribute) on it though.
Since uncompressed size is not part of the bzip2 header, I see that this is not possible to implement properly. I would prefer almost anything to silent failure though, since this could cause (and demonstrably has caused) wasted developer hours. Marking it obsolete would be a good first step.