sharpcompress
Incomplete iteration over TarArchive.Entries leaves the archive not fully loaded
Hello, I just stumbled upon an error in your library. I saw strange behavior: sometimes a certain file was found in a given tar archive, and sometimes not. On further investigation I noticed that the file was found in the archive whenever I used the debugger to inspect tarArchive.Entries manually. It looks like LazyReadOnlyCollection has a bug that is triggered when the first iteration over the archive is incomplete.
E.g.
var archive = TarArchive.Open("PathToArchiveWithTwoFiles");
var first = archive.Entries.First();
Assert.Equals(2,archive.Entries.Count); //this fails
I currently work around this in my code with:
var archive = TarArchive.Open("PathToArchiveWithTwoFiles");
archive.Entries.ToArray();
var first = archive.Entries.First();
Assert.Equals(2,archive.Entries.Count); //this succeeds
Yes. I guess the lazy collection doesn't consider Count as a trigger to fully load the entries.
I tried to reproduce the error in your tests, and it seems the bug is a little more complicated than I initially thought. The problem is not the lazy evaluation by itself: I also have to call OpenEntryStream on the first entry without closing the stream (that was a bug in my code). Afterwards the second entry is not found. Unfortunately I have not been able to reproduce the error with your archives yet.
By the way, what is the expected behavior when:
- search for a file
- open its stream without closing it
- search for another file that is behind the first one
- Should the second one be found?
- If it is not allowed to search for an entry (that has not been discovered yet) while an entry stream is open, shouldn't there be an exception?
- Is it allowed to open streams of multiple entries simultaneously?
- Should an entry stream be closed by the caller?
I don't have an expected behavior for opening an entry stream without closing it, because you have to close it: the reader needs to move on to the next file.
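To illustrate why an unclosed entry stream can make a later entry disappear, here is a minimal sketch using a toy tar-like layout. It is not SharpCompress code: `ToyTar`, `Build`, and `ListNames` are hypothetical names, and the assumption that an unparseable header simply ends the listing is a simplification made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Toy model (NOT SharpCompress): each entry is a 512-byte header block
// followed by its data, padded with zeros to a multiple of 512 bytes.
public static class ToyTar
{
    public const int BlockSize = 512;

    // Builds an in-memory archive from (name, content) pairs.
    public static MemoryStream Build(params (string Name, string Content)[] files)
    {
        var ms = new MemoryStream();
        foreach (var (name, content) in files)
        {
            var header = new byte[BlockSize];
            var data = Encoding.ASCII.GetBytes(content);
            Encoding.ASCII.GetBytes(name).CopyTo(header, 0);
            // Size as octal ASCII at offset 124, magic at offset 257, as in ustar headers.
            Encoding.ASCII.GetBytes(Convert.ToString(data.Length, 8)).CopyTo(header, 124);
            Encoding.ASCII.GetBytes("ustar").CopyTo(header, 257);
            ms.Write(header, 0, header.Length);
            ms.Write(data, 0, data.Length);
            var padding = (BlockSize - data.Length % BlockSize) % BlockSize;
            ms.Write(new byte[padding], 0, padding);
        }
        ms.Position = 0;
        return ms;
    }

    // Walks the stream sequentially and returns the entry names it can find.
    // closeEntryStreams == false models a caller who reads one byte of each
    // entry and never closes the entry stream, so the remaining data is not skipped.
    public static List<string> ListNames(Stream s, bool closeEntryStreams)
    {
        var names = new List<string>();
        var header = new byte[BlockSize];
        while (TryReadBlock(s, header))
        {
            // Assumption of this sketch: an invalid header silently ends the listing.
            if (Encoding.ASCII.GetString(header, 257, 5) != "ustar") break;
            names.Add(Encoding.ASCII.GetString(header, 0, 100).TrimEnd('\0'));
            var size = Convert.ToInt32(Encoding.ASCII.GetString(header, 124, 12).TrimEnd('\0'), 8);
            if (closeEntryStreams)
            {
                // Closing the entry skips its data and padding, landing on the next header.
                s.Seek((size + BlockSize - 1) / BlockSize * BlockSize, SeekOrigin.Current);
            }
            else
            {
                s.ReadByte(); // partial read; the position is now in the middle of the data
            }
        }
        return names;
    }

    // Fills the buffer completely, or returns false at end of stream.
    private static bool TryReadBlock(Stream s, byte[] buffer)
    {
        var total = 0;
        while (total < buffer.Length)
        {
            var n = s.Read(buffer, total, buffer.Length - total);
            if (n == 0) return false;
            total += n;
        }
        return true;
    }
}
```

In this model, `ToyTar.ListNames(ToyTar.Build(("a.txt", "hello"), ("b.txt", "world")), true)` finds both names, while passing `false` finds only `a.txt`, because the next header read starts in the middle of the first entry's data.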
If it is not allowed to search for an entry (that has not been discovered yet) while an entry stream is open, shouldn't there be an exception?
This is not detectable, given the freedom of the API.
Is it allowed to open streams of multiple entries simultaneously?
Not all formats allow this, so the code doesn't expect it for any format.
Should an entry stream be closed by the caller?
You should always close your streams when you create them through any API.
I think this situation (an entry stream that was not closed) can sometimes be detected. I think TarHeader.Read can check whether reader.BaseStream is aligned to 512 bytes. Or is there a situation where reader.BaseStream is legitimately not aligned to 512 bytes?
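A rough sketch of what such an alignment check could look like follows. `TarAlignment`, `IsBlockAligned`, and `EnsureBlockAligned` are hypothetical helpers, not part of SharpCompress, and the sketch assumes a seekable stream and a known archive start offset.

```csharp
using System;
using System.IO;

// Hypothetical guard (NOT part of SharpCompress): checks that a stream is
// positioned on a 512-byte block boundary before a header is read from it.
public static class TarAlignment
{
    public const int BlockSize = 512;

    // archiveStart is the stream offset at which the tar data begins
    // (an assumption of this sketch; 0 for a plain .tar file).
    public static bool IsBlockAligned(Stream stream, long archiveStart = 0)
    {
        if (!stream.CanSeek)
        {
            // Position is unavailable on non-seekable streams; a byte-counting
            // wrapper would be needed there.
            throw new NotSupportedException("Alignment check needs a seekable stream.");
        }
        return (stream.Position - archiveStart) % BlockSize == 0;
    }

    // Throws instead of letting a misaligned header read fail silently.
    public static void EnsureBlockAligned(Stream stream, long archiveStart = 0)
    {
        if (!IsBlockAligned(stream, archiveStart))
        {
            throw new InvalidOperationException(
                "Stream is not on a 512-byte boundary; was a previous entry stream left open?");
        }
    }
}
```

A check like this only catches some cases: a caller who happened to read an exact multiple of 512 bytes from the open entry would leave the stream aligned but still pointing at data rather than a header.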
By the way, why does TarHeaderFactory.ReadHeader yield null when there was an exception while parsing the header?
I can't recall. Maybe a good idea at the time that hasn't panned out.