rust-ar
rust-ar copied to clipboard
Entry `drop` function can take a long time
The current drop implementation for Entry
is as follows:
impl<'a, R: 'a + Read> Drop for Entry<'a, R> {
fn drop(&mut self) {
if self.position < self.length {
// Consume the rest of the data in this entry.
let mut remaining = self.reader.take(self.length - self.position);
let _ = io::copy(&mut remaining, &mut io::sink());
}
}
}
When dealing with large entries, this function can take a very long time. Could there be a reason for this implementation? Do we really need to consume the reader?
This is necessary to seek the reader to the header of the next entry. There is no Seek
bound, so reader.seek()
can't be used.
I'm confused. I'm not familiar with the code but why don't we use Reader::seek here? Eventually, this drop implementation takes a lot of time when dealing with a lot of entries in a big AR file. Can't it be solved?
reader.seek()
requires a Seek
bound. It could be added, but because it currently isn't, .seek()
can't be used without a breaking change. A breaking change may be acceptable. At least it would be fine with me speaking as user of rust-ar.
@mdsteele WDYT about adding Seek
bound to Reader
?
BTW what about calling consume
?
Yes, it was written that way to avoid needing a Seek
bound. At the time I was trying to stay similar to the API of the tar
crate, which also does not require a Seek
bound. Once nice feature of not requiring Seek
is that you can easily extract a compressed archive by using a reader adapter that provides streaming decompression (for example, GzipReader
, which doesn't implement Seek
). It'd be nice not to lose that capability.
I poked around just now, though, and it looks like tar
recently landed a (not yet published) change to provide an alternate Archive::entries_with_seek
method to allow opting in to a more efficient implementation when the underlying reader implements Seek
. So that might be a good approach.