webdav4
When the file is large, seek is very slow
In stream.py, the seek function is:
def seek(self, offset: int, whence: int = 0) -> int:  # noqa: C901
    """Seek the file object."""
    if whence == 0:
        loc = offset
    elif whence == 1:
        if offset >= 0:
            self.read(offset)
            return self.loc
        loc = self.loc + offset
    elif whence == 2:
        if not self.size:
            raise ValueError("cannot seek to the end of file")
        loc = self.size + offset
    else:
        raise ValueError(f"invalid whence ({whence}, should be 0, 1 or 2)")
    if loc < 0:
        raise ValueError("Seek before start of file")
    if loc and not self.supports_ranges:
        raise ValueError("server does not support ranges")
    self.close()
    self._cm = iter_url(self.client, self.url, pos=loc, chunk_size=self.chunk_size)
    # pylint: disable=no-member
    _, self._iterator = self._cm.__enter__()
    self.loc = loc
    return loc
When whence == 1 and offset >= 0, seek reads through the stream up to the target offset:
    if offset >= 0:
        self.read(offset)
        return self.loc
    loc = self.loc + offset
Seeking 1 GB forward therefore reads 1 GB of content first, which is very inefficient. If I comment out the if statement, seek still works: it creates a new iterator and uses the Range header to jump straight to the position.
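For illustration, an open-ended range request of the kind described above might carry a header like the one built here. The helper name range_header is hypothetical and not part of webdav4; how iter_url constructs its request internally is not shown in this thread.

```python
def range_header(pos: int) -> dict:
    """Build headers asking the server for bytes from `pos` to the end of the file."""
    if pos < 0:
        raise ValueError("position must be non-negative")
    # "bytes=<first>-" is the open-ended byte-range form of the HTTP Range header.
    return {"Range": f"bytes={pos}-"}


# Start 1 GiB into the file instead of downloading the first 1 GiB.
headers = range_header(1024 ** 3)
```

A server that supports ranges would normally answer such a request with 206 Partial Content and send only the bytes from that position onward.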
I think it was added on the assumption that with SEEK_CUR the offsets are small and may already be cached in our buffer, and because I wanted to avoid resetting the iterator as much as possible (not all WebDAV servers support ranges).
Feel free to propose a PR. 🙂
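One possible compromise between the two positions in this thread, sketched against a toy in-memory stream rather than the real webdav4 class: keep reading forward for short hops (or when the server lacks range support), and reopen at the new position for long ones. The RangeStream class, the seek_cur method, and the read_ahead_limit threshold are all hypothetical names chosen for this example, and the threshold value is arbitrary.

```python
class RangeStream:
    """Toy stand-in for a remote file; counts the bytes it 'downloads'."""

    def __init__(self, data: bytes, supports_ranges: bool = True,
                 read_ahead_limit: int = 1024) -> None:
        self._data = data
        self.supports_ranges = supports_ranges
        self.read_ahead_limit = read_ahead_limit  # hypothetical threshold
        self.loc = 0
        self.bytes_transferred = 0
        self.reopens = 0

    def read(self, n: int) -> bytes:
        chunk = self._data[self.loc:self.loc + n]
        self.loc += len(chunk)
        self.bytes_transferred += len(chunk)
        return chunk

    def _reopen(self, loc: int) -> None:
        # Stands in for closing the iterator and issuing a new ranged request.
        self.reopens += 1
        self.loc = loc

    def seek_cur(self, offset: int) -> int:
        """SEEK_CUR only: read forward for short hops, reopen otherwise."""
        short_hop = offset <= self.read_ahead_limit
        if offset >= 0 and (short_hop or not self.supports_ranges):
            self.read(offset)
            return self.loc
        loc = self.loc + offset
        if loc < 0:
            raise ValueError("Seek before start of file")
        if loc and not self.supports_ranges:
            raise ValueError("server does not support ranges")
        self._reopen(loc)
        return self.loc


stream = RangeStream(b"x" * 10_000)
stream.seek_cur(100)    # short hop: read forward, no new request
stream.seek_cur(5_000)  # long hop: reopen at the new position
print(stream.loc, stream.bytes_transferred, stream.reopens)
```

In this sketch a server without range support still falls back to reading forward, so forward seeks keep working there at the old cost.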