
CRC check failed when reading after seeking

Open mxmlnkn opened this issue 4 years ago • 1 comments

System Information:

  • Ubuntu 20.10
  • Python 3.8.10
  • rarfile 4.0
  • unar v1.10.1
  • UNRAR 5.61 beta 1

Both unar and unrar are in my PATH, so I don't know which one is used. I don't think I have bsdtar installed.
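To narrow down which extraction tool could be in play, a small standard-library check can list which of the tools named above resolve on PATH. This is only a sketch: `find_extraction_tools` and `CANDIDATE_TOOLS` are hypothetical names, and it does not tell you which tool rarfile actually selects, only which ones it could find.

```python
import shutil

# Executable names taken from the report above. The order in which rarfile
# tries them is an assumption here, not something this script verifies.
CANDIDATE_TOOLS = ("unrar", "unar", "bsdtar")


def find_extraction_tools(names=CANDIDATE_TOOLS):
    """Map each tool name to its resolved path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_extraction_tools().items():
        print(f"{name}: {path or 'not found'}")
```

Temporarily removing one tool from PATH and rerunning the reproduction would then show whether the failure depends on the backend.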

Steps to reproduce:

  1. Create test rar: echo foo > bar && rar a bar.rar bar
  2. Open with rarfile and seek and read:
import rarfile
rar = rarfile.RarFile("bar.rar")
file = rar.open("bar")
# These read calls were only to show that rarfile generally works but it seems they are somewhat important for reproduction!
file.read(1)  # b'f'
file.read(1)  # b'o'
file.read(1)  # b'o'
file.read(1)  # b'\n'
file.read(1)  # b''
# Seeking to 0 is no problem. Again, these calls can be omitted for reproducing the problem
file.seek(0)
file.read()   # b'foo\n'
# Here begins the problematic sequence
file.seek(1)  # 1
file.read()
---------------------------------------------------------------------------
BadRarFile                                Traceback (most recent call last)
<ipython-input-42-f3fc120c03c1> in <module>
----> 1 file.read()

~/.local/lib/python3.8/site-packages/rarfile.py in read(self, n)
   2200         if not data or self._remain == 0:
   2201             # self.close()
-> 2202             self._check()
   2203         return data
   2204 

~/.local/lib/python3.8/site-packages/rarfile.py in _check(self)
   2216             raise BadRarFile("Failed the read enough data")
   2217         if final != exp:
-> 2218             raise BadRarFile("Corrupt file - CRC check failed: %s - exp=%r got=%r" % (
   2219                 self._inf.filename, exp, final))
   2220 

BadRarFile: Corrupt file - CRC check failed: bar - exp=2117232040 got=3195718521
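The exp= value in the message is the plain CRC-32 of the four bytes b'foo\n', so the stored checksum looks fine and the mismatch comes from what was hashed while reading. The sketch below is a diagnostic aid only: it computes running CRC-32 values for a few hypothetical byte sequences that a checksum state might have seen if it were not reset on reopen, and flags any that equal the reported got= value. The candidate list is a guess, not something taken from the rarfile source.

```python
import zlib

CONTENT = b"foo\n"      # what `echo foo > bar` writes
EXPECTED = 2117232040   # exp= value from the error message
REPORTED = 3195718521   # got= value from the error message


def crc_of_chunks(*chunks):
    """Running CRC-32 over the chunks, as if the state were never reset."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
    return crc


# Hypothetical sequences of bytes a stale checksum state might have hashed.
CANDIDATES = {
    "full content": (CONTENT,),
    "tail after seek(1)": (CONTENT[1:],),
    "full content, then tail": (CONTENT, CONTENT[1:]),
    "full content twice, then tail": (CONTENT, CONTENT, CONTENT[1:]),
}

for label, chunks in CANDIDATES.items():
    crc = crc_of_chunks(*chunks)
    notes = []
    if crc == EXPECTED:
        notes.append("matches exp")
    if crc == REPORTED:
        notes.append("matches got")
    print(f"{label}: {crc} {' '.join(notes)}")
```

If none of the candidates matches got=, the hashed bytes were something else, and adding further sequences to `CANDIDATES` is the obvious next step.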

Forward seeking does not seem to be a problem. This works:

rar = rarfile.RarFile("bar.rar")
file = rar.open("bar")
file.seek(1)
file.read()

However, as soon as I seek backwards, the problem arises even with crc_check=False, which makes it even weirder!

rar = rarfile.RarFile("bar.rar", crc_check=False)
file = rar.open("bar")
file.read(2)
file.seek(1)
file.read()  # exception!

mxmlnkn avatar Sep 13 '21 21:09 mxmlnkn

I took a quick look at the source and documentation and it seems that backward seeking is supposed to be implemented by reopening the file. Somehow that reopen isn't effective enough. My workaround, which also simply reopens the file, works without problems:

import io


class RawFileInsideRar(io.RawIOBase):
    def __init__(self, reopen, file_size):
        self.reopen = reopen
        self.fileobj = reopen()
        self.file_size = file_size

    def __enter__(self):
        return self

    def __exit__(self, exception_type, exception_value, exception_traceback):
        self.close()

    def close(self) -> None:
        self.fileobj.close()

    def fileno(self) -> int:
        # This is a virtual Python level file object and therefore does not have a valid OS file descriptor!
        raise io.UnsupportedOperation()

    def seekable(self) -> bool:
        return self.fileobj.seekable()

    def readable(self) -> bool:
        return self.fileobj.readable()

    def writable(self) -> bool:
        return False

    def read(self, size: int = -1) -> bytes:
        return self.fileobj.read(size)

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        if whence == io.SEEK_CUR:
            offset += self.tell()
        elif whence == io.SEEK_END:
            offset += self.file_size

        if offset >= self.tell():
            return self.fileobj.seek(offset, io.SEEK_SET)

        self.fileobj = self.reopen()
        return self.fileobj.seek(offset, io.SEEK_SET)

    def tell(self) -> int:
        return self.fileobj.tell()

Replacing the rar.open("bar") call in my minimal non-working examples with the following two lines makes them run just fine:

info = rar.getinfo("bar")
file = RawFileInsideRar(lambda: rar.open(info), info.file_size)
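The reopen-on-backward-seek idea can also be exercised without rarfile or an archive. In the sketch below, `ForwardOnlyStream` is a hypothetical stand-in for a decompressed archive member that can skip ahead but never rewind, and `ReopeningReader` is a stripped-down version of the wrapper above that supports absolute offsets only. Neither name exists in rarfile.

```python
import io


class ForwardOnlyStream:
    """Hypothetical stand-in for an archive member: forward seeks only."""

    def __init__(self, payload: bytes):
        self._payload = payload
        self._pos = 0

    def tell(self) -> int:
        return self._pos

    def read(self, size: int = -1) -> bytes:
        end = len(self._payload) if size < 0 else self._pos + size
        data = self._payload[self._pos:end]
        self._pos += len(data)
        return data

    def seek(self, offset: int) -> int:
        if offset < self._pos:
            raise io.UnsupportedOperation("cannot rewind a forward-only stream")
        self._pos = min(offset, len(self._payload))
        return self._pos


class ReopeningReader:
    """Emulate backward seeks by reopening and skipping forward again."""

    def __init__(self, reopen):
        self._reopen = reopen
        self._stream = reopen()
        self.reopen_count = 0  # exposed only so the behaviour is observable

    def tell(self) -> int:
        return self._stream.tell()

    def read(self, size: int = -1) -> bytes:
        return self._stream.read(size)

    def seek(self, offset: int) -> int:
        if offset < self._stream.tell():
            # Backward seek: start over from a fresh stream.
            self._stream = self._reopen()
            self.reopen_count += 1
        return self._stream.seek(offset)


reader = ReopeningReader(lambda: ForwardOnlyStream(b"foo\n"))
print(reader.read(2))   # first two bytes
print(reader.seek(1))   # backward seek triggers a reopen
print(reader.read())    # remaining bytes from offset 1
```

Because every backward seek starts from a fresh stream, any per-stream state such as a running checksum also starts fresh, which is presumably why the workaround avoids the CRC error.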

mxmlnkn avatar Sep 25 '21 16:09 mxmlnkn

Thanks for the report!

markokr avatar Sep 17 '23 17:09 markokr