Don't simply quit on bad CRC?
To read from faulty media it might be helpful not to just "crash" with an exception when a bad CRC is encountered. I simply quit because, once a CRC is wrong, I can no longer be sure that the next bits belong to the next block. However, for the parallel version I added a block finder, which can search for the magic bit strings that start bzip2 blocks. I could use that to recover from bad blocks. Ironically, this is not the only new feature I get out of the box from the parallelized design.
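To illustrate the idea: every bzip2 block begins with the 48-bit magic `0x314159265359`, which is not byte-aligned, so recovery means scanning the stream bit by bit. The sketch below is hypothetical (`find_block_offsets` is not the library's actual block finder), just a minimal demonstration of the bit-level search:

```python
import bz2

BLOCK_MAGIC = 0x314159265359   # 48-bit magic that starts every bzip2 block
MAGIC_BITS = 48

def find_block_offsets(data: bytes, max_hits: int = 10):
    """Return bit offsets at which the bzip2 block magic occurs.

    Hypothetical sketch, not the library's real block finder: slide a
    48-bit window over the stream one bit at a time, MSB first, which
    matches bzip2's big-endian bit order.
    """
    total_bits = len(data) * 8
    stream = int.from_bytes(data, "big")
    mask = (1 << MAGIC_BITS) - 1
    hits = []
    for offset in range(total_bits - MAGIC_BITS + 1):
        shift = total_bits - MAGIC_BITS - offset
        if (stream >> shift) & mask == BLOCK_MAGIC:
            hits.append(offset)
            if len(hits) >= max_hits:
                break
    return hits

# The first block magic sits right after the 32-bit "BZh9" file header,
# i.e., at bit offset 32.
print(find_block_offsets(bz2.compress(b"hello world")))
```

After a bad block, decompression could resume at the next reported offset instead of aborting the whole archive.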
How to do error reporting then, a simple message to stderr?
Sometimes I get an incomplete bz2 file, indexed_bzip2 hangs, and my program can't continue. The standard bz2 package returns an error in this case.
It hangs? How exactly are you calling it? This is probably a slightly different problem, especially as the checksum is at the end of the file, so it is not available when the archive is incomplete.
I'm calling it from Python:
import indexed_bzip2
with indexed_bzip2.open( bzfile, parallelization=6 ) as source:
memfile = source.read()
and the program does not respond. But bz2 raises an exception:
import bz2
with open(bzfile, 'rb') as source:
memfile = bz2.decompress(source.read())
I cannot reproduce it. I tried with files generated like so:
base64 /dev/urandom | head -c 1024 | bzip2 | head -c 100 > base64.bz2.truncated
base64 /dev/urandom | head -c $(( 32 * 1024 * 1024 )) | bzip2 | head -c $(( 8 * 1024 * 1024 )) > base64.bz2.truncated
and with bzfile in your script being replaced with the path:
import indexed_bzip2
with indexed_bzip2.open( "base64.bz2.truncated", parallelization=6 ) as source:
memfile = source.read()
For both, I get:
> python3 issue-7.py
Traceback (most recent call last):
File "issue-7.py", line 6, in <module>
memfile = source.read()
^^^^^^^^^^^^^
File "indexed_bzip2.pyx", line 251, in indexed_bzip2._IndexedBzip2FileParallel.readinto
RuntimeError: std::exception
> echo $?
1
I.e., it does not hang. Are you using your script in a pipe? Is your bzfile a file object or a path?
Thank you so much! My problem was related to the inode limit on Ubuntu.