japsa
XM data corruption
The XM compressor still has a data corruption issue. Compressing certain input and then decompressing it produces corrupted output, i.e., the decompressed data differs from the original file.
Test data size: 30,244 bytes
Test data link: http://kirill.med.u-tokai.ac.jp/data/temp/xm-repro-4-input.zip
Commands to reproduce:
Compress:
jsa.xm.compress --hashSize=11 --context=15 --limit=200 --threshold=0.15 --chance=20 --real=archive.xm original.fasta
Decompress:
jsa.xm.compress --hashSize=11 --context=15 --limit=200 --threshold=0.15 --chance=20 --decode=archive.xm --output=decompressed.fasta
Compare:
cmp original.fasta decompressed.fasta
Produces: original.fasta decompressed.fasta differ: byte 27512, line 274
The decompressed file has the correct size, but its sequence data is corrupted. The issue was found during testing for the Sequence Compression Benchmark.
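In case it helps with narrowing this down, here is a minimal Python sketch for inspecting the mismatch beyond the first differing byte that `cmp` reports. It is not part of japsa. The function names (`first_difference`, `count_differences`, `compare_files`) are hypothetical helpers made up for this report, and the sketch assumes both files fit in memory and have the same size, as observed here.

```python
def first_difference(a: bytes, b: bytes):
    """Return (byte_offset, line_number) of the first mismatch, or None.

    Both values are 1-based, following the convention cmp uses in its report.
    Only the common length of the two buffers is compared.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            # Line number = newlines preceding the differing byte, plus one.
            return i + 1, a.count(b"\n", 0, i) + 1
    return None


def count_differences(a: bytes, b: bytes) -> int:
    """Count positions (over the common length) where the buffers differ."""
    return sum(x != y for x, y in zip(a, b))


def compare_files(path_a: str, path_b: str) -> dict:
    """Read two files fully and summarize how they differ (hypothetical helper)."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    return {
        "same_size": len(a) == len(b),
        "first_difference": first_difference(a, b),
        "differing_bytes": count_differences(a, b),
    }
```

Calling `compare_files("original.fasta", "decompressed.fasta")` should show whether the corruption is a single byte or continues from the first mismatch to the end of the file, which may hint at where the decoder state diverges.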
Let me know if you need any additional information or help.