repaq
MD5 FAIL
Hello, I recently found some problems using repaq-0.3.0.
After decompressing, the md5 check failed. Comparing the decompressed FASTQ file with the original, I found that in some reads an N had become a G after compression. Is this a machine problem or an algorithm problem? I compressed 160 files and 20 of them were affected. Recompressing those 20 gives the same result.
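A comparison like this can be scripted, which helps list every affected read rather than only the first one. Below is a minimal Python sketch, not part of repaq, that walks two FASTQ files record by record and reports the positions where the sequences differ. It assumes plain four-line FASTQ records (no wrapped sequences); the function names and the file paths in the usage comment are hypothetical.

```python
import gzip
from itertools import zip_longest


def open_text(path):
    """Open a FASTQ file as text, handling .gz transparently."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def read_records(path):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ file."""
    with open_text(path) as fh:
        while True:
            name = fh.readline().rstrip("\n")
            if not name:
                return  # end of file
            seq = fh.readline().rstrip("\n")
            fh.readline()  # '+' separator line, ignored
            qual = fh.readline().rstrip("\n")
            yield name, seq, qual


def mismatched_reads(original, roundtrip):
    """Yield (record number, read name, [(1-based pos, original base, new base)])."""
    pairs = zip_longest(read_records(original), read_records(roundtrip))
    for number, (a, b) in enumerate(pairs, start=1):
        if a is None or b is None:
            raise ValueError("files contain different numbers of records")
        if a[1] != b[1]:
            diffs = [(i + 1, x, y)
                     for i, (x, y) in enumerate(zip(a[1], b[1])) if x != y]
            yield number, a[0], diffs


# Usage (hypothetical paths: the original file and its decompressed copy):
# for number, name, diffs in mismatched_reads("original.fq.gz", "roundtrip.fq"):
#     print(number, name, diffs)
```

Only the sequence line is compared here; comparing `a[2]` and `b[2]` in the same way would cover the quality strings.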
Here is the output of the --compare option:
"result":"failed", "msg":"The RFQ file and FASTQ file have different sequence in the 7815 pair. GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGGAATGGCAGCAGACTGAGAGCTTCTGGA | GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGNAATGGCAGCAGACTGAGAGCTTCTGGA"
Are you using the latest version?
If yes, can you upload a small file that can reproduce this issue?
Can you please try the latest version, v0.4.0?
Sorry, I was a bit busy last week. I will try version 0.4.0 this week.
Thank you. If it fails again, please upload a small file that reproduces this issue.
test1.fq.gz test2.fq.gz
The same error occurs in the latest version; the files above reproduce the problem.
Any update on this? I can reproduce this issue with the given test files.
repaq -c -i test1.fq.gz -I test2.fq.gz -o testGZ.rfq.xz
repaq --compare --in1 test1.fq.gz --in2 test2.fq.gz -r testGZ.rfq.xz
{
"result":"failed",
"msg":"The RFQ file and FASTQ file have different sequence in the 7815 pair. GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGGAATGGCAGCAGACTGAGAGCTTCTGGA | GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGNAATGGCAGCAGACTGAGAGCTTCTGGA",
"fastq_reads":15630,
"rfq_reads":15630,
"fastq_bases":1563000,
"rfq_bases":1563000
}
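To make the mismatch in the msg field easier to see, here is a small Python sketch (not part of repaq) that reports where the two printed sequences differ. The sequences are copied from the output above in the order they appear; which side is the RFQ read and which is the FASTQ read is my assumption from the wording of the message. `diff_positions` is a name made up for this example.

```python
def diff_positions(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in a, base in b) for every mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences differ in length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Left and right sequences from the --compare message, in printed order.
first = ("GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGAT"
         "GAAGAACCCAAGGAATGGCAGCAGACTGAGAGCTTCTGGA")
second = ("GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGAT"
          "GAAGAACCCAAGNAATGGCAGCAGACTGAGAGCTTCTGGA")

for pos, x, y in diff_positions(first, second):
    print(f"position {pos}: {x} vs {y}")
```

The two reads differ at a single base, G on the left and N on the right, which matches the original report of N turning into G.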
BTW, I built this from a clone of the git repository (v0.4.0 is not in the releases).