MD5 FAIL

Open yangguang8112 opened this issue 4 years ago • 6 comments

Hello, I recently ran into a problem using repaq-0.3.0: after decompressing, the md5 check failed. Comparing the decompressed FASTQ file with the original, I found that in some reads an N had become a G after compression. Is this a machine problem or an algorithm problem? I compressed 160 files and 20 of them were affected. Recompressing those 20 gives the same result. Here is the output of the --compare parameter:

"result":"failed", "msg":"The RFQ file and FASTQ file have different sequence in the 7815 pair. GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGGAATGGCAGCAGACTGAGAGCTTCTGGA | GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGNAATGGCAGCAGACTGAGAGCTTCTGGA"
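For anyone wanting to repeat the file comparison described above, here is a minimal Python sketch. It is not part of repaq. It assumes plain 4-line FASTQ records (no wrapped sequences), and the function names `read_sequences` and `find_mismatched_reads` are made up for illustration.

```python
import gzip
from itertools import zip_longest


def open_text(path):
    """Open a FASTQ file as text, handling .gz files transparently."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_sequences(path):
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    with open_text(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record holds the bases
                yield line.rstrip("\n")


def find_mismatched_reads(original_path, decompressed_path):
    """Return (1-based read index, original seq, decompressed seq) for reads that differ."""
    mismatches = []
    pairs = zip_longest(read_sequences(original_path),
                        read_sequences(decompressed_path))
    for read_index, (original_seq, decompressed_seq) in enumerate(pairs, start=1):
        # zip_longest pads with None, so a missing read also counts as a mismatch
        if original_seq != decompressed_seq:
            mismatches.append((read_index, original_seq, decompressed_seq))
    return mismatches
```

Calling `find_mismatched_reads("original.fq.gz", "decompressed.fq")` (hypothetical file names) returns an empty list when every sequence line matches.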

yangguang8112, Mar 10 '20 04:03

Are you using the latest version?

If yes, can you upload a small file that can reproduce this issue?

sfchen, Mar 10 '20 04:03

Can you please try the latest version, v0.4.0?

sfchen, Mar 16 '20 08:03

> Can you please try the latest version, v0.4.0?

Sorry, I was a bit busy last week. I will try version 0.4.0 this week.

yangguang8112, Mar 16 '20 08:03

Thank you. If it fails again, please upload a small file that reproduces this issue.

sfchen, Mar 16 '20 08:03

test1.fq.gz test2.fq.gz

The same error occurs in the latest version. The files above reproduce the problem.

yangguang8112, Mar 17 '20 06:03

Any update on this? I can reproduce this issue with the given test files.

repaq -c -i test1.fq.gz -I test2.fq.gz -o testGZ.rfq.xz
repaq --compare --in1 test1.fq.gz --in2 test2.fq.gz -r testGZ.rfq.xz
{
	"result":"failed",
	"msg":"The RFQ file and FASTQ file have different sequence in the 7815 pair. GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGGAATGGCAGCAGACTGAGAGCTTCTGGA | GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGATGAAGAACCCAAGNAATGGCAGCAGACTGAGAGCTTCTGGA",
	"fastq_reads":15630,
	"rfq_reads":15630,
	"fastq_bases":1563000,
	"rfq_bases":1563000
}
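
To see exactly where the two sequences in the "msg" field diverge, a small Python sketch like the one below can help. It assumes the sequence before the `|` comes from the RFQ file and the one after it from the FASTQ file. That is inferred from the wording of the message and from the report that N became G, and is not confirmed by the repaq source. The helper name `diff_positions` is made up for illustration.

```python
def diff_positions(rfq_seq, fastq_seq):
    """Return (1-based position, rfq_base, fastq_base) for every differing base."""
    if len(rfq_seq) != len(fastq_seq):
        raise ValueError("sequences have different lengths")
    return [
        (position, rfq_base, fastq_base)
        for position, (rfq_base, fastq_base)
        in enumerate(zip(rfq_seq, fastq_seq), start=1)
        if rfq_base != fastq_base
    ]


# Sequences copied from the "msg" field above.
# Assumption: the first is the RFQ side, the second the original FASTQ side.
rfq_seq = ("GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGAT"
           "GAAGAACCCAAGGAATGGCAGCAGACTGAGAGCTTCTGGA")
fastq_seq = ("GGACCTCTTCTGACTGATGGGAAATCACAGCAGTTGGAGACCCAGGTCCACAGGAAGGAT"
             "GAAGAACCCAAGNAATGGCAGCAGACTGAGAGCTTCTGGA")

for position, rfq_base, fastq_base in diff_positions(rfq_seq, fastq_seq):
    print(f"position {position}: RFQ has {rfq_base}, FASTQ has {fastq_base}")
```

The two reads differ at a single base, where the N in the FASTQ read appears as a G on the RFQ side.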

BTW, I cloned the git repository (v0.4.0 is not in the releases).

bartns, Aug 27 '20 09:08