
No fusion found using test data

Open chizhenfen opened this issue 5 years ago • 20 comments

Hi Shifu, no fusion was found using the test data:

./genefuse -r Homo_sapiens_assembly19.fasta -f druggable.hg19.csv -1 R1.fq.gz -2 R2.fq.gz -h r1r2n.html

15:51:11 start with 4 threads
15:51:50 mapper indexing done
15:52:20 sequence number before filtering: 0
15:52:20 removeByComplexity: 0
15:52:20 removeByDistance: 0
15:52:20 removeIndels: 0
15:54:5 matcher indexing done
15:54:5 removeAlignables: 0
15:54:5 found 0 fusions

./genefuse -r Homo_sapiens_assembly19.fasta -f druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h genefuser1r2n.html

15:55:45 start with 4 threads
15:56:25 mapper indexing done
15:56:36 sequence number before filtering: 0
15:56:36 removeByComplexity: 0
15:56:36 removeByDistance: 0
15:56:36 removeIndels: 0
15:58:25 matcher indexing done
15:58:25 removeAlignables: 0
15:58:25 found 0 fusions

Dataset was downloaded from: http://opengene.org/dataset.html

Thanks.

chizhenfen avatar Jan 01 '19 08:01 chizhenfen

I just tried again with command: ./genefuse -r ~/data/ref/hg19.fa -1 ~/data/fq/genefuse.R1.fq.gz -2 ~/data/fq/genefuse.R2.fq.gz -h test.html -j test.json -f genes/druggable.hg19.csv

and got:

15:1:4 start with 4 threads
15:1:47 mapper indexing done
15:2:38 sequence number before filtering: 1329
15:2:38 removeByComplexity: 0
15:2:38 removeByDistance: 39
15:2:38 removeIndels: 67
15:4:3 matcher indexing done
15:4:3 removeAlignables: 8

Perhaps you used an incorrect reference? I used hg19 downloaded from UCSC. Did you check the downloaded files using MD5?

sfchen avatar Jan 02 '19 07:01 sfchen

Hi sfchen,

I have the same problem. I used all the demo files you provided, including the reference genome, but I still got nothing in my result.

richardicky avatar Feb 26 '19 01:02 richardicky

Hi sfchen,

I have experienced the same problem as others have reported here. Have you gotten to the bottom of this?

Thanks.

MarcHiggins avatar May 16 '19 12:05 MarcHiggins

Can you guys check md5 for the downloaded FASTQ file?

sfchen avatar May 16 '19 13:05 sfchen

http://opengene.org/dataset.html

You should download the following files:

Paired-end FASTQ files for GeneFuse testing (Illumina platform)
genefuse.R1.fq.gz (size: 62 M, MD5: 171e6dfa0af37fe95c826005bc5fcdf9)
genefuse.R2.fq.gz (size: 66 M, MD5: e756cf01e256186dccaa9e700d85a342)

sfchen avatar May 16 '19 13:05 sfchen
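
For readers following along, the MD5 check requested above could be scripted roughly as follows. This is a minimal sketch, assuming GNU coreutils `md5sum` is installed; `check_md5` is a hypothetical helper name, not something shipped with GeneFuse.

```shell
# Compare a file's MD5 digest against an expected value.
# Assumption: GNU coreutils `md5sum` is on the PATH.
check_md5() {
    file=$1
    expected=$2
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK       $file"
    else
        echo "MISMATCH $file (got $actual)"
        return 1
    fi
}

# Usage with the checksums quoted above (uncomment once the files are downloaded):
# check_md5 genefuse.R1.fq.gz 171e6dfa0af37fe95c826005bc5fcdf9
# check_md5 genefuse.R2.fq.gz e756cf01e256186dccaa9e700d85a342
```

A mismatch would point to a truncated or corrupted download; a match only rules that cause out and says nothing about the reference genome.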

Hi sfchen,

Yes those are the same md5 I get when I check on the downloaded FASTQs. The command I run is: ./genefuse -r Homo_sapiens_assembly19.fasta -f druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h report.html >result

I have downloaded the .fasta file from ensembl.

Thanks.

MarcHiggins avatar May 16 '19 13:05 MarcHiggins

The druggable.hg19.csv is in the genes folder

Have you checked the error message?

sfchen avatar May 16 '19 14:05 sfchen

I mean, you should run:

./genefuse -r Homo_sapiens_assembly19.fasta -f genes/druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h report.html >result

sfchen avatar May 16 '19 14:05 sfchen

I have downloaded via wget the druggable.hg.csv from the genes folder. In the results document there are no reported errors

MarcHiggins avatar May 16 '19 14:05 MarcHiggins

Errors are saved to STDERR, not STDOUT. So you cannot find errors in the result file.

Can you just run the command without redirecting to result?

sfchen avatar May 16 '19 14:05 sfchen

I do not get any STDERR or STDOUT files regardless of if I redirect to result or not. I am running the binary if this may make a difference. Thank you for your help by the way.

MarcHiggins avatar May 16 '19 15:05 MarcHiggins

Apologies I meant I do not get an STDOUT file at all.

MarcHiggins avatar May 16 '19 15:05 MarcHiggins

You used >, which redirected STDOUT to the file you specified.

sfchen avatar May 16 '19 15:05 sfchen

But even if I exclude >, there is no STDERR file. That is what I meant, not the lack of STDOUT. Apologies for the confusion.

MarcHiggins avatar May 16 '19 15:05 MarcHiggins

You didn't redirect STDERR, so it would be printed to the terminal.

You can use the following command to also redirect STDERR:

./genefuse -r Homo_sapiens_assembly19.fasta -f druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h report.html >result 2>err.log

sfchen avatar May 16 '19 15:05 sfchen
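
The STDOUT/STDERR distinction discussed in the last few comments can be illustrated with a small sketch. `demo_tool` is a hypothetical stand-in for any command-line program and is not part of GeneFuse; the messages it prints are made up for the demonstration.

```shell
# A stand-in program: one line to STDOUT, one line to STDERR.
# (`demo_tool` is hypothetical and only used to show the redirections.)
demo_tool() {
    echo "normal output"
    echo "error message" >&2
}

# `>` captures STDOUT only; `2>` captures STDERR.
demo_tool >result 2>err.log

# To collect both streams in a single file, point STDERR at STDOUT.
demo_tool >all.log 2>&1
```

After running this, `result` holds only the normal output and `err.log` holds only the error message. Without `2>`, the error message appears on the terminal and no file is created for it.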

I have run it again and no errors are reported. I notice, however, that the FASTQ files don't have the 15 million lines you mention in a different thread; they have more like 800,000. Maybe this is the issue?

MarcHiggins avatar May 16 '19 16:05 MarcHiggins
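
A line or read count like the one mentioned above can be obtained without unpacking the file to disk. This is a small sketch, assuming standard four-line FASTQ records and that `gzip` is available; `count_reads` is a hypothetical helper name.

```shell
# Count reads in a gzipped FASTQ file.
# Assumption: every record spans exactly four lines (header, sequence, '+', qualities).
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Usage (uncomment once the file is downloaded):
# count_reads genefuse.R1.fq.gz
```

Both mates of a paired-end run would normally report the same number of reads, so comparing the two counts is a cheap sanity check.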

Hi sfchen, I have again run genefuse on FASTQs which I know to contain translocations. Your software did not call these. This is more just to let you know than a specific request or question.

MarcHiggins avatar May 17 '19 13:05 MarcHiggins

Thanks, I will try to reproduce it.

sfchen avatar May 17 '19 13:05 sfchen

Hi sfchen,

I am in the same situation. I do not find any gene fusions in the test dataset.

tkcaccia avatar Nov 28 '19 10:11 tkcaccia

I found the answer in https://github.com/OpenGene/GeneFuse/issues/31 -- the NCBI version of hg19 had a different chromosome naming convention, so it doesn't work. The version downloadable from UCSC is fine:

wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz (then decompress it with gunzip)

amacbride avatar Jul 09 '21 19:07 amacbride
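
A naming mismatch of this kind can be spotted before running the tool by comparing the sequence names in the reference with the chromosome names in the gene list. The sketch below assumes that gene header lines in the CSV look like `>GENE_TRANSCRIPT,chr2:42396490-42559688`; the helper names are hypothetical and not part of GeneFuse.

```shell
# Distinct sequence names in a FASTA file (text after '>' up to the first space).
fasta_contigs() {
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}' | sort -u
}

# Distinct chromosome names in a gene CSV.
# Assumption: header lines have the form ">NAME,chrN:start-end".
csv_contigs() {
    grep '^>' "$1" | cut -d, -f2 | cut -d: -f1 | sort -u
}

# Print chromosome names used by the CSV that the FASTA does not contain.
missing_contigs() {
    fasta_names=$(mktemp)
    csv_names=$(mktemp)
    fasta_contigs "$1" > "$fasta_names"
    csv_contigs "$2" > "$csv_names"
    comm -13 "$fasta_names" "$csv_names"   # lines present only in the CSV list
    rm -f "$fasta_names" "$csv_names"
}

# Usage (paths taken from the thread; uncomment to run):
# missing_contigs Homo_sapiens_assembly19.fasta genes/druggable.hg19.csv
```

If the reference names its sequences `2` instead of `chr2`, every chromosome in the CSV would be listed as missing, which would be consistent with the "sequence number before filtering: 0" lines reported earlier in this thread. Empty output means the names line up.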