fusioncatcher
Why did fusioncatcher fail on the first run and only work on the second run?
I tested fusioncatcher using the test FASTQ files:
http://sourceforge.net/projects/fusioncatcher/files/test/reads_1.fq.gz
http://sourceforge.net/projects/fusioncatcher/files/test/reads_2.fq.gz
Because of a glibc error, I can only run it through a Singularity image:
singularity exec -B /mnt/lustre/user/wubin/01.Program/Scripts/02.software/Fusioncatcher/fusioncatcher:/mnt,/mnt/lustre/user/wubin/01.Program/Scripts/02.software/Fusioncatcher/fusioncatcher/test/fq_dir:/tmp,/mnt/lustre/user/wubin/01.Program/Scripts/02.software/Fusioncatcher/fusioncatcher/test/output:/opt /mnt/lustre/user/wubin/01.Program/Scripts/02.software/Fusioncatcher/centos7_yum2.simg /mnt/bin/fusioncatcher -d /mnt/data/current -i /tmp -o /opt
Then this error appeared:
WARNING: Cannot restart automatically because the previous log file '/opt/fusioncatcher.log' cannot be found! The workflow will be restarted from the beginning with step 1!
....................
ERROR: The version of the data build does not match the version of this pipeline! Please, run again the 'fusioncatcher-build.py' in order to fix this!
....................
I am sure the data version matches the pipeline, because I installed the data by following the manual:
git clone https://github.com/ndaniel/fusioncatcher
cd fusioncatcher/tools/
./install_tools.sh
cd ../data
./download-human-db.sh
But when I ran the same command line a second time, it ran to the end:
singularity exec -B /mnt/lustre/user/wubin/01.Program/Scripts/02.software/Fusioncatcher/fusioncatcher:/mnt,/mnt/lustre/user/wubin/01.Program/Scripts/02.software/Fusioncatcher/fusioncatcher/test/fq_dir:/tmp,/mnt/lustre/user/wubin/01.Program/Scripts/02.software/Fusioncatcher/fusioncatcher/test/output:/opt /mnt/lustre/user/wubin/01.Program/Scripts/02.software/Fusioncatcher/centos7_yum2.simg /mnt/bin/fusioncatcher -d /mnt/data/current -i /tmp -o /opt
However, it reported an error:
Reading... /mnt/data/current/genes_symbols.txt
Processing and reading... /opt/reads_filtered_transcriptome_sorted-read_no-offending-reads.map
Writing... /opt/candidate_fusion-genes_no-offending-reads.txt
Traceback (most recent call last):
  File "/mnt/bin/find_fusion_genes_map.py", line 335, in
    data=[line+[hugo[line[0]],hugo[line[1]]] for line in data]
KeyError: 'ENSG09000000014'
My questions are:
- Since I followed the manual to install the data, why does fusioncatcher tell me that the data build does not match the pipeline version?
- Why does it run to the end when I simply try a second time?
- If I delete all the files in the output directory before the second run, it again reports "ERROR: The version of the data build does not match the version of this pipeline!", just like the first run.
- What does the KeyError: 'ENSG09000000014' mean? Does it indicate that the fusioncatcher run failed?
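
To make the last question concrete, here is a minimal Python sketch of what I assume the failing line does. It is not fusioncatcher's actual code. I am assuming that `hugo` is a dictionary mapping Ensembl gene IDs to gene symbols, and that it is built from a tab-separated file with the gene ID in the first column and the symbol in the second (I have not verified the real layout of genes_symbols.txt). The helper names `load_symbols` and `annotate` and the sample IDs and symbols are made up for illustration; only 'ENSG09000000014' comes from my traceback.

```python
def load_symbols(lines):
    """Build {gene_id: symbol} from 'gene_id<TAB>symbol' lines (assumed layout)."""
    table = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= 2 and fields[0]:
            table[fields[0]] = fields[1]
    return table


def annotate(pairs, hugo):
    """Append the two symbols to each gene pair and collect IDs missing from hugo."""
    missing = set()
    rows = []
    for gene_a, gene_b in pairs:
        for gene in (gene_a, gene_b):
            if gene not in hugo:
                missing.add(gene)
        # dict.get returns "" for an unknown ID instead of raising KeyError
        rows.append([gene_a, gene_b, hugo.get(gene_a, ""), hugo.get(gene_b, "")])
    return rows, sorted(missing)


# Made-up lookup table and candidate pairs (hypothetical data)
symbols = load_symbols([
    "ENSG00000000001\tGENE_A",
    "ENSG00000000002\tGENE_B",
])
pairs = [
    ["ENSG00000000001", "ENSG00000000002"],
    ["ENSG00000000001", "ENSG09000000014"],
]

rows, missing = annotate(pairs, symbols)
print("IDs without a symbol:", missing)

# The strict lookup used in the traceback fails for the same input
try:
    strict = [pair + [symbols[pair[0]], symbols[pair[1]]] for pair in pairs]
except KeyError as err:
    print("KeyError:", err)
```

If this reading is right, the KeyError would mean that a gene ID in the candidate list is absent from the symbol table loaded from the data directory, so the script stops before finishing its output.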