pipeline cannot access .py scripts at `/home/user/pipeline-structural-variation-1.2.0/lib/pipeline_structural_variation`

Open BCArg opened this issue 4 years ago • 3 comments

To run this pipeline I followed the instructions on this GitHub page. I have miniconda3 installed, and I installed snakemake with conda install -c bioconda -c conda-forge snakemake=5.7.0 because the command conda install -y snakemake installed a version < 5.4.3 on my machine.

After downloading and unzipping the files, I created a conda environment with

conda env create -n pipeline-structural-variation -f env.yml

Then, after activating this environment, I tried to run the pipeline with

snakemake -p all

At first, I got the following error message (this is the content of my .log file):

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       all
        1       auto_read_support
        1       bed_from_bam
        1       calc_depth
        1       call_sniffles
        1       filter_region
        1       filter_vcf
        1       index_minimap2
        1       index_vcf
        1       init
        1       map_minimap2
        1       nanoplot_qc
        1       reformat_vcf
        1       sort_vcf
        14

[Wed Oct 16 14:25:40 2019]
rule index_minimap2:
    input: /home/bhinckel/pipeline-structural-variation-1.2.0/data/test_reference/human_g1k_v37_part.fasta
    output: my_sample/index/minimap2.idx
    jobid: 6
    wildcards: sample=my_sample

minimap2 -t 1 -ax map-ont --MD -Y /home/bhinckel/pipeline-structural-variation-1.2.0/data/test_reference/human_g1k_v37_part.fasta -d my_sample/index/minimap2.idx
[Wed Oct 16 14:25:40 2019]
Finished job 6.
1 of 14 steps (7%) done

[Wed Oct 16 14:25:40 2019]
rule map_minimap2:
    input: /home/bhinckel/pipeline-structural-variation-1.2.0/data/hg002_simple_test_chr12_44420446_44421533/, my_sample/index/minimap2.idx
    output: my_sample/alignment/my_sample_minimap2.bam, my_sample/alignment/my_sample_minimap2.bam.bai
    jobid: 4
    wildcards: sample=my_sample

cat_fastq /home/bhinckel/pipeline-structural-variation-1.2.0/data/hg002_simple_test_chr12_44420446_44421533/ | minimap2 -t 1 -ax map-ont --MD -Y my_sample/index/minimap2.idx - | samtools sort -@ 12.0 -o my_sample/alignment/my_sample_minimap2.bam - && samtools index -@ 1 my_sample/alignment/my_sample_minimap2.bam
[Wed Oct 16 14:25:40 2019]
Error in rule map_minimap2:
    jobid: 4
    output: my_sample/alignment/my_sample_minimap2.bam, my_sample/alignment/my_sample_minimap2.bam.bai
    shell:
        cat_fastq /home/bhinckel/pipeline-structural-variation-1.2.0/data/hg002_simple_test_chr12_44420446_44421533/ | minimap2 -t 1 -ax map-ont --MD -Y my_sample/index/minimap2.idx - | samtools sort -@ 12.0 -o my_sample/alignment/my_sample_minimap2.bam - && samtools index -@ 1 my_sample/alignment/my_sample_minimap2.bam
        (exited with non-zero exit code)

Removing output files of failed job map_minimap2 since they might be corrupted:
my_sample/alignment/my_sample_minimap2.bam

I then made the following replacement in the Snakefile:

rule map_minimap2:
   input:
       FQ = FQ_INPUT_DIRECTORY,
       IDX = rules.index_minimap2.output
   output:
       BAM = "{sample}/alignment/{sample}_minimap2.bam",
       BAI = "{sample}/alignment/{sample}_minimap2.bam.bai"
   params:
       min_qscore = config["min_qscore"] if "min_qscore" in config else 6,
       min_read_length = config["min_read_length"] if "min_read_length" in config else 1000,
       sort_threads = max(1, (max(1, config["threads"]) * 0.1))
   conda: "env.yml"
   threads: config["threads"]
   shell:
       #"cat_fastq {input.FQ} | minimap2 -t {threads} -ax map-ont --MD -Y {input.IDX} - | samtools sort -@ {params.sort_threads} -o {output.BAM} - && samtools index -@ {threads} {output.BAM}"
       "python /home/bhinckel/pipeline-structural-variation-1.2.0/lib/pipeline_structural_variation/cat_fastq.py {input.FQ} | minimap2 -t {threads} -ax map-ont --MD -Y {input.IDX} - | samtools sort -@ {params.sort_threads} -o {output.BAM} - && samtools index -@ {threads} {output.BAM}"
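A side note on the sort_threads parameter in the rule above: multiplying by 0.1 produces a float, which is why the logged command reads samtools sort -@ 12.0. Whether that matters for samtools is not established in this thread. A minimal sketch of an integer variant, where sort_thread_count is a hypothetical helper and not part of the pipeline:

```python
def sort_thread_count(threads, fraction=0.1):
    """Return an integer sort-thread count: a fraction of `threads`, at least 1.

    Hypothetical helper; it mirrors max(1, max(1, threads) * 0.1) from the
    rule above but truncates the result to an int.
    """
    return max(1, int(max(1, threads) * fraction))


if __name__ == "__main__":
    for threads in (1, 4, 120):
        print(threads, sort_thread_count(threads))
```

In the Snakefile this would correspond to wrapping the existing expression in int(...), so that the value passed to -@ is a whole number.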

The cat_fastq step now worked (with the full path to the Python script in place of the command). But then I got the following error (again, this is the content of my log file), which surprised me because the command pip install /home/bhinckel/pipeline-structural-variation-1.2.0/lib &> init ran OK.

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	1	all
	1	auto_read_support
	1	bed_from_bam
	1	calc_depth
	1	call_sniffles
	1	filter_region
	1	filter_vcf
	1	index_minimap2
	1	index_vcf
	1	init
	1	map_minimap2
	1	nanoplot_qc
	1	reformat_vcf
	1	sort_vcf
	14

[Wed Oct 16 15:00:08 2019]
rule index_minimap2:
    input: /home/bhinckel/pipeline-structural-variation-1.2.0/data/test_reference/human_g1k_v37_part.fasta
    output: my_sample/index/minimap2.idx
    jobid: 6
    wildcards: sample=my_sample

minimap2 -t 1 -ax map-ont --MD -Y /home/bhinckel/pipeline-structural-variation-1.2.0/data/test_reference/human_g1k_v37_part.fasta -d my_sample/index/minimap2.idx
[Wed Oct 16 15:00:08 2019]
Finished job 6.
1 of 14 steps (7%) done

[Wed Oct 16 15:00:08 2019]
rule map_minimap2:
    input: /home/bhinckel/pipeline-structural-variation-1.2.0/data/hg002_simple_test_chr12_44420446_44421533/, my_sample/index/minimap2.idx
    output: my_sample/alignment/my_sample_minimap2.bam, my_sample/alignment/my_sample_minimap2.bam.bai
    jobid: 4
    wildcards: sample=my_sample

python /home/bhinckel/pipeline-structural-variation-1.2.0/lib/pipeline_structural_variation/cat_fastq.py /home/bhinckel/pipeline-structural-variation-1.2.0/data/hg002_simple_test_chr12_44420446_44421533/ | minimap2 -t 1 -ax map-ont --MD -Y my_sample/index/minimap2.idx - | samtools sort -@ 12.0 -o my_sample/alignment/my_sample_minimap2.bam - && samtools index -@ 1 my_sample/alignment/my_sample_minimap2.bam
[Wed Oct 16 15:00:08 2019]
Finished job 4.
2 of 14 steps (14%) done

[Wed Oct 16 15:00:08 2019]
rule call_sniffles:
    input: my_sample/alignment/my_sample_minimap2.bam
    output: my_sample/sv_calls/my_sample_sniffles_tmp.vcf
    jobid: 12
    wildcards: sample=my_sample

sniffles -m my_sample/alignment/my_sample_minimap2.bam -v my_sample/sv_calls/my_sample_sniffles_tmp.vcf -s 3 -r 1000 -q 20 --genotype --report_read_strands
[Wed Oct 16 15:00:08 2019]
Finished job 12.
3 of 14 steps (21%) done

[Wed Oct 16 15:00:08 2019]
rule nanoplot_qc:
    input: my_sample/alignment/my_sample_minimap2.bam
    output: my_sample/qc
    jobid: 2
    wildcards: sample=my_sample

NanoPlot -t 1 --bam my_sample/alignment/my_sample_minimap2.bam --raw -o my_sample/qc -p my_sample_ --N50 --title my_sample --downsample 100000
[Wed Oct 16 15:00:52 2019]
Finished job 2.
4 of 14 steps (29%) done

[Wed Oct 16 15:00:52 2019]
rule init:
    output: init
    jobid: 9

pip install /home/bhinckel/pipeline-structural-variation-1.2.0/lib &> init
[Wed Oct 16 15:00:54 2019]
Finished job 9.
5 of 14 steps (36%) done

[Wed Oct 16 15:00:54 2019]
rule bed_from_bam:
    input: my_sample/alignment/my_sample_minimap2.bam, init
    output: my_sample/target.bed
    jobid: 13
    wildcards: sample=my_sample

bamref2bed -b my_sample/alignment/my_sample_minimap2.bam -f _Un _random > my_sample/target.bed
[Wed Oct 16 15:00:54 2019]
Finished job 13.
6 of 14 steps (43%) done

[Wed Oct 16 15:00:54 2019]
rule calc_depth:
    input: my_sample/alignment/my_sample_minimap2.bam, my_sample/target.bed
    output: my_sample/depth
    jobid: 11
    wildcards: sample=my_sample

mkdir -p my_sample/depth; mosdepth -x -t 1 -n -b my_sample/target.bed my_sample/depth/my_sample my_sample/alignment/my_sample_minimap2.bam
[Wed Oct 16 15:00:54 2019]
Finished job 11.
7 of 14 steps (50%) done

[Wed Oct 16 15:00:54 2019]
rule filter_region:
    input: my_sample/sv_calls/my_sample_sniffles_tmp.vcf, my_sample/target.bed, init
    output: my_sample/sv_calls/my_sample_sniffles_region_filtered.vcf
    jobid: 10
    wildcards: sample=my_sample

bcftools view -T my_sample/target.bed my_sample/sv_calls/my_sample_sniffles_tmp.vcf -o my_sample/sv_calls/my_sample_sniffles_region_filtered.vcf
[Wed Oct 16 15:00:54 2019]
Finished job 10.
8 of 14 steps (57%) done

[Wed Oct 16 15:00:54 2019]
rule reformat_vcf:
    input: my_sample/sv_calls/my_sample_sniffles_region_filtered.vcf, init
    output: my_sample/sv_calls/my_sample_sniffles.vcf
    jobid: 7
    wildcards: sample=my_sample

sniffles-edit --ins-length --check --vcf-version -v my_sample/sv_calls/my_sample_sniffles_region_filtered.vcf -o my_sample/sv_calls/my_sample_sniffles.vcf
[Wed Oct 16 15:00:54 2019]
Error in rule reformat_vcf:
    jobid: 7
    output: my_sample/sv_calls/my_sample_sniffles.vcf
    shell:
        sniffles-edit --ins-length --check --vcf-version -v my_sample/sv_calls/my_sample_sniffles_region_filtered.vcf -o my_sample/sv_calls/my_sample_sniffles.vcf
        (exited with non-zero exit code)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
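The init rule in the log above runs pip install on the lib directory, and commands such as bamref2bed and sniffles-edit are presumably console scripts registered by that install (an assumption; the package's setup file is not shown in this issue). A minimal sketch for listing the console scripts visible to the running interpreter, assuming Python 3.8 or newer for importlib.metadata; the 3.6.7 environment mentioned in this post would need the importlib_metadata backport instead:

```python
from importlib import metadata


def console_scripts():
    """Map console-script names to their 'module:function' targets."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are selectable by group.
        group = eps.select(group="console_scripts")
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group.
        group = eps.get("console_scripts", [])
    return {ep.name: ep.value for ep in group}


if __name__ == "__main__":
    scripts = console_scripts()
    # Command names as they appear in the Snakemake logs of this issue.
    for name in ("cat_fastq", "bamref2bed", "sniffles-edit"):
        print(name, "->", scripts.get(name, "not registered"))
```

Note also that the rule redirects pip's output with &> init, so any pip messages end up in the file named init rather than in the Snakemake log.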

As above, I inserted the full path to the Python script in the rule reformat_vcf, as shown below:

rule reformat_vcf:
    input:
         VCF = rules.filter_region.output.VCF,
         SETUP = "init"
    output:
         VCF = "{sample}/sv_calls/{sample}_sniffles.vcf"
    conda: "env.yml"
    shell:
         #"sniffles-edit --ins-length --check --vcf-version -v {input.VCF} -o {output.VCF}"
         "python /home/bhinckel/pipeline-structural-variation-1.2.0/lib/pipeline_structural_variation/sniffles_edit.py --ins-length --check --vcf-version -v {input.VCF} -o {output.VCF}"

However, after this replacement I got the same error as in the .log file shown above.

If I do:

python /home/bhinckel/pipeline-structural-variation-1.2.0/lib/pipeline_structural_variation/sniffles_edit.py

I get:

Traceback (most recent call last):
  File "/home/bhinckel/pipeline-structural-variation-1.2.0/lib/pipeline_structural_variation/sniffles_edit.py", line 13, in <module>
    from sv_calling_glue.sniffles_telemetry import get_summary_from_vcf, \
ModuleNotFoundError: No module named 'sv_calling_glue'

So I assume the module sv_calling_glue is missing.
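One way to check that assumption from inside the environment is to ask Python whether it can locate the module without importing it. A minimal sketch, where module_available is a hypothetical helper; sv_calling_glue comes from the traceback above, while pipeline_structural_variation is only the directory name under lib and may or may not be the installed package name:

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module `name` can be located for import."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # First name is from the traceback; second is the lib subdirectory name
    # (assumed, not confirmed, to be an importable package).
    for name in ("sv_calling_glue", "pipeline_structural_variation"):
        status = "found" if module_available(name) else "missing"
        print(f"{name}: {status}")
```

If sv_calling_glue is reported as missing while the other name is found, the script is importing from a package name that the installed lib does not provide.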

Do you know how I can fix this? The Python version installed in my virtual environment is 3.6.7.

BCArg, Oct 16 '19 13:10

Try snakemake --use-conda -p all

It solved a lot of missing-module problems for me.

cluhaowie, Oct 20 '19 21:10

I also tried this, i.e. without creating a virtual environment, but it did not work.

I downloaded the same pipeline from another location, which looks slightly different, and it now works; see this question.

BCArg, Oct 21 '19 07:10

Hello everyone, I ran snakemake --use-conda -p all and got the error below:

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
        count   jobs
        1       all
        1       auto_read_support
        1       filter_vcf
        1       index_vcf
        1       reformat_vcf
        1       sort_vcf
        6

[Sat Nov 9 02:50:09 2019]
rule reformat_vcf:
    input: my_sample/sv_calls/my_sample_sniffles_region_filtered.vcf, init
    output: my_sample/sv_calls/my_sample_sniffles.vcf
    jobid: 8
    wildcards: sample=my_sample

sniffles-edit --ins-length --check --vcf-version -v my_sample/sv_calls/my_sample_sniffles_region_filtered.vcf -o my_sample/sv_calls/my_sample_sniffles.vcf
/bin/bash: sniffles-edit: command not found
[Sat Nov 9 02:50:09 2019]
Error in rule reformat_vcf:
    jobid: 8
    output: my_sample/sv_calls/my_sample_sniffles.vcf
    shell:
        sniffles-edit --ins-length --check --vcf-version -v my_sample/sv_calls/my_sample_sniffles_region_filtered.vcf -o my_sample/sv_calls/my_sample_sniffles.vcf
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.

Could you give some suggestions? Thanks a lot! tjiang

tjiangHIT, Nov 09 '19 02:11