mgatk
Error about Missing output files
Hi, team. I first tried to use MAESTER (https://github.com/petervangalen/MAESTER-2021) and maegatk, but neither my own data nor the test data could be processed, so I tried the mgatk test file. I then found that the same problem also happens in mgatk.
I ran this command:
mgatk bcall -i barcode/test_barcode.bam -n bc1 -o bc1d -bt CB -b barcode/test_barcodes.txt -z
The error occurs at the make_final_sparse_matrices step.
Thu Jul 07 17:56:35 KST 2022: mgatk v0.6.6
Thu Jul 07 17:56:35 KST 2022: Found bam file: barcode/test_barcode.bam for genotyping.
Thu Jul 07 17:56:35 KST 2022: Found file of barcodes to be parsed: barcode/test_barcodes.txt
Thu Jul 07 17:56:35 KST 2022: User specified mitochondrial genome matches .bam file
Thu Jul 07 17:56:37 KST 2022: Finished determining/splitting barcodes for genotyping.
Thu Jul 07 17:56:38 KST 2022: Genotyping samples with 88 threads
Config file bc1d/.internal/parseltongue/snake.gather.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job                           count    min threads    max threads
--------------------------  -------  -------------  -------------
all                               1              1              1
make_depth_table                  1              1              1
make_final_sparse_matrices        1              1              1
total                             3              1              1
Select jobs to execute...
[Thu Jul 7 17:56:38 2022]
rule make_final_sparse_matrices:
output: bc1d/final/bc1.A.txt.gz, bc1d/final/bc1.C.txt.gz, bc1d/final/bc1.G.txt.gz, bc1d/final/bc1.T.txt.gz, bc1d/final/bc1.coverage.txt.gz
jobid: 2
reason: Missing output files: bc1d/final/bc1.A.txt.gz, bc1d/final/bc1.G.txt.gz, bc1d/final/bc1.coverage.txt.gz, bc1d/final/bc1.C.txt.gz, bc1d/final/bc1.T.txt.gz
resources: tmpdir=/tmp
Config file bc1d/.internal/parseltongue/snake.scatter.yaml is extended by additional config specified via the command line.
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 88
Rules claiming more threads will be scaled down.
Job stats:
job                   count    min threads    max threads
------------------  -------  -------------  -------------
all                       1              1              1
make_sample_list          1              1              1
process_one_sample        3              1              1
total                     5              1              1
Select jobs to execute...
[Thu Jul 7 17:56:39 2022]
rule process_one_sample:
input: bc1d/.internal/samples/CACCACTAGGAGGCGA-1.bam.txt
output: bc1d/temp/ready_bam/CACCACTAGGAGGCGA-1.qc.bam, bc1d/temp/ready_bam/CACCACTAGGAGGCGA-1.qc.bam.bai, bc1d/qc/depth/CACCACTAGGAGGCGA-1.depth.txt, bc1d/temp/sparse_matrices/CACCACTAGGAGGCGA-1.A.txt, bc1d/temp/sparse_matrices/CACCACTAGGAGGCGA-1.C.txt, bc1d/temp/sparse_matrices/CACCACTAGGAGGCGA-1.G.txt, bc1d/temp/sparse_matrices/CACCACTAGGAGGCGA-1.T.txt, bc1d/temp/sparse_matrices/CACCACTAGGAGGCGA-1.coverage.txt
jobid: 3
reason: Missing output files: bc1d/qc/depth/CACCACTAGGAGGCGA-1.depth.txt
wildcards: sample=CACCACTAGGAGGCGA-1
resources: tmpdir=/tmp
[Thu Jul 7 17:56:39 2022]
rule process_one_sample:
input: bc1d/.internal/samples/GCCTAGGCAGTTCGGC-1.bam.txt
output: bc1d/temp/ready_bam/GCCTAGGCAGTTCGGC-1.qc.bam, bc1d/temp/ready_bam/GCCTAGGCAGTTCGGC-1.qc.bam.bai, bc1d/qc/depth/GCCTAGGCAGTTCGGC-1.depth.txt, bc1d/temp/sparse_matrices/GCCTAGGCAGTTCGGC-1.A.txt, bc1d/temp/sparse_matrices/GCCTAGGCAGTTCGGC-1.C.txt, bc1d/temp/sparse_matrices/GCCTAGGCAGTTCGGC-1.G.txt, bc1d/temp/sparse_matrices/GCCTAGGCAGTTCGGC-1.T.txt, bc1d/temp/sparse_matrices/GCCTAGGCAGTTCGGC-1.coverage.txt
jobid: 4
reason: Missing output files: bc1d/qc/depth/GCCTAGGCAGTTCGGC-1.depth.txt
wildcards: sample=GCCTAGGCAGTTCGGC-1
resources: tmpdir=/tmp
[Thu Jul 7 17:56:39 2022]
rule process_one_sample:
input: bc1d/.internal/samples/CTAACTTAGAGCCACA-1.bam.txt
output: bc1d/temp/ready_bam/CTAACTTAGAGCCACA-1.qc.bam, bc1d/temp/ready_bam/CTAACTTAGAGCCACA-1.qc.bam.bai, bc1d/qc/depth/CTAACTTAGAGCCACA-1.depth.txt, bc1d/temp/sparse_matrices/CTAACTTAGAGCCACA-1.A.txt, bc1d/temp/sparse_matrices/CTAACTTAGAGCCACA-1.C.txt, bc1d/temp/sparse_matrices/CTAACTTAGAGCCACA-1.G.txt, bc1d/temp/sparse_matrices/CTAACTTAGAGCCACA-1.T.txt, bc1d/temp/sparse_matrices/CTAACTTAGAGCCACA-1.coverage.txt
jobid: 2
reason: Missing output files: bc1d/qc/depth/CTAACTTAGAGCCACA-1.depth.txt
wildcards: sample=CTAACTTAGAGCCACA-1
resources: tmpdir=/tmp
gzip: bc1d/final/bc1.A.txt: No such file or directory
gzip: bc1d/final/bc1.C.txt: No such file or directory
gzip: bc1d/final/bc1.G.txt: No such file or directory
gzip: bc1d/final/bc1.T.txt: No such file or directory
gzip: bc1d/final/bc1.coverage.txt: No such file or directory
Error in checkGrep(grep(".A.txt", files)) :
Improper folder specification; file missing / extra file present. See documentation
Calls: importMito -> checkGrep
Execution halted
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2022-07-07T175638.678416.snakemake.log
[Thu Jul 7 17:56:47 2022]
Finished job 4.
1 of 5 steps (20%) done
[Thu Jul 7 17:56:48 2022]
Finished job 2.
2 of 5 steps (40%) done
[Thu Jul 7 17:56:48 2022]
Finished job 3.
3 of 5 steps (60%) done
Select jobs to execute...
[Thu Jul 7 17:56:48 2022]
rule make_sample_list:
input: bc1d/qc/depth/CTAACTTAGAGCCACA-1.depth.txt, bc1d/qc/depth/CACCACTAGGAGGCGA-1.depth.txt, bc1d/qc/depth/GCCTAGGCAGTTCGGC-1.depth.txt
output: bc1d/temp/scattered.allSamples.txt
jobid: 1
reason: Missing output files: bc1d/temp/scattered.allSamples.txt; Input files updated by another job: bc1d/qc/depth/CACCACTAGGAGGCGA-1.depth.txt, bc1d/qc/depth/GCCTAGGCAGTTCGGC-1.depth.txt, bc1d/qc/depth/CTAACTTAGAGCCACA-1.depth.txt
resources: tmpdir=/tmp
[Thu Jul 7 17:56:49 2022]
Finished job 1.
4 of 5 steps (80%) done
Select jobs to execute...
[Thu Jul 7 17:56:49 2022]
localrule all:
input: bc1d/temp/scattered.allSamples.txt
jobid: 0
reason: Input files updated by another job: bc1d/temp/scattered.allSamples.txt
resources: tmpdir=/tmp
[Thu Jul 7 17:56:49 2022]
Finished job 0.
5 of 5 steps (100%) done
Complete log: .snakemake/log/2022-07-07T175638.766786.snakemake.log
This is the list of output files:
bc1d/:
total 20K
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 fasta
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 final
drwxrwxr-x 4 sjp sjp 4.0K Jul 7 2022 logs
drwxrwxr-x 4 sjp sjp 4.0K Jul 7 2022 qc
drwxrwxr-x 7 sjp sjp 4.0K Jul 7 2022 temp
bc1d/fasta:
total 24K
-rw-rw-r-- 1 sjp sjp 17K Jul 7 2022 chrM.fasta
-rw-rw-r-- 1 sjp sjp 19 Jul 7 2022 chrM.fasta.fai
bc1d/final:
total 120K
-rw-rw-r-- 1 sjp sjp 119K Jul 7 2022 chrM_refAllele.txt
bc1d/logs:
total 28K
-rw-rw-r-- 1 sjp sjp 433 Jul 7 2022 base.mgatk.log
-rw-rw-r-- 1 sjp sjp 476 Jul 7 2022 bc1.parameters.txt
-rw-rw-r-- 1 sjp sjp 0 Jul 7 2022 bc1.snakemake_gather.log
-rw-rw-r-- 1 sjp sjp 0 Jul 7 2022 bc1.snakemake_scatter.log
-rw-rw-r-- 1 sjp sjp 10K Jul 7 2022 bc1.snakemake_scatter.stats
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 filterlogs
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 rmdupslogs
bc1d/logs/filterlogs:
total 12K
-rw-rw-r-- 1 sjp sjp 21 Jul 7 2022 CACCACTAGGAGGCGA-1.filter.log
-rw-rw-r-- 1 sjp sjp 21 Jul 7 2022 CTAACTTAGAGCCACA-1.filter.log
-rw-rw-r-- 1 sjp sjp 21 Jul 7 2022 GCCTAGGCAGTTCGGC-1.filter.log
bc1d/logs/rmdupslogs:
total 12K
-rw-rw-r-- 1 sjp sjp 1.5K Jul 7 2022 CACCACTAGGAGGCGA-1.rmdups.log
-rw-rw-r-- 1 sjp sjp 1.5K Jul 7 2022 CTAACTTAGAGCCACA-1.rmdups.log
-rw-rw-r-- 1 sjp sjp 1.5K Jul 7 2022 GCCTAGGCAGTTCGGC-1.rmdups.log
bc1d/qc:
total 8.0K
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 depth
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 quality
bc1d/qc/depth:
total 12K
-rw-rw-r-- 1 sjp sjp 25 Jul 7 2022 CACCACTAGGAGGCGA-1.depth.txt
-rw-rw-r-- 1 sjp sjp 25 Jul 7 2022 CTAACTTAGAGCCACA-1.depth.txt
-rw-rw-r-- 1 sjp sjp 26 Jul 7 2022 GCCTAGGCAGTTCGGC-1.depth.txt
bc1d/qc/quality:
total 0
bc1d/temp:
total 24K
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 barcoded_bams
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 quality
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 ready_bam
-rw-rw-r-- 1 sjp sjp 57 Jul 7 2022 scattered.allSamples.txt
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 sparse_matrices
drwxrwxr-x 2 sjp sjp 4.0K Jul 7 2022 temp_bam
bc1d/temp/barcoded_bams:
total 7.0M
-rw-rw-r-- 1 sjp sjp 2.7M Jul 7 2022 CACCACTAGGAGGCGA-1.bam
-rw-rw-r-- 1 sjp sjp 808 Jul 7 2022 CACCACTAGGAGGCGA-1.bam.bai
-rw-rw-r-- 1 sjp sjp 2.4M Jul 7 2022 CTAACTTAGAGCCACA-1.bam
-rw-rw-r-- 1 sjp sjp 808 Jul 7 2022 CTAACTTAGAGCCACA-1.bam.bai
-rw-rw-r-- 1 sjp sjp 2.1M Jul 7 2022 GCCTAGGCAGTTCGGC-1.bam
-rw-rw-r-- 1 sjp sjp 792 Jul 7 2022 GCCTAGGCAGTTCGGC-1.bam.bai
bc1d/temp/quality:
total 0
bc1d/temp/ready_bam:
total 7.4M
-rw-rw-r-- 1 sjp sjp 2.8M Jul 7 2022 CACCACTAGGAGGCGA-1.qc.bam
-rw-rw-r-- 1 sjp sjp 808 Jul 7 2022 CACCACTAGGAGGCGA-1.qc.bam.bai
-rw-rw-r-- 1 sjp sjp 2.5M Jul 7 2022 CTAACTTAGAGCCACA-1.qc.bam
-rw-rw-r-- 1 sjp sjp 824 Jul 7 2022 CTAACTTAGAGCCACA-1.qc.bam.bai
-rw-rw-r-- 1 sjp sjp 2.1M Jul 7 2022 GCCTAGGCAGTTCGGC-1.qc.bam
-rw-rw-r-- 1 sjp sjp 808 Jul 7 2022 GCCTAGGCAGTTCGGC-1.qc.bam.bai
bc1d/temp/sparse_matrices:
total 3.2M
-rw-rw-r-- 1 sjp sjp 193K Jul 7 2022 CACCACTAGGAGGCGA-1.A.txt
-rw-rw-r-- 1 sjp sjp 458K Jul 7 2022 CACCACTAGGAGGCGA-1.coverage.txt
-rw-rw-r-- 1 sjp sjp 189K Jul 7 2022 CACCACTAGGAGGCGA-1.C.txt
-rw-rw-r-- 1 sjp sjp 99K Jul 7 2022 CACCACTAGGAGGCGA-1.G.txt
-rw-rw-r-- 1 sjp sjp 149K Jul 7 2022 CACCACTAGGAGGCGA-1.T.txt
-rw-rw-r-- 1 sjp sjp 183K Jul 7 2022 CTAACTTAGAGCCACA-1.A.txt
-rw-rw-r-- 1 sjp sjp 457K Jul 7 2022 CTAACTTAGAGCCACA-1.coverage.txt
-rw-rw-r-- 1 sjp sjp 183K Jul 7 2022 CTAACTTAGAGCCACA-1.C.txt
-rw-rw-r-- 1 sjp sjp 94K Jul 7 2022 CTAACTTAGAGCCACA-1.G.txt
-rw-rw-r-- 1 sjp sjp 145K Jul 7 2022 CTAACTTAGAGCCACA-1.T.txt
-rw-rw-r-- 1 sjp sjp 185K Jul 7 2022 GCCTAGGCAGTTCGGC-1.A.txt
-rw-rw-r-- 1 sjp sjp 452K Jul 7 2022 GCCTAGGCAGTTCGGC-1.coverage.txt
-rw-rw-r-- 1 sjp sjp 183K Jul 7 2022 GCCTAGGCAGTTCGGC-1.C.txt
-rw-rw-r-- 1 sjp sjp 93K Jul 7 2022 GCCTAGGCAGTTCGGC-1.G.txt
-rw-rw-r-- 1 sjp sjp 144K Jul 7 2022 GCCTAGGCAGTTCGGC-1.T.txt
bc1d/temp/temp_bam:
total 14M
-rw-rw-r-- 1 sjp sjp 2.7M Jul 7 2022 CACCACTAGGAGGCGA-1.temp0.bam
-rw-rw-r-- 1 sjp sjp 2.7M Jul 7 2022 CACCACTAGGAGGCGA-1.temp1.bam
-rw-rw-r-- 1 sjp sjp 808 Jul 7 2022 CACCACTAGGAGGCGA-1.temp1.bam.bai
-rw-rw-r-- 1 sjp sjp 2.4M Jul 7 2022 CTAACTTAGAGCCACA-1.temp0.bam
-rw-rw-r-- 1 sjp sjp 2.4M Jul 7 2022 CTAACTTAGAGCCACA-1.temp1.bam
-rw-rw-r-- 1 sjp sjp 808 Jul 7 2022 CTAACTTAGAGCCACA-1.temp1.bam.bai
-rw-rw-r-- 1 sjp sjp 2.1M Jul 7 2022 GCCTAGGCAGTTCGGC-1.temp0.bam
-rw-rw-r-- 1 sjp sjp 2.1M Jul 7 2022 GCCTAGGCAGTTCGGC-1.temp1.bam
-rw-rw-r-- 1 sjp sjp 792 Jul 7 2022 GCCTAGGCAGTTCGGC-1.temp1.bam.bai
I really appreciate your kindness. Thanks.
Hm, strange... it's erroring out because it can't find a 'depth' file for each barcode; can you see if this command works?
mgatk bcall -i barcode/test_barcode.bam -n bc1_tenx -o tenx_test -bt CB -b barcode/test_barcodes.txt -z
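To check the per-barcode depth files directly, a small script along these lines might help. It is only a sketch: the function name is made up for illustration, the `qc/depth/<barcode>.depth.txt` layout is taken from the log above, and it assumes the barcodes file has one barcode per line. A barcode that is in the list but absent from the BAM would also be reported as missing.

```python
from pathlib import Path


def missing_depth_files(outdir, barcodes_file):
    """Return barcodes with no qc/depth/<barcode>.depth.txt under outdir.

    Hypothetical helper, not part of mgatk. Assumes one barcode per line.
    """
    depth_dir = Path(outdir) / "qc" / "depth"
    lines = Path(barcodes_file).read_text().splitlines()
    # Skip blank lines and surrounding whitespace in the barcode list.
    barcodes = [line.strip() for line in lines if line.strip()]
    return [bc for bc in barcodes if not (depth_dir / f"{bc}.depth.txt").is_file()]
```

For example, `missing_depth_files("tenx_test", "barcode/test_barcodes.txt")` returns an empty list when every listed barcode has a depth file.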
Thank you for your reply. Unfortunately, it shows the same error.
Interestingly, though, if I run it again with the -z option on the same output directory, it runs fine! I think some problem in Snakemake prevents it from finding the files. I tested this with Snakemake versions 7.7.0 and 7.8.5.
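To tell whether a given run needs the second pass, one option is to check for the five files that the log lists as the output of make_final_sparse_matrices. This is a minimal sketch: the helper name is hypothetical, and the file names are taken from the log above rather than from the mgatk documentation.

```python
from pathlib import Path

# Suffixes of the final matrices named in the make_final_sparse_matrices log entry.
FINAL_SUFFIXES = ("A", "C", "G", "T", "coverage")


def missing_final_outputs(outdir, name):
    """Return names of expected final/<name>.<suffix>.txt.gz files that are absent.

    Hypothetical helper, not part of mgatk.
    """
    final_dir = Path(outdir) / "final"
    expected = [final_dir / f"{name}.{suffix}.txt.gz" for suffix in FINAL_SUFFIXES]
    return [p.name for p in expected if not p.is_file()]
```

With the directory listing posted above, where bc1d/final contains only chrM_refAllele.txt, `missing_final_outputs("bc1d", "bc1")` would report all five files as missing. An empty list after the re-run would indicate that the gather step completed.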
Also, can I use this tool on 10x single-end (3' or 5') scRNA-seq data? In that case the filtering method used in this tool (strand bias) would have to be changed, but I'm not sure whether that is possible.
Hello, I hit the same error you mentioned. Did you figure out what causes the problem and how to solve it? Also, what do you mean by "run again with -z option to the same directory"?