filesystem latency error

Open jeanzzhao opened this issue 2 years ago • 8 comments

One of my genome-grist runs, on 21 marine metagenome samples, failed with what snakemake reports as a possible filesystem latency error.

less /home/zyzhao/assloss/grist/marine21/.snakemake/log/2022-11-01T121117.233778.snakemake.log
...
    jobid: 5
    wildcards: sample=SRR12479851
    threads: 6
    resources: tmpdir=/tmp, mem_mb=40000

Activating conda environment: /home/zyzhao/assloss/grist/marine21/.snakemake/conda/ec46e841a679de5a50a2bf3cb2326bd9
Waiting at most 5 seconds for missing files.
MissingOutputException in line 482 of /home/zyzhao/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/Snakefile:
Job Missing files after 5 seconds:
outputs.marine21_samples/raw/SRR2053302_1.fastq.gz
outputs.marine21_samples/raw/SRR2053302_2.fastq.gz
outputs.marine21_samples/raw/SRR2053302_unpaired.fastq.gz
/tmp/tmp0h36lsbn/SRR2053302.d
/tmp/tmp0h36lsbn/SRR2053302.d/SRR2053302_1.fastq
/tmp/tmp0h36lsbn/SRR2053302.d/SRR2053302_2.fastq
/tmp/tmp0h36lsbn/SRR2053302.d/SRR2053302.fastq
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 150 completed successfully, but some output files are missing. 150
  • After upgrading snakemake with pip install -U snakemake, the re-run stopped at ~10 h:
less /home/zyzhao/assloss/grist/marine21/.snakemake/log/2022-11-03T115319.708459.snakemake.log
...
MissingOutputException in line 482 of /home/zyzhao/miniconda3/envs/grist/lib/python3.9/site-packages/genome_grist/conf/Snakefile:
Job Missing files after 5 seconds:
outputs.marine21_samples/raw/SRR5917881_1.fastq.gz
outputs.marine21_samples/raw/SRR5917881_2.fastq.gz
outputs.marine21_samples/raw/SRR5917881_unpaired.fastq.gz
/tmp/tmpvgh47xjv/SRR5917881.d
/tmp/tmpvgh47xjv/SRR5917881.d/SRR5917881_1.fastq
/tmp/tmpvgh47xjv/SRR5917881.d/SRR5917881_2.fastq
/tmp/tmpvgh47xjv/SRR5917881.d/SRR5917881.fastq
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 217 completed successfully, but some output files are missing. 217
A different SRR accession, but the same MissingOutputException in line 482 of...
  • SRR5917881 was removed from the config file for the re-run
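
To compare failures across runs, the missing paths can be pulled out of each log automatically. The following is a minimal sketch, assuming the log lines look exactly like the excerpts above ("Job Missing files after ...", then one path per line, then "Job id: N completed successfully ..."). The function name missing_outputs is hypothetical, and other snakemake versions may word these messages differently.

```python
import re


def missing_outputs(log_text):
    """Map job id -> list of missing paths, based on the log layout shown above."""
    results = {}
    pending = None  # paths collected for the block currently being read
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Job Missing files after"):
            pending = []
        elif pending is not None:
            match = re.match(r"Job id: (\d+) completed successfully", stripped)
            if match:
                # The "Job id" line closes the block and names the job.
                results[int(match.group(1))] = pending
                pending = None
            elif stripped and not stripped.startswith("This might be due to"):
                pending.append(stripped)
    return results
```

Calling missing_outputs(open(path).read()) on each of the two log files would show whether the same kinds of outputs (raw fastq.gz files plus the /tmp scratch directory) are missing in both runs.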

jeanzzhao avatar Nov 14 '22 02:11 jeanzzhao

I'm having trouble reproducing this now; not sure what changed! Very frustrating - I had no problems reproducing it before 😆

ctb avatar Dec 05 '22 15:12 ctb

hi @jeanzzhao I've been unable to reproduce this error; not sure where or how it was fixed but 🤷

I've just released genome-grist v0.9.2. pip install -U genome-grist should upgrade. If you can give it a try on the troublesome samples, I'd appreciate it!

ctb avatar Dec 06 '22 14:12 ctb

(I think this is the wrong issue for skip_genomes?)

Also, when you paste error messages from snakemake please paste more of the log - the above doesn't contain any actual error messages, just the part where snakemake notices there's an error!

ctb avatar Dec 07 '22 14:12 ctb

what's the directory, set of commands, and conda environment you're using? thx!

ctb avatar Dec 08 '22 19:12 ctb

sorry, some of my previous comments should have been posted on a different issue (skip_genome); let me remove them.

jeanzzhao avatar Dec 08 '22 21:12 jeanzzhao

Dug into one sample, SRR11593045, with @bluegenes. The error is in the log file below: /home/zyzhao/assloss/grist/soil/.snakemake/log/2022-11-24T070134.689543.snakemake.log

Job Missing files after 5 seconds:
outputs.soil_samples/sigs/SRR11593045.trim.sig.zip
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
Job id: 130 completed successfully, but some output files are missing. 130

The _1 and _2 fastq.gz files are tiny (20 bytes each):

/home/zyzhao/assloss/grist/soil/outputs.soil_samples/raw$ ls -alh SRR11593045*
-r--r--r-- 1 zyzhao zyzhao   20 Nov 24 07:05 SRR11593045_1.fastq.gz
-r--r--r-- 1 zyzhao zyzhao   20 Nov 24 07:05 SRR11593045_2.fastq.gz
-r--r--r-- 1 zyzhao zyzhao 5.8M Nov 24 07:05 SRR11593045_unpaired.fastq.gz

SRR11593045_1.fastq.gz looks weird:

less SRR11593045_1.fastq.gz

^_<8B>^H^@4<88>^?c^@^C^C^@^@^@^@^@^@^@^@^@
SRR11593045_1.fastq.gz (END)
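
The 20-byte size and the bytes shown by less (the gzip magic bytes, then mostly zeros) are consistent with a valid gzip stream that contains no data at all. A minimal sketch for flagging such files is below. The helper names gzip_is_empty and find_empty_gzips are hypothetical, and the "*.fastq.gz" pattern is an assumption taken from the file names in the listing above.

```python
import gzip
from pathlib import Path


def gzip_is_empty(path):
    """Return True if the gzip file at `path` decompresses to zero bytes."""
    with gzip.open(path, "rb") as handle:
        return handle.read(1) == b""


def find_empty_gzips(directory, pattern="*.fastq.gz"):
    """List files in `directory` matching `pattern` that hold no data."""
    return sorted(p for p in Path(directory).glob(pattern) if gzip_is_empty(p))
```

Running find_empty_gzips("outputs.soil_samples/raw") would then be expected to report the _1 and _2 files for this sample but not the unpaired one.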

zless SRR11593045_unpaired.fastq.gz

[Screenshot: Screen Shot 2022-12-08 at 12 33 20 PM]

Downloaded the file from NCBI: https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR11593045&display=download

fastq:

@SRR11593045.1.1 1 length=395
TAAGACGTAGGGGGCCAGCGTTGCTCGGAATTACTGGGTGTAAAGGGTTCGTAGGCGGTGCGGCAAGTTGGGAGTGAAATCTCTGGGCTCAACCCAGAGACGGCTTCCAAAACTGCTGTGCTTGAGTGTGAGAGAGGCTCGTGGAATTGCAGGTGTAGCGGTGAAATGCGTAGAGATGCGGAGGAACACCGATGGCGAAGGCAGCCCCCTGGGCTAGCACTGACGCTCAGGCACGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCCTAAACTATGTCAACTGGGTGTTCGGGAAGCGATTTCTGAGTACCGTAGCTAACGCGTGAAGTTGACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTTAAAGGAATTGACGG
+SRR11593045.1.1 1 length=395
???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
@SRR11593045.2.1 2 length=396
TAATACGTAGGCAGCGAGCGTTGTTCGGAGTTACTGGGCGTAAAGGGTGTGTAGGCGGTTGTTTAAGTTTGGTGTGAAATCTCCCGGCTCAACTGGGAGGGTGTGCCGAATACTGAATGACTTCGAGTGCGGGAGAGGAAAGTGGAATTCCTGGTGTAGCGGTGAAATGCGTAGATATCAGGAGGAACACCGGTGGTGTAGACGGCTTTCTGGACCGTAACTGACGCAGAGACACGAAAGCGTGGGTAGCAAACAGGATTAGAGACCCTGGAAGTCCACTCCCTAAACGATGCATATTTGGTGTGGGCAGTTCATTCTGTCCGTGCCGGAGCTAACGCGTTAAATATGCCGCCTGGGGAGTACAGTCGCAATGCTGAAACTTAAATGAATTGACGG
+SRR11593045.2.1 2 length=396
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
@SRR11593045.3.1 3 length=398
TAATACAGAGGGTGCAAGCGTTGTTCGGAATTATTGGGCGTAAAGGGCGCGTAGGCGGCGCGACAAGTCACCTGTGAAATCCCCGGGCTTAACTCGGGGCCTGCAGGCGAAACTGTCGTGCTGGAGTATGGGAGAGGTGCGTGGAATTCCCGGTGTAGCGGTGAAATGCGTAGATATCGGGAGGAACACCTGCGGCGAAGGCGGGTTGCTGGGCCGACACTGACGCTGATGCGCGAAAGCCAGGGGAGCGAACGGGATTAGATACCCCGGTAGTCCTGGCCTTAAACGATGGATGCTTGGTGTCTGGGGTTTTATAGTCCCCGGGTGCCGCAGCTAACGCGTTAAGCATCCCGCCTGGGGAGTACGGTCGCAAGACTGAAACTCAAATGAATTGACGG
+SRR11593045.3.1 3 length=398
??????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
...

fasta:

@SRR11593045.1.1 1 length=395
TAAGACGTAGGGGGCCAGCGTTGCTCGGAATTACTGGGTGTAAAGGGTTCGTAGGCGGTGCGGCAAGTTGGGAGTGAAATCTCTGGGCTCAACCCAGAGACGGCTTCCAAAACTGCTGTGCTTGAGTGTGAGAGAGGCTCGTGGAATTGCAGGTGTAGCGGTGAAATGCGTAGAGATGCGGAGGAACACCGATGGCGAAGGCAGCCCCCTGGGCTAGCACTGACGCTCAGGCACGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCCTAAACTATGTCAACTGGGTGTTCGGGAAGCGATTTCTGAGTACCGTAGCTAACGCGTGAAGTTGACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTTAAAGGAATTGACGG
@SRR11593045.2.1 2 length=396
TAATACGTAGGCAGCGAGCGTTGTTCGGAGTTACTGGGCGTAAAGGGTGTGTAGGCGGTTGTTTAAGTTTGGTGTGAAATCTCCCGGCTCAACTGGGAGGGTGTGCCGAATACTGAATGACTTCGAGTGCGGGAGAGGAAAGTGGAATTCCTGGTGTAGCGGTGAAATGCGTAGATATCAGGAGGAACACCGGTGGTGTAGACGGCTTTCTGGACCGTAACTGACGCAGAGACACGAAAGCGTGGGTAGCAAACAGGATTAGAGACCCTGGAAGTCCACTCCCTAAACGATGCATATTTGGTGTGGGCAGTTCATTCTGTCCGTGCCGGAGCTAACGCGTTAAATATGCCGCCTGGGGAGTACAGTCGCAATGCTGAAACTTAAATGAATTGACGG
@SRR11593045.3.1 3 length=398
TAATACAGAGGGTGCAAGCGTTGTTCGGAATTATTGGGCGTAAAGGGCGCGTAGGCGGCGCGACAAGTCACCTGTGAAATCCCCGGGCTTAACTCGGGGCCTGCAGGCGAAACTGTCGTGCTGGAGTATGGGAGAGGTGCGTGGAATTCCCGGTGTAGCGGTGAAATGCGTAGATATCGGGAGGAACACCTGCGGCGAAGGCGGGTTGCTGGGCCGACACTGACGCTGATGCGCGAAAGCCAGGGGAGCGAACGGGATTAGATACCCCGGTAGTCCTGGCCTTAAACGATGGATGCTTGGTGTCTGGGGTTTTATAGTCCCCGGGTGCCGCAGCTAACGCGTTAAGCATCCCGCCTGGGGAGTACGGTCGCAAGACTGAAACTCAAATGAATTGACGG
...

The NCBI file seems weird. Update 12/14/22: this is an RNA-seq sample.
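
To put numbers on what looks odd in the excerpt above (every read carries a single-number label with no mate, and every quality character is "?"), a FASTQ file can be summarized with a few lines of Python. This is a rough sketch that assumes plain 4-line FASTQ records. The names read_fastq and summarize_fastq are hypothetical and not part of genome-grist.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence, quality) from a text handle of 4-line FASTQ records."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        header, seq, plus, qual = record
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch: {header!r}")
        yield header[1:], seq, qual


def summarize_fastq(handle):
    """Count reads, track read lengths, and note whether quality is one constant symbol."""
    count = 0
    lengths = []
    quality_chars = set()
    for _name, seq, qual in read_fastq(handle):
        count += 1
        lengths.append(len(seq))
        quality_chars.update(qual)
    return {
        "reads": count,
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "uniform_quality": len(quality_chars) == 1,
    }
```

For a gzipped file, the handle can be opened with gzip.open(path, "rt") before passing it to summarize_fastq.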

jeanzzhao avatar Dec 08 '22 21:12 jeanzzhao

Re-run going for ~3h now, will update, thank you! @ctb

jeanzzhao avatar Dec 09 '22 01:12 jeanzzhao

The re-run failed, though with a different error (in rule samtools_mpileup_wc):

farm:~/assloss/grist/marine21/.snakemake/log$ less 2022-12-08T155647.533690.snakemake.log

Error in rule samtools_mpileup_wc:
    jobid: 22889
    output: outputs.marine21_samples/leftover/SRR5915428.x.GCA_902550555.1.bcf, outputs.marine21_samples/leftover/SRR5915428.x.GCA_902550555.1.vcf.gz, outputs.marine21_samples/leftover/SRR5915428.x.GCA_902550555.1.vcf.gz.csi
...

jeanzzhao avatar Dec 09 '22 20:12 jeanzzhao