
Error in rule pair_end_bed with large chromosomes.

Open yuanhelianyi opened this issue 4 months ago • 1 comments

Hi Mitchell,

I'm trying to run StainedGlass on a single chromosome that is 496 Mb long. When I run the following:

nohup snakemake --nt \
                --cores 12 \
                --config sample=$chr \
                         fasta=/data6/$chr/$chr.fasta \
                         window=2000 \
                         nbatch=7 \
                         alnthreads=12 \
                         mm_f=10000 \
>stainedglass.out 2>&1 &

I get this error:

Error in rule pair_end_bed:
    message: None
    jobid: 22
    input: temp/CM01.2000.10000.tbl.gz, /data6/CM01/CM01.fasta.fai
    output: results/CM01.2000.10000.bed.gz, results/CM01.2000.10000.full.tbl.gz
    log: logs/pair_end_bed.CM01.2000.10000.log (check log file(s) for error details)
    conda-env: /data6/stainedglass/CM01/.snakemake/conda/8e6ed08619c6fe6f9766f8006374821f_
    shell:
        
        python /data5/.cache/snakemake/snakemake/source-cache/runtime-cache/tmpv7vyq2_m/file/data6/bin/StainedGlass/workflow/scripts/refmt.py             --window 2000 --fai /data6/CM01/CM01.fasta.fai             --full results/CM01.2000.10000.full.tbl.gz                          temp/CM01.2000.10000.tbl.gz results/CM01.2000.10000.bed.gz
        
        (command exited with non-zero exit code)
Logfile logs/pair_end_bed.CM01.2000.10000.log: empty file
Complete log(s): /data6/.snakemake/log/2025-08-25T135232.400702.snakemake.log
WorkflowError:
At least one job did not complete successfully.

I found that the error was caused by "numpy.core._exceptions._ArrayMemoryError: Unable to allocate 23.3 GiB for an array with shape (6, 520971401) and data type int64".
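The 23.3 GiB figure follows directly from the shape and data type in that message: 6 x 520971401 elements at 8 bytes each. A minimal sketch of the arithmetic, where the helper name `array_gib` is hypothetical and not part of StainedGlass:

```python
import math

INT64_BYTES = 8  # an int64 element occupies 8 bytes


def array_gib(shape, itemsize):
    """Return the memory a dense array of this shape needs, in GiB."""
    n_items = math.prod(shape)          # total number of elements
    return n_items * itemsize / 2**30   # bytes -> GiB


# Shape and dtype taken from the error message above.
needed = array_gib((6, 520971401), INT64_BYTES)
print(f"{needed:.1f} GiB")
```

This only counts the single array named in the error; the script's peak memory use may well be higher.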

Have you encountered this before, and do you have a solution?

Best, Jing

yuanhelianyi, Aug 27 '25 03:08

Hi Jing,

This looks like an out-of-memory error. Unfortunately, a chunk of my code is pretty inefficient with memory, so the solution is just more RAM.

But if you are willing to try a new tool, there is now a great alternative that is less resource-intensive: https://github.com/marbl/ModDotPlot

Cheers, Mitchell

mrvollger, Sep 10 '25 00:09