IsoQuant visualize.py memory issue

Open sme229 opened this issue 7 months ago • 8 comments

Hello,

I'm trying to run the visualize.py script as described here https://ablab.github.io/IsoQuant/visualization.html on Nanopore single-cell sequencing data. I'm not sure how to optimize the requested resources, as even 4T of memory gives me an 'out of memory' error. Here is my script:

#!/bin/bash
#SBATCH --job-name=isoq_visualise
#SBATCH --mem=3T
#SBATCH --time=48:00:00

module load python
python IsoQuant/visualize.py counts_cDNA2_dedup --gene_list genes_short.txt --viz_output IsoQuant_visualisation

Here is the error message:

/home/sme229/.local/lib/python3.9/site-packages/matplotlib/projections/__init__.py:63: UserWarning: Unable to import Axes3D. This may be due to multiple versions of Matplotlib being installed (e.g. as a system package and as a pip package). As a result, the 3D projection is not available.
  warnings.warn("Unable to import Axes3D. This may be due to multiple versions of "
/home/sme229/.local/lib/python3.9/site-packages/pandas/core/computation/expressions.py:21: UserWarning: Pandas requires version '2.8.4' or newer of 'numexpr' (version '2.7.3' currently installed).
  from pandas.core.computation.check import NUMEXPR_INSTALLED
/home/sme229/.local/lib/python3.9/site-packages/pandas/core/arrays/masked.py:60: UserWarning: Pandas requires version '1.3.6' or newer of 'bottleneck' (version '1.3.2' currently installed).
  from pandas.core import (
/cm/local/apps/slurm/var/spool/job5696671/slurm_script: line 11: 28351 Killed python IsoQuant/visualize.py counts_cDNA2_dedup --gene_list genes.txt --viz_output IsoQuant_visualisation
slurmstepd: error: Detected 1 oom_kill event in StepId=5696671.batch. Some of the step tasks have been OOM Killed.

sme229 avatar May 21 '25 22:05 sme229

Dear @sme229

Thanks for the report. This looks odd; could you attach the full log?

Tagging @jackfreeman88

Best Andrey

andrewprzh avatar May 22 '25 13:05 andrewprzh

Hi @andrewprzh

Thanks so much for your response. Please find attached the slurm log file, it's the only output file I get.

slurm-5872528.txt

Also here is a screenshot of my .txt file with a list of genes that I used as input:

Image

sme229 avatar May 23 '25 21:05 sme229

@sme229 Could you also send isoquant.log just in case?

andrewprzh avatar May 26 '25 09:05 andrewprzh

Hi @andrewprzh I don't have isoquant.log; it doesn't get written. The only output I get is the slurm txt file.

sme229 avatar May 26 '25 09:05 sme229

@sme229 What about the IsoQuant output folder (counts_cDNA2_dedup)? Is there a log file there?

andrewprzh avatar May 26 '25 09:05 andrewprzh

@andrewprzh yes, please find attached. This log file is from processing the bam file.

isoquant.log

sme229 avatar May 26 '25 09:05 sme229

@sme229

I think I know the reason. You use the --read_group tag:CB option, which creates large count tables with per-cell counts. The visualizer was developed to compare bulk conditions and has never been tested on single-cell data, so I guess it cannot cope and runs out of memory. The number of "conditions" is far too large for it.

To visualize your results, I suggest running without the --read_group option (i.e. pseudobulk) or grouping reads by cell type, so that the counts file has a few dozen columns (per-cell-type counts) rather than the thousands you have now. Let me know if you need any help generating those.
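For illustration only, here is a minimal Python sketch of the "group by cell type" idea applied to an already generated per-cell table. It is not part of IsoQuant. It assumes a tab-separated table of raw counts with feature IDs in the first column and one column per cell barcode, plus a barcode-to-cell-type mapping you supply yourself. The function name collapse_by_group, the toy barcodes, and the cell-type labels are all hypothetical.

```python
import io

import pandas as pd


def collapse_by_group(counts: pd.DataFrame, barcode_to_group: dict) -> pd.DataFrame:
    """Sum per-barcode count columns into one column per group (e.g. cell type).

    Hypothetical helper: assumes `counts` has features as rows and
    barcodes as columns, and that the values are raw (additive) counts.
    """
    groups = pd.Series(barcode_to_group)
    # Keep only barcodes that have a group label; unlabelled cells are dropped.
    labelled = counts.loc[:, counts.columns.isin(groups.index)]
    # One label per remaining column, in column order.
    labels = groups.reindex(labelled.columns).to_numpy()
    # Transpose so barcodes are rows, sum rows sharing a label, transpose back.
    return labelled.T.groupby(labels).sum().T


# Toy stand-in for a per-cell counts table (assumed layout, not real data).
toy_tsv = (
    "feature_id\tAAAC\tAAAG\tAACT\tAAGT\n"
    "tx1\t1\t0\t2\t0\n"
    "tx2\t0\t3\t0\t1\n"
)
toy = pd.read_csv(io.StringIO(toy_tsv), sep="\t", index_col=0)

# Hypothetical barcode annotation; AAGT has no label and is dropped.
cell_types = {"AAAC": "T_cell", "AAAG": "T_cell", "AACT": "B_cell"}

collapsed = collapse_by_group(toy, cell_types)
print(collapsed)
# To save the result: collapsed.to_csv("counts_by_cell_type.tsv", sep="\t")
```

Summing is only meaningful for raw counts, not for normalized values such as TPM. Whether visualize.py accepts a table edited this way is not confirmed in this thread, so re-running IsoQuant with reads grouped by cell type, as suggested above, is the safer route.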

Also, I recommend updating to the latest version; it has better performance and fixes some important bugs.

Best Andrey

andrewprzh avatar May 26 '25 10:05 andrewprzh

@andrewprzh thank you for your help with this. I'll re-run the analysis as pseudobulk.

sme229 avatar May 27 '25 21:05 sme229