pySCENIC
pySCENIC running GRN multiple times
Hi,
So I'm not sure if this is a bug or not. However, when I run GRN on a large dataset, it completes the infer partial partitions step, then restarts and starts inferring partitions again. I'm not sure whether this is intended.
Hi @LinearParadox
Sorry for the late reply. Could you show the output of your command?
Best, Seppe
I think I've deleted/migrated a lot of the stuff on the machine unfortunately. I can describe it:
The job stops normally, and it properly outputs the adjacency list. For some reason, though, none of the steps after GRN ever start. For further context, I realized this happens across SCENIC Nextflow pipelines, not just multiple runs of GRN.
I have gotten this error before. Assuming you're using Dask, this is due to the last step of GRN, which can overload a single Dask worker's memory (memory usage for that one worker can increase drastically, from what I have seen). The issue is that if a Dask worker's memory usage reaches a threshold (95% of the memory limit by default), the worker is automatically restarted to avoid crashing. Because that worker held the information needed to create the final output file, and its memory contents are lost on restart, Dask restarts the entire process from scratch.
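To make the threshold concrete, here is a small arithmetic sketch of where the restart would kick in for a few per-worker limits. It only uses the 95% default mentioned above; the function name and the example limits are made up for illustration and are not part of Dask or pySCENIC.

```python
# Illustrative only: 0.95 is the default restart fraction described above.
# The function name and the example limits are hypothetical.
TERMINATE_FRACTION = 0.95
GIB = 1024 ** 3


def restart_threshold_bytes(memory_limit_bytes: int,
                            fraction: float = TERMINATE_FRACTION) -> int:
    """Return the memory usage (in bytes) at which a worker would be restarted."""
    return int(memory_limit_bytes * fraction)


# Show how the restart point scales with the per-worker memory limit.
for limit_gib in (4, 16, 64):
    threshold = restart_threshold_bytes(limit_gib * GIB)
    print(f"limit {limit_gib} GiB -> restart at about {threshold / GIB:.2f} GiB")
```

If the final GRN step needs more memory on one worker than this threshold allows, that worker gets restarted no matter how many times the job is retried.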
Of course, so long as the memory limits for the Dask workers remain the same, GRN will restart every single time until someone manually stops this infinite loop. Therefore, I have found that the key to solving this issue is simply to increase the memory limit of the individual Dask workers.
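As a rough sketch of what raising the limit can look like, assuming GRN is run from Python through arboreto with a local Dask cluster (this may not match how the Nextflow pipeline sets things up): the helper names, the 64 GiB budget, and the worker counts below are hypothetical, while `LocalCluster(memory_limit=...)` and the `client_or_address` argument of `grnboost2` are the real knobs being illustrated.

```python
GIB = 1024 ** 3


def per_worker_memory_limit(total_memory_bytes: int, n_workers: int) -> int:
    """Split a total memory budget evenly across workers (hypothetical helper)."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return total_memory_bytes // n_workers


def start_client(n_workers: int, memory_limit_bytes: int):
    """Start a local Dask cluster with an explicit per-worker memory limit."""
    # Imported lazily so the helper above can be used without Dask installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(
        n_workers=n_workers,
        threads_per_worker=1,
        memory_limit=memory_limit_bytes,  # applies to each worker, not the whole cluster
    )
    return Client(cluster)


# Same 64 GiB budget: fewer workers means a higher limit per worker.
print(per_worker_memory_limit(64 * GIB, 16) // GIB)  # 4
print(per_worker_memory_limit(64 * GIB, 4) // GIB)   # 16

# Usage with arboreto (not run here; expression_matrix and tf_names are placeholders):
# from arboreto.algo import grnboost2
# client = start_client(4, per_worker_memory_limit(64 * GIB, 4))
# adjacencies = grnboost2(expression_matrix, tf_names=tf_names, client_or_address=client)
```

The trade-off is that on a fixed machine, a higher per-worker limit usually means running fewer workers, so the earlier inference steps may take longer in exchange for the final step fitting in memory.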