
Possible solutions for GRNBoost2/GENIE3 Dask issues

Open cflerin opened this issue 5 years ago • 2 comments

A recurring problem is that the GRN inference step of pySCENIC (which uses Arboreto's GRNBoost2/GENIE3 implementation) fails to complete. This appears to be caused by newer Dask releases that are incompatible with the existing GRNBoost2/GENIE3 implementation.

Possible errors

  • ValueError: Metadata mismatch found in from_delayed
  • Expected partition of type DataFrame but got NoneType
  • ValueError: tuple is not allowed for map key
  • ...
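
To check quickly whether a failed run matches one of the messages listed above, the log can be scanned for those fragments. This is only a rough sketch: the helper name `find_known_dask_errors` is made up for illustration, and it assumes the log contains the messages as worded above, apart from backticks or quotes around type names, which are stripped before matching.

```python
# Error fragments copied from the "Possible errors" list above.
KNOWN_DASK_ERRORS = (
    "Metadata mismatch found in from_delayed",
    "Expected partition of type DataFrame but got NoneType",
    "tuple is not allowed for map key",
)


def find_known_dask_errors(log_text):
    """Return the known error fragments that occur in log_text.

    Hypothetical helper, not part of pySCENIC or Arboreto.
    Backticks and quotes are removed first, because real tracebacks
    may wrap type names in them (an assumption about log formatting).
    """
    cleaned = log_text.replace("`", "").replace("'", "").replace('"', "")
    return [msg for msg in KNOWN_DASK_ERRORS if msg in cleaned]


if __name__ == "__main__":
    sample = "ValueError: tuple is not allowed for map key"
    print(find_known_dask_errors(sample))
```

An empty result does not rule out a Dask problem, since the list above is explicitly incomplete.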

Possible solutions

  1. In many cases using an older version of the dask/distributed packages can help to fix this. This is ideally accomplished using the Docker images, which already contain the stable versions of these packages (see here for usage details). Or, to install these via pip:
    pip install dask==1.0.0 distributed'>=1.21.6,<2.0.0'
    
     • Alternatively, some users have reported that upgrading to the newest version of Dask can resolve this as well (#147).
  2. Another option is to use a helper script (arboreto_with_multiprocessing.py) that runs the Arboreto GRN algorithms (GRNBoost2, GENIE3) without Dask for compatibility. See here, or the basic usage is:

    arboreto_with_multiprocessing.py \
        expr_mat.loom \
        allTFs_hg38.txt \
        --output adj.tsv \
        --num_workers 20
    
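
Before choosing between the options above, it can help to see whether the installed dask/distributed versions already fall inside the ranges from the pip command (dask==1.0.0, distributed>=1.21.6,<2.0.0). The following is a minimal sketch using only the standard library; the function names are hypothetical, and the version parsing is deliberately simple (it reads only the leading numeric release part and ignores suffixes such as rc1).

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '1.21.6' into (1, 21, 6), padded or truncated to 3 parts.

    Simplified parser for illustration; pre-release suffixes are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    numbers = [int(p) for p in match.group().split(".")] if match else []
    return tuple((numbers + [0, 0, 0])[:3])


def dask_matches_pin(installed):
    """True if the version equals the pinned dask==1.0.0."""
    return parse_version(installed) == (1, 0, 0)


def distributed_matches_pin(installed):
    """True if the version satisfies distributed>=1.21.6,<2.0.0."""
    return (1, 21, 6) <= parse_version(installed) < (2, 0, 0)


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    checks = (("dask", dask_matches_pin), ("distributed", distributed_matches_pin))
    for name, check in checks:
        found = installed_version(name)
        if found is None:
            print(f"{name}: not installed")
        else:
            status = "matches the pin" if check(found) else "outside the pinned range"
            print(f"{name} {found}: {status}")
```

If either package is outside the pinned range, that suggests either the pip downgrade or the Dask-free helper script is worth trying.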

cflerin avatar May 01 '20 07:05 cflerin

Hello @cflerin, this is a bug report for a problem that may be caused by Dask. The command pyscenic grn {EXP_MTX_QC_FNAME} {HUMAN_TFS_FNAME} -o {ADJACENCIES_FNAME} --num_workers 16 only works with --num_workers 16. If num_workers is greater than 16, then regardless of the number of cells or genes, the GRN step either hangs forever or produces the error "Worker exceeded 95% memory budget. Restarting". We could reproduce this with cell counts from 2000 to 40000, 16 to 40 CPU cores, and 64GB to 128GB of memory, on both Mac and Windows. A similar issue is here: https://github.com/aertslab/pySCENIC/issues/314 Thanks! Best, YJ

hyjforesight avatar May 09 '22 22:05 hyjforesight

Hi @hyjforesight , You should create a new bug report and include all of the requested info on package versions. Having this info will make it much easier to address your issue.
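
One way to gather version information for such a report is a short script like the one below. It is a sketch only: the package list is an assumption about what is relevant here, not the official list from the bug report template, and `collect_versions` is a hypothetical helper name.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Assumed list of relevant packages; adjust to what the bug report template asks for.
PACKAGES = ("pyscenic", "arboreto", "dask", "distributed", "numpy", "pandas")


def collect_versions(packages=PACKAGES):
    """Return a dict with Python, platform, and package version strings."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

The printed lines can be pasted directly into the new issue.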

cflerin avatar May 10 '22 14:05 cflerin