
dot never (well, 6 days so far) finishes rendering the graph

Open yarikoptic opened this issue 3 years ago • 3 comments

What happened?

@michael-sun ran the fMRIPrep 21.0.1 Singularity container (built from the Docker image) on our local HPC and was surprised that the job had not finished after a week. We checked the compute node and discovered that dot was still running!

f003z4j  153080 99.7  0.3 1481260 1296420 ?     R    Sep23 9881:59             dot -Tsvg -o/dartfs-hpc/scratch/f003z4j/fmriprep-work/work-SID001651/fmriprep_wf/graph.svg /dartfs-hpc/scratch/f003z4j/fmriprep-work/work-SID001651/fmriprep_wf/graph.dot

That graph.dot, renamed and compressed, is: graph-bigfancy.dot.gz

I started running it locally with dot from graphviz 2.42.2-7: minutes so far with no completion. This might be an issue to file with graphviz, or something about the .dot file that needs fixing.

Anyway, I think it would be useful to establish some kind of upper-bound timeout for the invocation of dot. It might be valuable to have a placeholder "could not render" graph.svg used in this case instead of the actual graph, and to issue a warning instead of halting the computation or erroring out completely. Although I could be proven wrong.
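To make the suggestion concrete, here is a minimal sketch of such a bounded invocation using only the Python standard library. It is not fMRIPrep or nipype code: the function name `render_graph`, the `PLACEHOLDER_SVG` content, the 600 s default, and the `cmd` override are all assumptions made up for illustration.

```python
import subprocess
import warnings
from pathlib import Path

# Hypothetical stand-in image written when the real graph is unavailable.
PLACEHOLDER_SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="460" height="60">'
    '<text x="10" y="35">Graph could not be rendered; see the log.</text>'
    "</svg>\n"
)


def render_graph(dot_file, svg_file, timeout_s=600, cmd=None):
    """Run dot with an upper time bound.

    Returns True if the command finished successfully, False if a
    placeholder SVG was written instead. `cmd` lets a caller substitute
    another command (assumed here only to ease testing).
    """
    svg_file = Path(svg_file)
    if cmd is None:
        # Same form as the dot call seen in the process listing above.
        cmd = ["dot", "-Tsvg", f"-o{svg_file}", str(dot_file)]
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # once timeout_s seconds have elapsed.
        subprocess.run(cmd, check=True, timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        warnings.warn(
            f"dot did not finish within {timeout_s} s; wrote a placeholder "
            f"to {svg_file}. Render {dot_file} manually if you need it."
        )
    except (OSError, subprocess.CalledProcessError) as exc:
        # dot is missing or exited with a non-zero status.
        warnings.warn(f"dot failed ({exc}); wrote a placeholder to {svg_file}.")
    svg_file.write_text(PLACEHOLDER_SVG)
    return False
```

With something like this, a hung dot costs at most `timeout_s` seconds and the rest of the pipeline continues with a warning rather than stalling for days.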

What command did you use?

Most likely not needed, since it seems to hang locally as well, but here it is -- just remove the `\ `s ;)


singularity run \ --cleanenv \ -B ${MAINDIR}:${MAINDIR} \ -B ${BIDSDIR},${PREPROCDIR},${SCRATCHDIR} \ -B /optnfs/freesurfer:/optnfs/freesurfer ${IMAGE} \ ${BIDSDIR} ${OUTDIR} participant \ --participant_label ${SUBJ} \ --ignore slicetiming \ --resource-monitor \ --bold2t1w-dof 9 \ --dummy-scans 6 \ --write-graph \ --notrack \ --fs-no-reconall \ --nprocs 8 \ --omp-nthreads 5 \ --nthreads 5 \ --mem_mb 60000 \ --fs-license-file /optnfs/freesurfer/6.0.0/license.txt \ --skip_bids_validation \ --output-spaces T1w MNI152NLin2009cAsym \ -w ${WORKDIR} \ --use-aroma --aroma-melodic-dimensionality -200 --bids-filter-file ${FILTER_DIR}/${SUBJ}.json 

What version of fMRIPrep are you running?

21.0.1

How are you running fMRIPrep?

Singularity

Is your data BIDS valid?

Yes

Are you reusing any previously computed results?

Work directory

Please copy and paste any relevant log output.

No response

Additional information / screenshots

No response

yarikoptic, Sep 30 '22 15:09

I would probably skip --write-graph on such large workflows. Probably the right thing to do is to set a timeout on graphviz and just print a warning that dot timed out and you can render the graph yourself if you want.

effigies, Sep 30 '22 15:09

This may be the culprit - https://github.com/nipy/nipype/issues/3526

mgxd, Oct 06 '22 19:10

Nothing to do here until nipype allows us to set a timeout and catch a TimeoutError.

effigies, Dec 03 '22 02:12