
Bad interaction with `--execute` and parallel code (blas)

Open rossbar opened this issue 8 months ago • 0 comments

When running `myst build --execute` for numpy-tutorials, we've noticed failed builds and hanging/zombie processes on some systems. Like all parallel execution issues, this is hard to pin down, but we have seen it on both x86 and arm systems. So far it has only been seen on machines with fewer CPUs than documents being executed, though this detail may not be relevant.

From trying several incantations, I'm fairly confident this is a bad interaction between the document execution engine and executable code that relies on parallelization. In the specific case of numpy-tutorials, I suspect BLAS is one of the culprits: setting `OMP_NUM_THREADS=1` or using `threadpoolctl` to limit the number of threads used in the notebooks works around the issue.
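For anyone hitting the same thing, here is a minimal sketch of the two workarounds mentioned above, as they might look in the first cell of a notebook. The helper names `limit_blas_threads` and `run_limited` are hypothetical, and the extra variables `OPENBLAS_NUM_THREADS` and `MKL_NUM_THREADS` are an assumption on my part (only `OMP_NUM_THREADS` was actually tried here). `threadpoolctl` is a third-party package, so the sketch falls back to a plain call when it is not installed.

```python
import os

# Thread-count variables read by common OpenMP/BLAS runtimes when they load.
# Only OMP_NUM_THREADS is from the report; the other two are assumptions.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_blas_threads(n=1, env=os.environ):
    """Set the thread-count variables in env and return what was set.

    This must run before numpy (or anything else linking BLAS) is imported,
    because the runtimes read these variables once at load time.
    """
    for name in THREAD_ENV_VARS:
        env[name] = str(n)
    return {name: env[name] for name in THREAD_ENV_VARS}


# Workaround 1: cap threads through the environment, before importing numpy.
limit_blas_threads(1)

# Workaround 2: cap threads at runtime with threadpoolctl, if it is installed.
try:
    from threadpoolctl import threadpool_limits
except ImportError:
    threadpool_limits = None


def run_limited(func, n=1):
    """Call func() with BLAS thread pools capped at n threads when possible."""
    if threadpool_limits is None:
        # threadpoolctl is missing: rely on the environment variables alone.
        return func()
    with threadpool_limits(limits=n, user_api="blas"):
        return func()
```

In a notebook, the BLAS-heavy part would then go through something like `run_limited(lambda: a @ b)`. Setting the variable in the shell instead (`OMP_NUM_THREADS=1 myst build --execute`) avoids touching the notebooks at all.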

Version info

$ myst -v
v1.2.5

Proposed solution

Just reporting to raise awareness; it'd be good to track down and understand exactly what is going on. If there are plans to support parallel document execution, it'd probably be good to give users a way to configure the number of workers.

rossbar, Jun 05 '24 19:06