Change NodeLocalTransformation Pool to use spawn instead of fork

Open bwintermann opened this issue 1 year ago • 0 comments

Prerequisites

Current main commit: db969e6

Quick summary

The use of mp.Pool in qonnx/transformation/base.py for NodeLocalTransformation can cause deadlocks when the transformation is called from a multithreaded program.

Details

While working on FINN, I found that calling HLSSynthIP() (which inherits from NodeLocalTransformation) in a multithreaded context can deadlock the worker processes of the multiprocessing pool. This is very likely caused by Python's start method defaulting to 'fork' on Linux: a forked child copies only the calling thread, so any lock held by another thread at fork time stays locked forever in the child. This is a well-known issue, and the usual fix is to change the start method, either globally with set_start_method("spawn") or locally with get_context("spawn").Pool(...).

Arguably, a transformation that is designed to be parallelized should not normally be run from multiple threads as well. However, Python 3.14 changes the default start method on Linux from fork to forkserver anyway, and explicitly selecting spawn for earlier versions should not have any significant negative impact and might prevent such deadlocks.
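To illustrate the local variant, here is a minimal sketch of a pool created from a spawn context instead of the module-level mp.Pool. The helper names spawn_context and map_nodes are hypothetical and are not taken from qonnx; math.sqrt only stands in for the per-node work.

```python
import math
import multiprocessing as mp


def spawn_context():
    # A context object scoped to "spawn"; the process-wide default
    # start method is left untouched.
    return mp.get_context("spawn")


def map_nodes(func, nodes, num_workers=2):
    # Hypothetical stand-in for the pool usage in NodeLocalTransformation:
    # apply func to every node in parallel and collect results in order.
    with spawn_context().Pool(num_workers) as pool:
        return pool.map(func, nodes, chunksize=1)


if __name__ == "__main__":
    # The guard matters with spawn: each worker re-imports this module
    # and must not start a pool of its own.
    print(map_nodes(math.sqrt, [1.0, 4.0, 9.0]))  # [1.0, 2.0, 3.0]
```

Using a context keeps the change local to the transformation, so other code in the same process that relies on the default start method is not affected. With spawn, the function passed to the pool has to be picklable and importable by the workers.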

bwintermann, Jun 06 '24 13:06