hypo_testing.py doesn't work with osc.prob3gpu and reco.vbwkde
Using the hypo_testing.py script with a pipeline containing osc.prob3gpu and reco.vbwkde delivers the following error:
pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle
This problem is probably related to https://github.com/WIPACrepo/pisa/blob/cake_pre_opensource/pisa/utils/gaussians.py :
# TODO: if the Numba CUDA functions are defined, then other CUDA (e.g. pycuda)
# code doesn't run (possibly only when Nvidia driver is set to
# process-exclusive or thread-exclusive mode). Need to fix this behavior. (E.g.
# context that gets destroyed?)
Is your GPU in process exclusive or thread exclusive mode?
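If it helps, here is a small sketch of one way to check the compute mode from Python. It assumes the `nvidia-smi` tool is on the PATH and uses its `--query-gpu=compute_mode` query; the helper name `gpu_compute_modes` is just made up for this example and is not part of PISA.

```python
import shutil
import subprocess


def gpu_compute_modes():
    """Return one compute-mode string per GPU, or None if it cannot be queried.

    Hypothetical helper (not part of PISA); relies on the nvidia-smi CLI.
    """
    # Bail out quietly on machines without the Nvidia tools installed
    if shutil.which('nvidia-smi') is None:
        return None
    result = subprocess.run(
        ['nvidia-smi', '--query-gpu=compute_mode', '--format=csv,noheader'],
        capture_output=True,
        text=True,
        check=False,
    )
    # A non-zero exit code usually means no driver or no visible GPU
    if result.returncode != 0:
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == '__main__':
    print(gpu_compute_modes())
```

Anything other than the default mode in that list would point to an exclusive-mode setup.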
It's in default compute mode:
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
Well, that's the "best" mode, so it's not due to being in process- or thread-exclusive mode.
I don't have time to fully track this down (I've tried in the past, as that comment indicates, without success). So if someone else would like to fix this, that'd be great. In the meantime, a workaround is to disable the Numba CUDA functions by making sure NUMBA_CUDA_AVAIL=False has the last say on that variable. To wit, in $PISA/pisa/__init__.py, comment out the following lines:
#try:
#    from numba import cuda
#    assert cuda.gpus, 'No GPUs detected'
#    cuda.jit('void(float64)')(dummy_func)
#except Exception:
#    pass
#else:
#    NUMBA_CUDA_AVAIL = True
#finally:
#    if 'cuda' in globals() or 'cuda' in locals():
#        if NUMBA_CUDA_AVAIL:
#            cuda.close()
#        del cuda
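As an alternative to commenting code in and out by hand, the same idea could be sketched with an opt-out switch. This is only an illustration, not what PISA does: the environment variable name PISA_DISABLE_NUMBA_CUDA and the helper numba_cuda_available are hypothetical. The only Numba call used is numba.cuda.is_available().

```python
import os


def numba_cuda_available(force_disable=False):
    """Return True only if Numba CUDA is usable and not force-disabled.

    Hypothetical helper, not part of PISA.
    """
    # When disabled, return before Numba's CUDA module is even imported
    if force_disable:
        return False
    try:
        from numba import cuda
        return bool(cuda.is_available())
    except Exception:
        # Numba missing, no driver, no GPU, etc.
        return False


# PISA_DISABLE_NUMBA_CUDA is an assumed variable name, for illustration only
FORCE_OFF = os.environ.get('PISA_DISABLE_NUMBA_CUDA', '0') == '1'
NUMBA_CUDA_AVAIL = numba_cuda_available(force_disable=FORCE_OFF)
```

With something like this, running with PISA_DISABLE_NUMBA_CUDA=1 set would leave NUMBA_CUDA_AVAIL as False without touching Numba's CUDA module. Whether that is enough to avoid the PyCUDA error is untested here.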
Note that with PISA pi, I think that Philipp is abandoning PyCUDA altogether, so this might be "fixed" when we move over to that. (Also just a workaround, not really a fix.)
Not sure if this would be helpful for finding a fix for this problem, but posting the link here for posterity, at least. It shows interoperability between OpenGL, PyCUDA, and Numba CUDA. (It shares memory between these; it's not clear to me at first glance how GPU contexts are handled, though.) https://gist.github.com/sklam/2ff89e40721d1f1a007449f02aee3990
Hi everyone, I'm wondering if there is a solution for this one on the horizon? As far as I understand, @jllanfranchi's workaround disables the GPU for both the oscillation and reco stages? Is there a way to at least use GPU-accelerated oscillations and do the VBWKDE on the CPU? I think in older versions this was possible.
The reason I bring this up is that @JanWeldert is now running pseudo-experiments, and one of them takes around 1 hour (3 fits) to converge.
Nothing here, sorry. Help from others is much appreciated on this issue; I don't have the time (or the knowledge) to fix this.
Did NUMBA_CUDA_AVAIL=False not do the trick? Because it sounds to me like this should do exactly what you're looking for (?)
Does this still pose an issue, @JanWeldert? Otherwise feel free to get rid of this issue, please :pray: