pisa hypo_testing.py doesn't work with osc.prob3gpu and reco.vbwkde

Using the hypo_testing.py script with a pipeline containing osc.prob3gpu and reco.vbwkde delivers the following error:

pycuda._driver.LogicError: cuFuncSetBlockShape failed: invalid resource handle

This problem is probably related to https://github.com/WIPACrepo/pisa/blob/cake_pre_opensource/pisa/utils/gaussians.py :

# TODO: if the Numba CUDA functions are defined, then other CUDA (e.g. pycuda)
# code doesn't run (possibly only when Nvidia driver is set to
# process-exclusive or thread-exclusive mode). Need to fix this behavior. (E.g.
# context that gets destroyed?)

Dec 05 '17 12:12 JanWeldert

Is your GPU in process exclusive or thread exclusive mode?

Dec 05 '17 13:12 jllanfranchi

It's in default compute mode:

  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
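
For anyone who wants to check the same thing on their own machine, the compute mode can also be read from a script. This is only a minimal sketch: it assumes nvidia-smi is on the PATH and supports the compute_mode query field, and the helper names (parse_compute_modes, query_compute_modes, is_exclusive) are made up for illustration and are not part of PISA.

```python
import shutil
import subprocess


def parse_compute_modes(smi_output):
    """Return one compute-mode string per GPU from nvidia-smi CSV output."""
    return [line.strip() for line in smi_output.splitlines() if line.strip()]


def is_exclusive(mode):
    """True for the exclusive modes (e.g. 'Exclusive_Process')."""
    return mode.lower().startswith("exclusive")


def query_compute_modes():
    """Ask nvidia-smi for the compute mode of every GPU.

    Returns an empty list if nvidia-smi is missing or the call fails.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_mode", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_compute_modes(result.stdout)


if __name__ == "__main__":
    for index, mode in enumerate(query_compute_modes()):
        note = " (exclusive)" if is_exclusive(mode) else ""
        print(f"GPU {index}: {mode}{note}")
```

If every GPU reports a non-exclusive mode, the exclusive-mode explanation from the TODO comment can be ruled out for that machine.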

Dec 05 '17 14:12 thehrh

Well, that's the "best" mode, so it's not due to being in process- or thread-exclusive mode.

Dec 05 '17 20:12 jllanfranchi

I don't have time to fully track this down (I tried in the past, as that comment indicates, without success). So if someone else would like to fix this, that'd be great. In the meantime, a workaround is to disable the Numba CUDA functions by making sure NUMBA_CUDA_AVAIL=False is the final value of that variable. To do so, in $PISA/pisa/__init__.py, comment out the following lines:

#try:
#    from numba import cuda
#    assert cuda.gpus, 'No GPUs detected'
#    cuda.jit('void(float64)')(dummy_func)
#except Exception:
#    pass
#else:
#    NUMBA_CUDA_AVAIL = True
#finally:
#    if 'cuda' in globals() or 'cuda' in locals():
#        if NUMBA_CUDA_AVAIL:
#            cuda.close()
#        del cuda
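
A less invasive variant of the same workaround, which avoids commenting code in and out, would be to let an environment variable veto the detection. The following is only a sketch of that idea, not what PISA does: the variable name PISA_DISABLE_NUMBA_CUDA and the function numba_cuda_available are hypothetical, and the probe is simplified to numba's cuda.is_available() instead of compiling a dummy kernel and closing the context as the original block does.

```python
import os


def numba_cuda_available(environ=None):
    """Decide whether Numba CUDA functions should be used.

    `environ` defaults to os.environ; passing a dict makes this testable.
    """
    environ = os.environ if environ is None else environ

    # HYPOTHETICAL switch, not read by PISA itself: setting it to "1"
    # forces the answer to False before numba.cuda is ever touched.
    if environ.get("PISA_DISABLE_NUMBA_CUDA", "0") == "1":
        return False

    try:
        from numba import cuda
        available = bool(cuda.is_available())
    except Exception:
        # numba is missing or the CUDA driver is unusable: treat as unavailable
        return False
    return available


# In pisa/__init__.py this would stand in for the try/except block above.
NUMBA_CUDA_AVAIL = numba_cuda_available()
```

With something like this in place, running with PISA_DISABLE_NUMBA_CUDA=1 set in the shell would have the same effect as commenting out the lines above, under the assumption that all other modules read the flag from pisa/__init__.py.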

Note that with PISA pi, I think Philipp is abandoning PyCUDA altogether, so this might be "fixed" when we move over to that. (That would also be just a workaround, not really a fix.)

Dec 05 '17 21:12 jllanfranchi

Not sure if this would be helpful for finding a fix for this issue, but posting the link here for posterity, at least. It shows interoperability between OpenGL, PyCUDA, and Numba CUDA. (It shares memory between these; it's not clear to me at first glance how GPU contexts are handled, though.) https://gist.github.com/sklam/2ff89e40721d1f1a007449f02aee3990

Dec 06 '17 19:12 jllanfranchi

Hi everyone, I'm wondering if there is a solution for this one on the horizon? As far as I understand, @jllanfranchi's workaround disables the GPU for both the oscillation and reco stages? Is there a way to at least use GPU-accelerated oscillations and do the VBWKDE on the CPU? I think in older versions this was possible.

Feb 06 '18 15:02 sboeser

The reason I bring this up is that @JanWeldert is now running pseudo-experiments, and one of them takes around 1 hour (3 fits) to converge.

Feb 06 '18 15:02 sboeser

Nothing here, sorry. Help from others is much appreciated on this issue; I don't have the time (or the knowledge) to fix this.

Feb 06 '18 15:02 jllanfranchi

Did NUMBA_CUDA_AVAIL=False not do the trick? Because it sounds to me like this should do exactly what you're looking for (?)

Feb 06 '18 15:02 philippeller

Does this still pose an issue, @JanWeldert? Otherwise, feel free to get rid of this issue, please :pray:

Jun 03 '24 15:06 LeanderFischer
