fixes from hackathon
Thanks! Can you remind me which branch of CLIMA caused these?
master, using
julia --project=env/gpu --check-bounds=yes -g2 test/DGmethods/compressible_Navier_Stokes/mms_bc_atmos.jl
I'm in the process of trying to make a simpler repro.
I have made a slightly simpler test case: https://github.com/climate-machine/CLIMA/tree/sb/gpu-repro, run with
julia --project=env/gpu --check-bounds=yes -g2 test/DGmethods/compressible_Navier_Stokes/mms_bc_repro.jl
What is odd is that I can do a couple of timesteps: if I change timeend=1 to timeend=3*dt it works, but it fails with timeend=4*dt.
Okay, I have it down to a single kernel: https://github.com/climate-machine/CLIMA/blob/sb/gpu-repro/test/DGmethods/compressible_Navier_Stokes/mms_bc_repro.jl
So I was able to track it down to my getproperty overload not being inlined. Explicitly forcing an inline seems to solve the problem (https://github.com/climate-machine/CLIMA/pull/331/files#diff-7cce5e775ee9a06d14ffffc181f8ab67R137).
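For anyone hitting the same thing, here is a minimal sketch of the pattern, not the actual code from the PR. The NamedView type and its field names are hypothetical, made up for illustration. It only shows where the @inline annotation goes on a getproperty overload that maps a symbol to an array index.

```julia
# Hypothetical wrapper type (an assumption for illustration, not the CLIMA type):
# `names` is a tuple of Symbols carried as a type parameter.
struct NamedView{names,A}
    data::A
end
NamedView{names}(data::A) where {names,A} = NamedView{names,A}(data)

# Ask the compiler to inline the overload into its callers.
# getfield is used internally to avoid recursing into getproperty.
@inline function Base.getproperty(v::NamedView{names}, sym::Symbol) where {names}
    data = getfield(v, :data)
    for i in 1:length(names)
        if names[i] === sym
            return data[i]
        end
    end
    error("unknown field")  # error path; stays in the code if the lookup is not resolved at compile time
end

state = NamedView{(:rho, :rhou, :rhoe)}([1.0, 2.0, 3.0])
println(state.rhou)
```

This runs on the CPU as written. Whether the annotation is enough for a given GPU kernel likely depends on how the call is nested, so treat it as a starting point.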
I can keep working on a reproducible case if you would like.
Hm, no, I think that won't be necessary. Thanks for sticking with it. It's weird that the inlining hack we are doing in GPUifyLoops didn't stick, but nested generated functions are odd in any case.
But just to confirm, you do need the changes in the PR to run with --check-bounds=yes?
I needed the changes from the PR to work in either case. I don't think the issue is the bounds checking; I suspect it might be storing the symbol in the error object.
Okay, here's a simpler one: https://gist.github.com/simonbyrne/6596d9094c4212ff5e888850c3ea2589 which gives a chain of
ERROR: a exception was thrown during kernel execution.
Run Julia on debug level 2 for device stack traces.
Just ran into this. It would definitely be nice to have. Or... a way to put @inbounds inside of the user's function.