Valentin Churavy
Thanks for bringing this up. The goal in #55 was indeed to match Clang (we were hunting down a performance gap). I agree that the fact that we use contract...
Related #63
We should probably define `threadIdx` etc. in GPUifyLoops; users currently still have to manually do `using CUDAnative` to get them.
If I understand you right, you want to be able to turn a `println` into a `@cuprintln`? I am afraid that isn't easily overdubbed, since `@cuprintln` is a macro and...
No, the issue is the macro nature of it, so we have to implement the macro again. On Tue, Feb 4, 2020, 21:17 Ali Ramadhan wrote: > Hmmm, would it...
Maybe I am on the wrong branch, but I am getting ``` julia> sol2 = solve(odeoop,GPUSimpleTsit5(),dt=dt) ERROR: MethodError: Cannot `convert` an object of type Nothing to an object of type...
So investigating with Cthulhu: ``` • %108 = invoke overdub(::Cassette.Context{…},::typeof(DiffEqBase.__has_syms){…},::typeof(loop){…})::Any ``` which calls fieldnames, which is not an inferrable function
@jrevels and I tried to fix this a while back, but our fix was too clever for our own good. See https://github.com/jrevels/Cassette.jl/issues/113, https://github.com/jrevels/Cassette.jl/pull/119 and https://github.com/jrevels/Cassette.jl/pull/115. Despite thinking hard we haven't...
```julia
gpuIndex3D() = CartesianIndex(
    blockIdx().z,
    (blockIdx().y - 1) * blockDim().y + threadIdx().y,
    (blockIdx().x - 1) * blockDim().x + threadIdx().x
)

# Calculate the divergence of f at every point and...
```
Something like this for index calc
```
maxThreads = 1024
Nx, Ny, Nz = size(f)
Tx = min(maxThreads, Nx)
Ty = min(fld(maxThreads, Tx), Ny)
Tz = min(fld(maxThreads, (Tx*Ty)), Nz)
Bx,...
```
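To make the index calculation above concrete, here is a minimal CPU-only sketch of how the thread and block shapes could be derived. The thread sizes follow the snippet. The comment is cut off at `Bx,`, so the block counts (rounding up with `cld`) and the helper name `launch_config` are assumptions of this sketch and are not taken from the original.

```julia
# Hypothetical helper (not from the original comment): choose a thread-block
# shape and a grid shape for an Nx × Ny × Nz array.
function launch_config(Nx, Ny, Nz; maxThreads = 1024)
    # Fill the x dimension first, then give leftover threads to y, then z.
    Tx = min(maxThreads, Nx)
    Ty = min(fld(maxThreads, Tx), Ny)
    Tz = min(fld(maxThreads, Tx * Ty), Nz)

    # Assumption: round up so that every grid point is covered by a thread.
    Bx, By, Bz = cld(Nx, Tx), cld(Ny, Ty), cld(Nz, Tz)

    return (threads = (Tx, Ty, Tz), blocks = (Bx, By, Bz))
end

cfg = launch_config(256, 256, 128)
println(cfg)
```

Because the block counts are rounded up, a kernel launched with this shape would still need a bounds check whenever a dimension is not a multiple of its thread size.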