Valentin Churavy

Results: 1413 comments by Valentin Churavy

Thanks for bringing this up, the goal in #55 was indeed to match Clang (we were hunting down a performance gap). I agree that the fact that we use contract...

We should probably define `threadIdx` etc. in GPUifyLoops; users currently still have to do `using CUDAnative` manually to get them.

If I understand you right, you want to be able to turn a `println` into a `@cuprintln`? I am afraid that isn't easily overdubbed, since `@cuprintln` is a macro and...

No, the issue is the macro nature of it, so we would have to implement the macro again. On Tue, Feb 4, 2020, 21:17 Ali Ramadhan wrote: > Hmmm, would it...
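
To illustrate why the "macro nature" matters, here is a minimal sketch in plain Julia. It does not use Cassette or `@cuprintln`; `@show_source` and `show_value` are hypothetical names made up for this example. The point it tries to show is that a macro receives its arguments as syntax at expansion time, whereas a function (which is what call interception can substitute) only ever sees evaluated values.

```julia
# Hypothetical stand-in for a printing macro; it is NOT `@cuprintln`.
# Like any macro, it receives its argument as an expression before any code runs.
macro show_source(ex)
    # Capture the source text at expansion time, then evaluate the expression.
    return :(string($(string(ex)), " = ", $(esc(ex))))
end

# A function only ever sees the evaluated value, never the expression.
show_value(x) = string(x)

a = 2
from_macro = @show_source a + 1     # "a + 1 = 3"
from_function = show_value(a + 1)   # "3"
```

By the time a function call could be rerouted, the expression `a + 1` has already been reduced to a value, so anything the macro did with the syntax would presumably have to be reimplemented at the function level.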

Maybe I am on the wrong branch, but I am getting ``` julia> sol2 = solve(odeoop,GPUSimpleTsit5(),dt=dt) ERROR: MethodError: Cannot `convert` an object of type Nothing to an object of type...

So investigating with Cthulhu: ``` • %108 = invoke overdub(::Cassette.Context{…},::typeof(DiffEqBase.__has_syms){…},::typeof(loop){…})::Any ``` This calls `fieldnames`, which is not an inferrable function.
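
A `::Any` return type like the one above can also be checked without Cthulhu. The sketch below uses `Base.return_types` on toy functions; `inferred_type`, `stable`, `unstable` and `names_of` are hypothetical names for this example, and how well `fieldnames` itself infers may differ between Julia versions, so no result is claimed for it here.

```julia
# Hypothetical helper: the return type inference derives for `f` on `argtypes`.
inferred_type(f, argtypes) = first(Base.return_types(f, argtypes))

# Always returns an Int, so inference yields a concrete type.
stable(x::Int) = x + 1

# Returns an Int or a String depending on the runtime value.
unstable(x::Int) = x > 0 ? x : "negative"

# Reflection on a type that is only known at run time.
names_of(T::DataType) = fieldnames(T)

inferred_type(stable, (Int,))      # Int64 on a 64-bit system
inferred_type(unstable, (Int,))    # Union{Int64, String}
inferred_type(names_of, (DataType,))  # inspect this one on your Julia version
```

A non-concrete result in a GPU kernel usually means dynamic dispatch, which is why such calls tend to be a problem there.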

@jrevels and I tried to fix this a while back, but our fix was too clever for our own good. See https://github.com/jrevels/Cassette.jl/issues/113, https://github.com/jrevels/Cassette.jl/pull/119 and https://github.com/jrevels/Cassette.jl/pull/115. Despite thinking hard we haven't...

```julia
gpuIndex3D() = CartesianIndex(
    blockIdx().z,
    (blockIdx().y - 1) * blockDim().y + threadIdx().y,
    (blockIdx().x - 1) * blockDim().x + threadIdx().x
)
# Calculate the divergence of f at every point and...
```
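
The per-axis formula `(blockIdx - 1) * blockDim + threadIdx` can be sanity-checked on the CPU. This is a minimal sketch with plain integers standing in for the GPU intrinsics; `global_index` and the sizes are hypothetical values chosen for illustration.

```julia
# Hypothetical CPU stand-in for one axis of the GPU index calculation,
# using 1-based block and thread indices.
global_index(block, blockdim, thread) = (block - 1) * blockdim + thread

blockdim = 4   # threads per block along this axis (assumed value)
nblocks = 3    # blocks along this axis (assumed value)

# Enumerate every (block, thread) pair, block-major, and collect the indices.
indices = [global_index(b, blockdim, t) for b in 1:nblocks for t in 1:blockdim]
```

With these values every index from 1 to 12 is produced exactly once, which is the property the kernel relies on to touch each array element along that axis.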

Something like this for the index calculation:
```
maxThreads = 1024
Nx, Ny, Nz = size(f)
Tx = min(maxThreads, Nx)
Ty = min(fld(maxThreads, Tx), Ny)
Tz = min(fld(maxThreads, (Tx*Ty)), Nz)
Bx,...
```
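
The thread-count logic above can be wrapped in a small function and tried on the CPU. This is a sketch, not the original code: `launch_config` is a hypothetical name, and since the block counts are cut off in the snippet, rounding up with `cld` so that the blocks cover each axis is an assumption.

```julia
# Hypothetical helper, not part of any package.
function launch_config(dims::NTuple{3,Int}; max_threads::Int = 1024)
    Nx, Ny, Nz = dims
    # Fill x first, then give the remaining thread budget to y, then z.
    Tx = min(max_threads, Nx)
    Ty = min(fld(max_threads, Tx), Ny)
    Tz = min(fld(max_threads, Tx * Ty), Nz)
    # Assumption: enough blocks to cover each axis, rounding up.
    Bx, By, Bz = cld(Nx, Tx), cld(Ny, Ty), cld(Nz, Tz)
    return (threads = (Tx, Ty, Tz), blocks = (Bx, By, Bz))
end

cfg = launch_config((100, 100, 100))
# cfg.threads == (100, 10, 1)
# cfg.blocks  == (1, 10, 100)
```

Because the block counts round up, a kernel launched this way would still need a bounds check whenever an axis length is not a multiple of its thread count.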