James Schloss

Results: 253 comments by James Schloss

After messing around with this some more, I don't really think the first 2 errors I listed are all that important. `sqrt` works on `main`, and the second error does...

Addressing the `ERROR: CUDA error: operation not supported on global/shared address space (code 717, ERROR_INVALID_ADDRESS_SPACE)` issue... As far as I can tell, the problem is based on the complexity of...

Still working on that... I'll post it as soon as possible

Ok, it's not perfect, but I stripped the code down a bit to generate a similar (but different) error:

```
using CUDA, Enzyme, LinearAlgebra, StaticArrays

struct Atom
    σ::Float32
    ϵ::Float32
end
...
```

Brief update from last week. As it turns out, I created a MWE for the wrong error. To fix error 700, you need to increase the malloc heap size to...

Well, good news: error 717 actually triggers on the previous MWE once the line from the previous comment is added to it:

```
using CUDA, Enzyme, LinearAlgebra, StaticArrays

CUDA.limit!(CUDA.CU_LIMIT_MALLOC_HEAP_SIZE, 1*1024^3)
#Enzyme.API.printall!(true)
...
```

Right, sorry for the missing information. GPUs I have tested on:

```
Tesla V100-PCIE-16GB # Note: errors sometimes, but I cannot find a MWE for it
Tesla P100-PCIE-16GB
NVIDIA GeForce...
```

Ah, it seems to be the broadcast causing the error. This kernel errors:

```
@noinline function force(c1, c2)
    dr = c2 .- c1
    return dr
end
```

This one does...

Ok, working on it... From what I can tell, it's breaking on `bcsm(a, b)` in `Base.Broadcast`.

Relevant functions in `Base.Broadcast`:

```
broadcast(f::Tf, As...) where {Tf} = materialize(broadcasted(f, As...))
materialize(bc::Broadcasted) =...
```
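
For readers less familiar with how dot syntax reaches those functions, here is a minimal CPU-only sketch of the path that `dr = c2 .- c1` takes through `Base.Broadcast`. It uses plain `Vector{Float32}` inputs rather than the GPU and static-array types from the MWE (an assumption made so it runs with Base alone), so it only illustrates the call chain and does not reproduce the error.

```julia
# Plain-CPU illustration using only Base; no CUDA, Enzyme, or StaticArrays.
# The input values are made up for the example.
c1 = Float32[0, 0, 0]
c2 = Float32[1, 2, 3]

# What the kernel writes: fused dot syntax.
dr_dot = c2 .- c1

# The same computation spelled out in the two steps that dot syntax lowers to.
bc = Base.Broadcast.broadcasted(-, c2, c1)   # lazy Broadcasted object, nothing computed yet
dr_manual = Base.Broadcast.materialize(bc)   # allocates the result and fills it

# The argument axes are combined (and checked for compatibility) before
# the result is filled; this is the shape-checking stage of broadcasting.
ax = Base.Broadcast.combine_axes(c2, c1)

println(dr_dot == dr_manual)
println(ax)
```

Splitting the call this way can help when bisecting a failure, because the lazy `Broadcasted` construction and the `materialize` step can be tested separately.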

Well, I just updated to the new `main` (again), and now things work for the MWE posted above. My current `st`:

```
(@v1.8) pkg> st
Status `~/.julia/environments/v1.8/Project.toml`
  [052768ef] CUDA v3.12.0
  [7da242da]...
```