Valentin Churavy

Results 1413 comments of Valentin Churavy

OK, I just tagged KernelGradients 0.1.1, which supports Enzyme 0.10. Could you try again?

On KernelAbstractions 0.8 the situation should be much improved. Please open another issue if you continue to see these kinds of issues.

I talked briefly with @maleadt, and he reminded me that streams are context-bound and contexts are device-bound, so reusing streams across devices is something not to...

Please go ahead and give it a go!

> This buffer is flushed only for
>
> - the start of a kernel launch
> - synchronization (e.g. cudaDeviceSynchronize())
> - blocking memory copies (e.g. cudaMemcpy(...))
> - module load/unload
> - context...

It is definitely a CUDA thing. The only thing we do is call `vnprintf` on the device; the output is then managed by CUDA. The synchronize mentioned on the...

> I was trying CUDA.sync_threads() in the overdubbed print macro, which I thought would do the same thing?

No, that is still a within-kernel synchronization. `CUDA.synchronize(ev.event)` is an event...
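The distinction between the two kinds of synchronization can be sketched roughly as follows. This is a minimal illustration that uses CUDA.jl directly rather than the KernelAbstractions print macro or event API discussed in the thread; the kernel name `hello_kernel` is hypothetical, and the sketch assumes a recent CUDA.jl where `CUDA.device_synchronize()` is available.

```julia
using CUDA

# Hypothetical kernel: prints from the device, then hits a block-level barrier.
function hello_kernel()
    @cuprintf("hello from thread %ld\n", Int64(threadIdx().x))
    # Device-side barrier: only coordinates the threads of one block.
    # It runs inside the kernel and does not flush the printf buffer.
    sync_threads()
    return nothing
end

if CUDA.functional()
    @cuda threads=4 hello_kernel()
    # Host-side synchronization: waits for the device to finish and is one of
    # the points at which CUDA flushes buffered device-side output.
    CUDA.device_synchronize()
else
    @info "No functional CUDA device found; skipping the kernel launch"
end
```

If the host-side call is left out, the output may only appear at a later flush point, such as the next kernel launch.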

This is very peculiar. Haven't had time to investigate what is happening here.

I have been unable to reproduce this. Can you add some prints around here? https://github.com/JuliaGPU/KernelAbstractions.jl/blob/dd93c7abed46b53d89e8368ca747af8e616c5489/src/backends/cpu.jl#L63 We are just calling `Base.wait` there. Also, which version of Julia are you using?

What is `Base.Threads.nthreads()`?
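One way to gather the requested details (Julia version and thread count) is a short script like the one below. It is only a sketch of what could be pasted into an issue; it uses standard-library calls and nothing specific to KernelAbstractions.

```julia
using InteractiveUtils  # provides versioninfo(); loaded automatically in the REPL

# Julia version as a VersionNumber
println("Julia version: ", VERSION)

# Number of threads this Julia session was started with
println("Threads: ", Threads.nthreads())

# Fuller report: OS, CPU, and relevant environment variables
versioninfo()
```

The thread count is fixed at startup, for example via `julia --threads 4` or the `JULIA_NUM_THREADS` environment variable.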