Marius Millea

159 comments by Marius Millea

This works but I want to play around with more CUDA functionality to make sure enough stuff works. Any early feedback you may have is welcome too.

A decent bit of things work (broadcast, matmul, etc.), although we currently have:

```julia
_, f = preallocate() do
    x = CUDA.zeros(10)
    fft(x)
end
CUDA.@allocated f() # ERROR: CUDA error: invalid...
```

Thinking about working on a PR for this. Any reason not to have them be in the same record and just change:

```julia
struct AllocationRecord
```

Doesn't the overdub in the replay context assert the right type anyway? https://github.com/oxinabox/AutoPreallocation.jl/blob/1013ac618749ac3bd3534162cb6c176e4c6d19bf/src/replaying.jl#L37 (the `CuArray` version of this would similarly assert `CuArray{T,N}`)

It looks like removing `Tuple` from the blacklist fixes it: https://github.com/oxinabox/AutoPreallocation.jl/blob/1013ac618749ac3bd3534162cb6c176e4c6d19bf/src/inference_fixes.jl#L16 Not sure how bad the side-effects of this are? I'm not seeing any new test failures when I remove...

Is there a reason why Pluto's workers need to be processes and can't just be threads? (Two other upsides of threads would be reduced memory usage and the ability to...

> CuArrays broadcasting uses ForwardDiff

Yea, I think that's the fundamental difference.

> Happens with `sum` too

~This actually is complex though, right? So that at least isn't a bug?~...

Thanks, yea this sounds like it could be the cause. I tried your suggestion and it deadlocks. My guess is you have to do something like [here](https://docs.julialang.org/en/v1/manual/multi-threading/#Safe-use-of-Finalizers). Playing with it,...

Thanks, I need to think about this and read that code, but one thing I don't totally follow is that in the solution in my PR, the PyObject finalizer never tries to...

Thanks, didn't know that. I don't know if there's the bandwidth for it, but it would be great to have this back as an option. Kind of like `PYCALL_JL_RUNTIME_PYTHON` but which...