Stan Seibert
Yes, all of these methods should accept Numba device allocations. What error did you see?
I assume DeviceNDArrays can only be C-ordered at the moment? Otherwise, option 2 seems best.
However, option 1 might be a good interim solution, since Numba's release cycle might be too long to wait to fix this.
Note that we're encouraging people to migrate to [CuPy](https://cupy.chainer.org/), which provides wrappers around the same CUDA libraries that pyculib provided, but with a nicer NumPy interface.
I'm not sure what vqgan is, but note that the pyculib project isn't getting updates anymore. (We strongly encourage folks to check out CuPy.)
I like the way this is looking. One question: How is the launch configuration (# of blocks and threads per block) of the kernel selected?
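For context on the launch-configuration question, here is a minimal sketch of one common convention: fix the threads per block and round the block count up so every element is covered. The function name `launch_config` and the default of 128 threads are assumptions for illustration, not how the kernel under discussion actually picks its configuration.

```python
def launch_config(n_elements, threads_per_block=128):
    """Return (blocks, threads_per_block) covering n_elements.

    Hypothetical helper: ceiling division guarantees that
    blocks * threads_per_block >= n_elements.
    """
    if n_elements < 1 or threads_per_block < 1:
        raise ValueError("n_elements and threads_per_block must be positive")
    # Integer ceiling division avoids floating-point rounding.
    blocks = (n_elements + threads_per_block - 1) // threads_per_block
    return blocks, threads_per_block
```

With this convention the last block usually has surplus threads, so the kernel body still needs a bounds check on its global index.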
We should make caching fail with a printed warning and disable itself when a suitable location cannot be found to write the cache. It isn't an essential operation, and a...
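A rough sketch of the behavior proposed above, in plain Python: probe a list of candidate directories, and if none is writable, emit a warning and turn caching off rather than raising. The `DiskCache` class and its methods are hypothetical and are not Numba's caching implementation.

```python
import os
import tempfile
import warnings


class DiskCache:
    """Hypothetical cache that disables itself when no directory is writable."""

    def __init__(self, candidate_dirs):
        self.directory = None
        for path in candidate_dirs:
            if self._is_writable(path):
                self.directory = path
                break
        self.enabled = self.directory is not None
        if not self.enabled:
            # Caching is optional, so warn and carry on instead of failing.
            warnings.warn(
                "No writable cache location found; caching is disabled.",
                RuntimeWarning,
            )

    @staticmethod
    def _is_writable(path):
        # Probe by creating the directory and a throwaway file inside it.
        try:
            os.makedirs(path, exist_ok=True)
            with tempfile.TemporaryFile(dir=path):
                pass
            return True
        except OSError:
            return False

    def save(self, key, data):
        """Write bytes under key; return False if caching is disabled."""
        if not self.enabled:
            return False
        with open(os.path.join(self.directory, key), "wb") as f:
            f.write(data)
        return True

    def load(self, key):
        """Return cached bytes, or None on a miss or when disabled."""
        if not self.enabled:
            return None
        try:
            with open(os.path.join(self.directory, key), "rb") as f:
                return f.read()
        except OSError:
            return None
```

Callers treat a `False` from `save` or a `None` from `load` as a cache miss and recompile, so the program keeps working without a cache.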
We do support object-mode ufuncs now, though you don't get any speed benefit from them.
I think I'm confused about what transformation you want to do with Numba. Are you trying to switch time units for a datetime64 inside a Numba function?
OK, I think I understand the use case now. Beyond figuring out the segfault, I think this is a manipulation of datetime64 values that we don't yet support, but could....