Vyas Ramasubramani

Results 899 comments of Vyas Ramasubramani

OK, after some further research it looks like this cannot be done until CUDA 12.6. Pinned memory pools and asynchronous allocation were not supported until then. The pinned memory type...

Whoops, my previous message was supposed to include links but I didn't copy them in. Yes, I'm basically just proposing that we do a [`cudaMemPoolCreate`](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY__POOLS.html#group__CUDART__MEMORY__POOLS_1g8158cc4b2c0d2c2c771f9d1af3cf386e) with a memory type of...
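
Roughly, the shape of what I'm proposing is the sketch below (CUDA 12.6+ assumed; the location type and NUMA node id are placeholders, since the exact memory type isn't spelled out above):

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main() {
  // Describe a pool of pinned *host* memory rather than device memory.
  cudaMemPoolProps props{};
  props.allocType     = cudaMemAllocationTypePinned;
  props.handleTypes   = cudaMemHandleTypeNone;
  props.location.type = cudaMemLocationTypeHostNuma;  // placeholder memory type
  props.location.id   = 0;                            // NUMA node 0 (placeholder)

  cudaMemPool_t pool{};
  if (cudaMemPoolCreate(&pool, &props) != cudaSuccess) {
    std::fprintf(stderr, "pinned memory pools not supported by this driver/toolkit\n");
    return 1;
  }

  cudaStream_t stream{};
  cudaStreamCreate(&stream);

  // Stream-ordered (asynchronous) allocation and free out of the pinned pool.
  void* ptr = nullptr;
  cudaMallocFromPoolAsync(&ptr, 1 << 20, pool, stream);
  cudaFreeAsync(ptr, stream);
  cudaStreamSynchronize(stream);

  cudaStreamDestroy(stream);
  cudaMemPoolDestroy(pool);
  return 0;
}
```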

At a high level my suspicion is that we are explicitly doing things that Cython is not designed to support by creating trampolines that do not reflect the original layout,...

I think you should be able to reproduce this issue by building a project that uses cuda-python's Cython bindings via the new API (so `from cuda.bindings cimport ...`) and then at...

I'm a little hesitant to put this level of information into rmm, especially the application design piece of it. That feels better suited to CUDA or CCCL docs since using...

I agree with recommending the async resource as a default. The main exception is deciding how to talk about managed memory for larger-than-memory workloads.

> But first, a quick question:...
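
To make that concrete, here is a minimal sketch of the async-as-default setup against librmm's C++ API (the managed-memory alternative for larger-than-memory workloads is left commented out):

```cpp
#include <rmm/mr/device/cuda_async_memory_resource.hpp>
#include <rmm/mr/device/managed_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>

int main() {
  // Recommended default: the stream-ordered resource backed by cudaMallocAsync.
  rmm::mr::cuda_async_memory_resource async_mr;
  rmm::mr::set_current_device_resource(&async_mr);

  // The exception discussed above: for larger-than-memory workloads, managed
  // (unified) memory can oversubscribe the GPU by paging to host memory.
  // rmm::mr::managed_memory_resource managed_mr;
  // rmm::mr::set_current_device_resource(&managed_mr);

  // ... anything allocating through RMM's current device resource now goes
  // through the async resource.
  return 0;
}
```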

> why do we need the RMM abstraction if the CUDA driver already handles memory efficiently

The rmm abstraction predates the CUDA driver functionality (and certainly predates it being broadly available) 🙂...

FWIW [here's my proposal for cudf](https://github.com/rapidsai/cudf/issues/17626).

@leofang could you please give me an example of what you're referring to?

Couldn't cuda-python unconditionally load both sets of symbols and then dispatch appropriately at runtime based on which one we wanted?
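
Something along these lines, in other words (placeholder library and symbol names; this only illustrates "load both, dispatch at runtime", not cuda-python's actual loader):

```cpp
#include <dlfcn.h>
#include <cstdio>

// Stand-in signature for whichever entry point is being dispatched.
using cuInit_t = int (*)(unsigned int);

int main(int argc, char**) {
  // Unconditionally load both candidate symbol sets up front.
  // (The two SONAMEs here are placeholders for the two sets in question.)
  void* set_a = dlopen("libcuda.so.1", RTLD_NOW | RTLD_LOCAL);
  void* set_b = dlopen("libcuda.so",   RTLD_NOW | RTLD_LOCAL);

  // Resolve the same entry point from each handle.
  auto init_a = set_a ? reinterpret_cast<cuInit_t>(dlsym(set_a, "cuInit")) : nullptr;
  auto init_b = set_b ? reinterpret_cast<cuInit_t>(dlsym(set_b, "cuInit")) : nullptr;

  // Dispatch at runtime based on which set was requested.
  cuInit_t cu_init = (argc > 1) ? init_a : init_b;
  if (!cu_init) {
    std::fprintf(stderr, "requested symbol set is unavailable\n");
    return 1;
  }
  std::printf("cuInit -> %d\n", cu_init(0));
  return 0;
}
```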