Lawrence Mitchell comments

Results: 226 comments by Lawrence Mitchell

[BUG] Warnings from methods that raise exceptions and are marked nogil.

They're not errors at the moment, so I think pinning would be a backward step (and we'd have to go through the rest of RAPIDS and pin there too...)

[BUG] Warnings from methods that raise exceptions and are marked nogil.

> I don't think this will ever be an error because it's perfectly valid code. It's just telling you that it might be slower than you anticipated because of acquiring...

[FEA] Python bindings for `rmm::mr::pinned_host_memory_resource`

> Hey @harrism, can you direct me on how to debug [this](https://gist.github.com/willtryagain/65d36b9b52ccb491bdba1caae1e3f0d0) error please? I got this when I ran `cmake .. -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX....

[FEA] Python bindings for `rmm::mr::pinned_host_memory_resource`

OK, thanks. I have a CTK 12.3 install here, so maybe I can reproduce.

[FEA] Python bindings for `rmm::mr::pinned_host_memory_resource`

Hmm, I could not reproduce. I did the following (I used mamba, not conda, but everything else is the same). On ce3af2c46b8b: ``` git clean -fdx # there's no magic behind the...

[FEA] Python bindings for `rmm::mr::pinned_host_memory_resource`

Hmm, that is identical to mine, so I am somewhat at a loss as to what broke.

[BUG] Maximum pool size exceeded when using ManagedMemory

How much host RAM is available on this system? The CPU log peaks at around 360GiB host RAM usage. Could it be that you're running out of both host and...

[BUG] Maximum pool size exceeded when using ManagedMemory

> I requested 400GB ram in slurm when submitting this job. That _might_ have been your problem (depending on how slurm manages these allocations). It could be that you got...

[BUG] Maximum pool size exceeded when using ManagedMemory

Hmm. Nothing obviously looks bad there, but if the GPU (and host) memory usage is always increasing this is _either_ because the RMM pool is so fragmented that you can...

[BUG] Maximum pool size exceeded when using ManagedMemory

> A quick look at the attached CSV log, sorting by size, all large allocations seem to have matching frees. Looking at them in order, the large allocations (~21GiB) occur...