Alessandro Bellina
@Artemy-Mellanox I believe you may be familiar with devx, so pinging you here.
@yosefe would it be supported to have both running MOFED 4? In this scenario, would the devx error be related?
@zengruios I am looking into this, trying to reproduce the issue on my side. I'll get back to you soon. Thanks for this report!
@zengruios I reproduced the issue you reported. This is happening because the `ShuffleBufferCatalog` is using a different block manager than the `RapidsDiskStore`, so this line fails to delete (because the...
I assume this issue is fixed now (with https://github.com/NVIDIA/spark-rapids/pull/6318). Please let us know if you have further questions, @zengruios.
I believe we should include https://github.com/NVIDIA/spark-rapids/issues/6095 in this overall ticket as a blocker.
I am unclear on whether https://github.com/NVIDIA/spark-rapids/pull/5989 is also a requirement for this to be closed.
Discussed this with @revans2 and @jlowe today. @revans2 proposed an idea that would add callbacks, likely in `DeviceMemoryBuffer`, that could be used by the spill framework to register a...
I am interested in picking this up after my current tasks as this is related to the "maximum live memory" question we are trying to answer with changes to cuDF...
@lagarantie a couple of things would help (if you can provide them): 1) The `.explain` output of `d.cache` (`d.cache.explain`). It should give us the full DAG to look at. 2) The...