Nic Eggert comments

Results: 29 comments by Nic Eggert

Add kerberos changes for secure hadoop access

Any reason this PR was never merged?

[BUG] LocalCUDACluster doesn't work with NVIDIA MIG

We're still seeing this issue when running the latest Merlin image (`nvcr.io/nvidia/merlin/merlin-pytorch:22.06`), which includes CUDA 11.7, `dask-cuda==22.04`, and `pynvml==11.4.1`. Happens on both driver `515.48.07` and `510.47.03` if that makes any...

[BUG] LocalCUDACluster doesn't work with NVIDIA MIG

That didn't work, but setting the environment variable `export DASK_DISTRIBUTED__DIAGNOSTICS__NVML=False` did. Thanks for pointing me in the right direction.
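For readers unfamiliar with how that variable reaches Dask: Dask's documented convention is that environment variables prefixed with `DASK_` are lowercased, double underscores become nesting, and the value is parsed as a Python literal where possible. The sketch below only mimics that convention with the standard library; `dask_env_to_config` is a hypothetical helper written for illustration, not part of Dask, and it assumes the variable is set before `dask` or `distributed` is imported in the process that starts the cluster.

```python
import ast
import os


def dask_env_to_config(name: str, value: str):
    """Hypothetical helper (not part of Dask): mimic the documented mapping
    from a DASK_* environment variable to a dotted config key and value."""
    prefix = "DASK_"
    if not name.startswith(prefix):
        raise ValueError(f"{name!r} does not start with {prefix!r}")
    # Strip the prefix, lowercase, and turn double underscores into nesting.
    key = name[len(prefix):].lower().replace("__", ".")
    try:
        # "False" becomes the boolean False, "3" becomes the integer 3, etc.
        parsed = ast.literal_eval(value)
    except (SyntaxError, ValueError):
        # Anything that is not a Python literal stays a plain string.
        parsed = value
    return key, parsed


# Python equivalent of the `export` in the comment above. Assumption: this
# runs before dask/distributed is imported, so the setting is picked up
# when the configuration is first loaded.
os.environ["DASK_DISTRIBUTED__DIAGNOSTICS__NVML"] = "False"

key, parsed = dask_env_to_config(
    "DASK_DISTRIBUTED__DIAGNOSTICS__NVML",
    os.environ["DASK_DISTRIBUTED__DIAGNOSTICS__NVML"],
)
print(key, parsed)  # distributed.diagnostics.nvml False
```

In other words, the variable corresponds to the `distributed.diagnostics.nvml` configuration key being set to `False`.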

DCGM_FI_PROF_GR_ENGINE_ACTIVE and MIG

Here are examples for 2g.20gb and 3g.40gb instances: ``` nvidia-smi Mon Dec 4 21:04:13 2023 +-----------------------------------------------------------------------------+ | NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0 | |-------------------------------+----------------------+----------------------+ | GPU Name...