Some low-level errors (like `pynvml.nvml.NVMLError_LibRmVersionMismatch`) result in nothing printed (std or diagnostic)
Describe the bug
Something caused a version mismatch somewhere and I can no longer use gpustat. Nothing at all is printed on stdout or stderr, and running with `--debug` prints nothing either. I launched it as `python -m pdb -m gpustat` and stepped through until I noticed an error raised in:
`/opt/conda/lib/python3.8/site-packages/pynvml/nvml.py(718)`
of type `pynvml.nvml.NVMLError_LibRmVersionMismatch`.
Environment information:
- OS: Ubuntu 20.04
- NVIDIA Driver version: 510.73.08
- The name(s) of GPU card: Tesla V100-SXM2
- gpustat version: 1.0.0
- pynvml version: 11.495.46
Can you please provide a full stacktrace from `gpustat --debug` (or with pdb)? On your side nothing is printed, right? I'd like to know which `nvml...` call throws the error.
In pdb you can run `bt` at the `(Pdb)` prompt to obtain the full stacktrace in post-mortem mode.
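If stepping through pdb is inconvenient, a small standalone script might also help narrow down which NVML call raises. The following is only a sketch: the helper names `call_and_report` and `probe_nvml` are made up for illustration and are not part of gpustat or pynvml, and the two calls probed after `nvmlInit` are an arbitrary choice, not necessarily the ones gpustat makes.

```python
import sys
import traceback


def call_and_report(label, fn, *args):
    """Call fn(*args); on failure, print the full traceback to stderr.

    Returns (True, result) on success or (False, exception) on failure.
    Hypothetical helper, not part of gpustat or pynvml.
    """
    try:
        return True, fn(*args)
    except Exception as exc:  # pynvml's NVMLError derives from Exception
        print(f"[{label}] failed with {type(exc).__name__}: {exc}", file=sys.stderr)
        traceback.print_exc(file=sys.stderr)
        return False, exc


def probe_nvml():
    """Run a few basic NVML calls one at a time and record which ones fail."""
    try:
        import pynvml
    except ImportError as exc:
        print(f"pynvml is not importable: {exc}", file=sys.stderr)
        return []

    results = []
    ok, _ = call_and_report("nvmlInit", pynvml.nvmlInit)
    results.append(("nvmlInit", ok))
    if not ok:
        # Without a successful init, later calls would fail as well.
        return results

    try:
        probes = [
            ("nvmlSystemGetDriverVersion", pynvml.nvmlSystemGetDriverVersion),
            ("nvmlDeviceGetCount", pynvml.nvmlDeviceGetCount),
        ]
        for label, fn in probes:
            ok, _ = call_and_report(label, fn)
            results.append((label, ok))
    finally:
        pynvml.nvmlShutdown()
    return results


if __name__ == "__main__":
    for name, ok in probe_nvml():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

Running this with the same Python interpreter that gpustat uses should print a traceback on stderr for the first failing call, which can then be pasted into the issue.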