Evan Lezar
Just a note: If we assume CDI support in the OCI mode, using a CDI spec generated for the Intel devices would allow injection of these. See #813
I would strongly recommend following the CDI route here instead of relying on vendor-specific logic in Singularity. If effort is to be spent, I would recommend adding (experimental) CDI support...
> This is a first pass of this idea, but I am open to suggestions on how to split this effort

Would splitting out the GFD documentation to separate folder...
@xhejtman could you provide the logs from the device plugin?
Looking at the logs, we're only starting 2 GRPC servers: ``` 2023-12-18T12:40:20.600590354+01:00 stderr F I1218 11:40:20.600444 1 server.go:165] Starting GRPC server for 'nvidia.com/mig-1g.10gb' 2023-12-18T12:40:20.601080041+01:00 stderr F I1218 11:40:20.600967 1 server.go:117]...
The plugin logs requested in https://github.com/NVIDIA/k8s-device-plugin/issues/348#issuecomment-1369699003 were never supplied. @xlcbingo1999 if you are seeing similar behaviour, please provide a description of your setup as well as the plugin logs.
Looking at the `Filter()` call that is returning all devices when `required` is empty, this should only happen if `required` is `nil`: ``` // Filter filters out the selected devices...
Assuming we could consider `required == nil` equivalent to `required == []string{}` we could apply the following diff: ``` diff --git a/internal/rm/nvml_manager.go b/internal/rm/nvml_manager.go index 56f05429..a00d41cf 100644 --- a/internal/rm/nvml_manager.go +++ b/internal/rm/nvml_manager.go...
Thanks for digging that up. I don't quite recall why I had that `nil` check present there. Looking at the existing code, it obviously doesn't make sense. I have created...
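For readers following the `nil` discussion above, here is a minimal Go sketch of why `required == nil` and `len(required) == 0` can behave differently. It is not the device plugin's actual `Filter()` implementation: the helper names `filterNilCheck`, `filterLenCheck`, and `intersect`, and the device IDs, are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// filterNilCheck is a hypothetical helper (NOT the plugin's Filter()).
// It skips filtering only when required is nil, so an empty but
// non-nil slice is treated as "match nothing".
func filterNilCheck(devices, required []string) []string {
	if required == nil {
		return devices
	}
	return intersect(devices, required)
}

// filterLenCheck is the same hypothetical helper with a len() check,
// which treats a nil slice and an empty slice identically.
func filterLenCheck(devices, required []string) []string {
	if len(required) == 0 {
		return devices
	}
	return intersect(devices, required)
}

// intersect returns the entries of devices that also appear in required.
func intersect(devices, required []string) []string {
	want := make(map[string]bool, len(required))
	for _, id := range required {
		want[id] = true
	}
	var out []string
	for _, d := range devices {
		if want[d] {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	devices := []string{"gpu-0", "gpu-1"} // made-up device IDs

	fmt.Println(len(filterNilCheck(devices, nil)))        // 2
	fmt.Println(len(filterNilCheck(devices, []string{}))) // 0
	fmt.Println(len(filterLenCheck(devices, nil)))        // 2
	fmt.Println(len(filterLenCheck(devices, []string{}))) // 2
}
```

In Go, `len` of a nil slice is 0, so the `len()` form is the usual way to treat "no value" and "empty value" the same. Whether that equivalence is the desired behaviour here depends on what the callers pass in.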
@sazo with the release of 0.13.0 of the device plugin we have much of the work in place to make progress on this. We also added logging around the events...