
Fake-gpu support

Open hsyrjaos opened this issue 1 year ago • 2 comments

This adds fake-gpu support to the gpu-device plugin. With this, any cluster running gpu-plugin could be patched to support fake GPUs. It also fixes a bug in pkg/deviceplugin/api.go, where the /sys/dev/char access was hard-coded to /sys. pkg/fakedri/fakedir.go is a copy of cmd/gpu_fakedev/gpu_fakedev.go, modified only to make it usable as a package and to add klog support.

hsyrjaos avatar Sep 20 '24 11:09 hsyrjaos

Fakedev support still uses the same separate sys and dev directories, set by Path (for example /tmp in the examples). If Path is /tmp, then /tmp/sys and /tmp/dev are used. That is the root cause of the issue here: api.go uses the hard-coded path /dev and accesses /dev/char/, where the fake major:minor does not exist, which generates an error. This happened before as well, so this PR will not change that without the change in api.go. But, as said, the error does not really matter, because fakedev user pods normally do not access any actual devices. It is just ugly and confusing output in the plugin log.
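To illustrate the path handling described above, here is a minimal Go sketch. The helper names `fakeRoots` and `charDevPath` and the 226:0 major:minor pair are hypothetical, chosen only for illustration, and are not taken from the plugin. The sketch shows how a single root such as /tmp maps to /tmp/sys and /tmp/dev, and why a lookup rooted at a hard-coded /dev would miss the fake tree.

```go
package main

import (
	"fmt"
	"path"
)

// fakeRoots (hypothetical helper) derives the sysfs and devfs
// directories from one root prefix, e.g. "/tmp" -> "/tmp/sys", "/tmp/dev".
func fakeRoots(root string) (sysfsDir, devfsDir string) {
	return path.Join(root, "sys"), path.Join(root, "dev")
}

// charDevPath (hypothetical helper) builds <devfsDir>/char/<major>:<minor>.
func charDevPath(devfsDir string, major, minor int) string {
	return path.Join(devfsDir, "char", fmt.Sprintf("%d:%d", major, minor))
}

func main() {
	_, devfsDir := fakeRoots("/tmp")

	// Hard-coded root: a fake major:minor is not expected to exist here.
	fmt.Println(charDevPath("/dev", 226, 0))

	// Prefix-aware root: the lookup lands inside the fake tree instead.
	fmt.Println(charDevPath(devfsDir, 226, 0))
}
```

Passing the root in as a parameter, rather than embedding /dev in the helper, is what lets the same lookup work for both real and fake device trees.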


From: Tuomas Katila @.> Sent: Monday, September 30, 2024 8:18 AM To: intel/intel-device-plugins-for-kubernetes @.> Cc: Syrja, Harri @.>; Author @.> Subject: Re: [intel/intel-device-plugins-for-kubernetes] Fake-gpu support (PR #1846)

@tkatila commented on this pull request.


In pkg/deviceplugin/api.go (https://github.com/intel/intel-device-plugins-for-kubernetes/pull/1846#discussion_r1780449181):

    for _, node := range nodes {
            devPaths = append(devPaths, node.HostPath)
    // If devPaths are provided, use them; otherwise, generate from nodes
    if len(devPaths) > 0 && devPaths[0] != "" {
            devPathsComputed = devPaths
    

Previously, fakedev support used separate sysfs and devfs directories. Can those be used again to expose the fake dev devices?

In general, I'm against this particular change as it changes the internal API with no real added functionality.
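For reference, the fallback that the quoted hunk appears to introduce can be sketched in Go as follows. This is a reconstruction under assumptions, not the PR's code: the `node` type is a local stand-in carrying only the HostPath field visible in the hunk, and `resolveDevPaths` is a hypothetical name. The hunk is truncated, so the "otherwise" branch here is inferred from its comment.

```go
package main

import "fmt"

// node is a hypothetical stand-in for the plugin's device node type;
// only the HostPath field appears in the quoted hunk.
type node struct {
	HostPath string
}

// resolveDevPaths (hypothetical name) returns the caller-provided paths
// when the first entry is non-empty; otherwise it collects the HostPath
// of every node, as the removed loop in the hunk did.
func resolveDevPaths(devPaths []string, nodes []node) []string {
	if len(devPaths) > 0 && devPaths[0] != "" {
		return devPaths
	}

	computed := make([]string, 0, len(nodes))
	for _, n := range nodes {
		computed = append(computed, n.HostPath)
	}

	return computed
}

func main() {
	nodes := []node{{HostPath: "/dev/dri/card0"}, {HostPath: "/dev/dri/renderD128"}}

	// No explicit paths: fall back to the nodes' host paths.
	fmt.Println(resolveDevPaths(nil, nodes))

	// Explicit paths take precedence over the nodes.
	fmt.Println(resolveDevPaths([]string{"/tmp/dev/dri/card0"}, nodes))
}
```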

hsyrjaos avatar Sep 30 '24 09:09 hsyrjaos

I'd check the code one more time if I were you. The functions that generate the device spec files should take into account the prefix. So if the fake devices/files are properly created they should be used without issues.
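As a rough sketch of what "taking the prefix into account" could look like: the `deviceSpec` type below is a local stand-in whose fields mirror the kubelet device plugin API's DeviceSpec, while `specFor` and its `prefix` parameter are hypothetical names, not the plugin's actual functions. The assumption made here is that the host path points into the fake tree while the container still sees the plain /dev path.

```go
package main

import (
	"fmt"
	"path"
)

// deviceSpec is a local stand-in; its fields mirror the kubelet
// device plugin API's DeviceSpec message.
type deviceSpec struct {
	HostPath      string
	ContainerPath string
	Permissions   string
}

// specFor (hypothetical helper) builds a spec whose host path is rooted
// at prefix, while the container path stays unprefixed (an assumption).
func specFor(prefix, devPath string) deviceSpec {
	return deviceSpec{
		HostPath:      path.Join(prefix, devPath),
		ContainerPath: devPath,
		Permissions:   "rw",
	}
}

func main() {
	// Empty prefix: host and container paths are identical.
	fmt.Printf("%+v\n", specFor("", "/dev/dri/card0"))

	// Fake root: only the host path moves under /tmp.
	fmt.Printf("%+v\n", specFor("/tmp", "/dev/dri/card0"))
}
```

With this shape, fake devices created under the prefix would be picked up without any change to the spec-generating call sites, which is the point the comment above makes.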

tkatila avatar Sep 30 '24 10:09 tkatila

to be re-opened if the work continues

mythi avatar May 20 '25 06:05 mythi