fn Device mapping to enable GPU access

Nvidia recently released libraries that enable GPU-intensive computation in Docker-enabled environments. That's a great improvement for mathematicians and data scientists. Imagine an Fn cluster that gives users access to a quite expensive GPU.

What Nvidia built is based mainly on device mapping. It's described here: (1) https://www.nvidia.com/object/docker-container.html, and (2) https://github.com/NVIDIA/nvidia-docker/wiki/GPU-isolation-(version-1.0). And here is what people did before: https://hub.docker.com/r/iahmedkaseb/cuda-digits/

Jan 16 '19 13:01 rstyczynski

Device mapping makes it possible to access host-level devices from a container. Imagine that you don't have a good random generator in the container, but a perfect one is available on the host. To plug the host device into the container, do the following:

docker run -it --device /dev/random:/dev/HOST_RANDOM_DEVICE:r ubuntu bash -c "ls -l /dev | grep HOST_RANDOM_DEVICE"
crw-rw---- 1 root root   1, 8 Jan 16 17:14 HOST_RANDOM_DEVICE

As you can see, there is a new device, which would not exist without "--device /dev/random:/dev/HOST_RANDOM_DEVICE:r". Now you can read from this new random generator.

docker run -it --device /dev/random:/dev/HOST_RANDOM_DEVICE:r ubuntu bash -c "cat /dev/HOST_RANDOM_DEVICE | head -1; exit"

That's the simplest demonstration. Because the random device is mapped read-only, you can start multiple containers and read from it at the same time.
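
From inside the container, the mapped device reads like any other file. A minimal Go sketch, assuming the container-side path from the example above; readDeviceHex is a hypothetical helper name, not part of Fn or Docker:

```go
package main

import (
	"encoding/hex"
	"io"
	"os"
)

// readDeviceHex reads n bytes from the character device at path and returns
// them hex-encoded. The path is the container side of the --device mapping,
// e.g. /dev/HOST_RANDOM_DEVICE (an assumption taken from the example above).
func readDeviceHex(path string, n int) (string, error) {
	f, err := os.Open(path) // read-only open, which the ":r" permission allows
	if err != nil {
		return "", err
	}
	defer f.Close()

	buf := make([]byte, n)
	// io.ReadFull keeps reading until buf is full or an error occurs.
	if _, err := io.ReadFull(f, buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}
```

Calling readDeviceHex("/dev/HOST_RANDOM_DEVICE", 16) from a program in the container should return 32 hex characters, or an error if the device was not mapped.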

The former trick for Nvidia GPUs was based on device mapping, as described here: https://hub.docker.com/r/iahmedkaseb/cuda-digits/, and here: https://github.com/NVIDIA/nvidia-docker/wiki/GPU-isolation-(version-1.0). It was done like this:

docker run -it -p <port>:8080 --device /dev/nvidiactl:/dev/nvidiactl --device /dev/nvidia-uvm:/dev/nvidia-uvm --device /dev/nvidia0:/dev/nvidia0 -v <host_dir>:<container_dir> iahmedkaseb/cuda-digits

The new way of doing this is a little different and is based on a "runtime". It's described here: https://github.com/NVIDIA/nvidia-docker/wiki/Usage.

Device mapping is one of many arguments to docker run, alongside mission-critical ones for memory consumption, cgroup privileges, etc. Details are here: https://docs.docker.com/engine/reference/run/ According to the Docker API, device mapping (and every other run-related setting) should be specified in the HostConfig structure. Inside it there is a Devices collection holding a number of Device structures. The API reference at https://docs.docker.com/engine/api/v1.24/ says:

Devices - A list of devices to add to the container specified as a JSON object in the form { "PathOnHost": "/dev/deviceName", "PathInContainer": "/dev/deviceName", "CgroupPermissions": "mrw"}
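
To make that shape concrete, here is a small Go sketch that produces the quoted JSON. The Device and HostConfig types are local stand-ins that carry only the fields quoted above; they are assumptions for illustration, not the types of any Docker client library:

```go
package main

import "encoding/json"

// Device mirrors the JSON object quoted from the Docker API docs.
// Local stand-in type, not taken from a Docker client library.
type Device struct {
	PathOnHost        string
	PathInContainer   string
	CgroupPermissions string
}

// HostConfig is a local stand-in that holds only the Devices field.
type HostConfig struct {
	Devices []Device
}

// hostConfigJSON renders the HostConfig fragment for the given devices.
func hostConfigJSON(devices []Device) (string, error) {
	b, err := json.Marshal(HostConfig{Devices: devices})
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Passing one Device with /dev/video0 on both sides and "rwm" permissions yields a single-element "Devices" array in the form the API docs describe.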

I've tried to dig into the Fn code and have probably located the place where the container is started (docker_client.go:410); however, it looks like the whole HostConfig structure is just empty.

func (d *dockerWrap) StartContainerWithContext(id string, hostConfig *docker.HostConfig, ctx context.Context) (err error) {
	ctx, closer := makeTracker(ctx, "docker_start_container")
	defer closer()

	ctx, _ = common.LoggerWithFields(ctx, logrus.Fields{"docker_cmd": "StartContainer"})
	err = d.retry(ctx, func() error {
		err = d.docker.StartContainerWithContext(id, hostConfig, ctx)
		if _, ok := err.(*docker.NoSuchContainer); ok {
			// for some reason create will sometimes return successfully then say no such container here. wtf. so just retry like normal
			return temp(err)
		}
		return err
	})
	return err
}

It looks like the above is invoked from docker.go, where HostConfig is nil.

func (drv *DockerDriver) startTask(ctx context.Context, container string) error {
	log := common.Logger(ctx)
	log.WithFields(logrus.Fields{"container": container}).Debug("Starting container execution")
	err := drv.docker.StartContainerWithContext(container, nil, ctx)
	if err != nil {
		dockerErr, ok := err.(*docker.Error)
		_, containerAlreadyRunning := err.(*docker.ContainerAlreadyRunning)
		if containerAlreadyRunning || (ok && dockerErr.Status == 304) {
			// 304=container already started -- so we can ignore error
		} else {
			return err
		}
	}
	return err
}

Adding device mapping at this stage is quite important for me, as I'm trying to demonstrate Fn as a layer that provides remote access to a quite expensive GPU platform. I'm more than interested in implementing this myself; however, my first try doesn't seem to work properly. Here is my naive definition of Devices holding one Device mapping:

hostConfig = &docker.HostConfig{
	Devices: []docker.Device{
		docker.Device{
			PathOnHost:        "/dev/video0",
			PathInContainer:   "/dev/video0",
			CgroupPermissions: "rwm",
		},
	},
}

I've added the above to StartContainerWithContext (docker_client.go:410):

func (d *dockerWrap) StartContainerWithContext(id string, hostConfig *docker.HostConfig, ctx context.Context) (err error) {
	ctx, closer := makeTracker(ctx, "docker_start_container")
	defer closer()

	hostConfig = &docker.HostConfig{
		Devices: []docker.Device{
			docker.Device{
				PathOnHost:        "/dev/video0",
				PathInContainer:   "/dev/video0",
				CgroupPermissions: "rwm",
			},
		},
	}

	ctx, _ = common.LoggerWithFields(ctx, logrus.Fields{"docker_cmd": "StartContainer"})
	err = d.retry(ctx, func() error {
		err = d.docker.StartContainerWithContext(id, hostConfig, ctx)
		if _, ok := err.(*docker.NoSuchContainer); ok {
			// for some reason create will sometimes return successfully then say no such container here. wtf. so just retry like normal
			return temp(err)
		}
		return err
	})
	return err
}

I understand it's a wider issue around HostConfig; however, at this stage I'm more than interested in having just device mapping. Let me know if the above code is correct. Unfortunately, Go is quite a new environment for me.

Jan 16 '19 18:01 rstyczynski

That seems to be the right place for the device mapping. To make device mapping flexible, I recommend making it configurable, and there are a couple of options:

1. An env var (FN_DOCKER_DEVICE_MAPPINGS, for instance) with the structure $PathOnHost:$PathInContainer:$CgroupPermissions,... So to add the GPU device mapping for Fn I'd set the env var to /dev/video0:/dev/video0:rwm, and to add more device mappings I'd set it to /dev/video0:/dev/video0:rwm,/dev/random:/dev/HOST_RANDOM:r
2. A device mapping config file whose path is defined by an env var (FN_DOCKER_DEVICE_MAPPING_CONFIG). The config file would be a JSON file with the mappings:

[
    {
        "PathOnHost": "...",
        "PathInContainer": "....",
        "CgroupPermissions": "..."
    }
]

This is actually more flexible and simple to process, since you just unmarshal the JSON from a file:

dmc_file, err := os.Open(os.Getenv(`FN_DOCKER_DEVICE_MAPPING_CONFIG`))
// process error
var dms []docker.Device
err = json.NewDecoder(dmc_file).Decode(&dms)
// process error

hostConfig= &docker.HostConfig{
    Devices: &dms,
}

Personally I prefer the 2nd option, because if people need to add more options to each device mapping they would have no reason to modify Fn's code and could go straight to editing the device mapping config.
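
For comparison, the env var format from the first option could be parsed roughly as follows. This is only a sketch: parseDeviceMappings is a hypothetical helper, and Device is a local stand-in with the same three fields as the Docker API object, not the go-dockerclient type.

```go
package main

import (
	"fmt"
	"strings"
)

// Device is a local stand-in with the three fields of a device mapping.
type Device struct {
	PathOnHost        string
	PathInContainer   string
	CgroupPermissions string
}

// parseDeviceMappings parses a comma-separated list of
// "PathOnHost:PathInContainer:CgroupPermissions" entries.
func parseDeviceMappings(spec string) ([]Device, error) {
	var devices []Device
	for _, entry := range strings.Split(spec, ",") {
		entry = strings.TrimSpace(entry)
		if entry == "" {
			continue // tolerate an unset variable and trailing commas
		}
		parts := strings.Split(entry, ":")
		if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
			return nil, fmt.Errorf("device mapping %q: want host:container:permissions", entry)
		}
		devices = append(devices, Device{
			PathOnHost:        parts[0],
			PathInContainer:   parts[1],
			CgroupPermissions: parts[2],
		})
	}
	return devices, nil
}
```

The trade-off is visible here: every new per-device option would need a change to this format and parser, while the JSON file only needs a new field.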

@rdallman @skinowski thoughts?

Jan 16 '19 18:01 denismakogon

It's perfect. Will try it tomorrow.

Jan 16 '19 19:01 rstyczynski

The 2nd option will work. I wasn't sure whether the docker.Device structure is tagged properly, but it is: https://github.com/fsouza/go-dockerclient/blob/07f79529d302a194a67d21f98bdd8f4725d24c4a/container.go#L705

Jan 16 '19 19:01 denismakogon

Hi, I'm trying to use the code you posted:

dmc_file, err := os.Open(os.Getenv(`FN_DOCKER_DEVICE_MAPPING_CONFIG`))
// process error
var dms []docker.Device
err = json.NewDecoder(dmc_file).Decode(&dms)
// process error

hostConfig= &docker.HostConfig{
    Devices: &dms,
}

And I'm getting a compiler error on the line

hostConfig= &docker.HostConfig{
    Devices: &dms,
}

Error:

"cannot use &dms (type *[]docker.Device) as type []docker.Device in field value"

I'm quite new to Go and suspect there is some pointer trick here. Could you help?

Jan 17 '19 09:01 niktaken

hostConfig= &docker.HostConfig{
    Devices: dms,
}
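
The compiler error comes from the field type: Devices is a slice, while &dms is a pointer to a slice. The JSON decoding call still needs the pointer so it can fill the slice. Below is a self-contained sketch of the same flow with error handling. It assumes local stand-in types instead of go-dockerclient's docker.Device and docker.HostConfig, and parseDeviceConfig and loadHostConfig are hypothetical helper names:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Device and HostConfig are local stand-ins for the go-dockerclient types.
type Device struct {
	PathOnHost        string
	PathInContainer   string
	CgroupPermissions string
}

type HostConfig struct {
	Devices []Device // a plain slice, not a pointer to a slice
}

// parseDeviceConfig decodes a JSON array of device mappings.
func parseDeviceConfig(data []byte) ([]Device, error) {
	var dms []Device
	// Unmarshal takes &dms because it has to write into the slice.
	if err := json.Unmarshal(data, &dms); err != nil {
		return nil, fmt.Errorf("decode device mapping config: %w", err)
	}
	return dms, nil
}

// loadHostConfig reads the config file at path and builds a HostConfig.
func loadHostConfig(path string) (*HostConfig, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read device mapping config: %w", err)
	}
	dms, err := parseDeviceConfig(data)
	if err != nil {
		return nil, err
	}
	// The field takes the slice value itself, so dms rather than &dms.
	return &HostConfig{Devices: dms}, nil
}
```

In the setup discussed here, the path would come from os.Getenv("FN_DOCKER_DEVICE_MAPPING_CONFIG").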

Jan 17 '19 09:01 denismakogon

Thanks, it compiles now, but sadly it doesn't work on my machine.

I did the following test:

Print HostConfig (in docker_client.go) to verify that the env was loaded:

fmt.Printf("HostConfig, %v\n", hostConfig)

Result:

HostConfig, &{[] [] [] []  [] map[] [] [] [] [] [] []      { 0} [{/dev/video0 /dev/video0 rw}] [] { map[]} []   0 0 0 0 0 0    0 0 0 0 0 [] [] [] [] [] []  0 0 0 map[] false false false false false map[] map[] 0 0 0 0 [] false }

I have a long-running, Python-based Fn function that:

sleeps (time.sleep(50))
and runs ls on /dev

During execution I look up the running container (docker ps -a) and inspect it (docker inspect <CONTAINER_ID>). Unfortunately, it shows "Devices": null:

"HostConfig": { 
...
            "Devices": null,
...
}

while when running

docker run -it --device=/dev/video0 darknet /bin/bash

docker inspect 1a1ffdc0daa1

gives

"HostConfig": {
...
            "Devices": [
                {
                    "PathOnHost": "/dev/video0",
                    "PathInContainer": "/dev/video0",
                    "CgroupPermissions": "rwm"
                }
            ],
...
}

Jan 17 '19 10:01 niktaken

Maybe you're missing something in the configuration?

Jan 17 '19 11:01 denismakogon

We have various functions in cookie.go that configure the container/host config. That would be a possible place to add this (e.g. a new function such as cookie.configureDevices()).
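
A hypothetical sketch of what such a configureDevices step could look like. The cookie, HostConfig and Device types below are simplified local stand-ins, not Fn's or go-dockerclient's real definitions, and the method name is only the one suggested above. The point is that the mappings are attached to the host config used when the container is created. As far as I know, go-dockerclient documents that a HostConfig passed to StartContainer is ignored with Docker API 1.24 and later, which would be consistent with the "Devices": null seen in the inspect output.

```go
package main

// Device and HostConfig are simplified local stand-ins for the Docker types.
type Device struct {
	PathOnHost        string
	PathInContainer   string
	CgroupPermissions string
}

type HostConfig struct {
	Devices    []Device
	Privileged bool
}

// cookie is a hypothetical stand-in for the object that collects container
// settings before the container is created.
type cookie struct {
	hostConfig HostConfig
}

// configureDevices appends device mappings to the create-time host config.
// It leaves Privileged untouched, so the container stays non-privileged.
func (c *cookie) configureDevices(devices []Device) {
	if len(devices) == 0 {
		return // nothing configured, keep the host config as it is
	}
	c.hostConfig.Devices = append(c.hostConfig.Devices, devices...)
}
```

Whether the real cookie type exposes its host config this way is an assumption; the sketch only shows where the mappings would need to end up.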

I'm guessing this requires the container to run in privileged mode, which we don't do in current Fn.

It is also possible to create a custom driver. It's not difficult to implement such a driver and spin up an Fn server with it. (https://github.com/fnproject/fn/blob/master/test/fn-system-tests/system_test.go#L441)

Jan 20 '19 08:01 skinowski

@skinowski I don't think this is a reason to start another driver: device mapping is part of the Docker API, so this feature might end up being part of the docker driver.

Jan 20 '19 21:01 denismakogon

I meant to say, an implementer can write their own driver for this purpose. I'm not sure if we want to add device mapping and/or privileged mode into Fn.

Jan 21 '19 22:01 skinowski

According to the Docker Nvidia GPU guides, there's no need to run a container in privileged mode in order to allow GPU device mapping.

But in general I get what you're saying. @niktaken can you make sure that GPU device mapping fits into non-privileged container execution?

Jan 21 '19 22:01 denismakogon
