docker-volume-gluster

Fail fast on errors during mount

Open Vad1mo opened this issue 6 years ago • 8 comments

We have a hyper-converged system that runs Gluster and Swarm on the same node, so we mount the volumes like this: "voluri": "localhost:gv-portainer"

When there is an error in the Gluster cluster, e.g. the Gluster service isn't started, it is still possible to spin up a container that mounts a Gluster volume. It is impossible to know whether a volume was really mounted, because the application runs in both cases. One only sees in the application itself that the data is missing.

It would be a good idea to fail fast if a volume can't be mounted, so that the container never comes up and the problem can be addressed accordingly.

I am not sure if that is related to #18.

Vad1mo avatar Dec 11 '17 17:12 Vad1mo

How do you start the container? Are you sure that it is a Gluster volume that is started? One problem that I have encountered myself (still in integration tests): the script fails to create the volume, but afterwards Docker automatically creates a local volume (which is empty) with the name of the volume.

Now that I use the glusterfs CLI directly, I could detect when the background process has failed, but once the volume is mounted, Docker drivers have no way to inform the Docker host that the mount point is unavailable.

sapk avatar Dec 11 '17 17:12 sapk

It's a Docker stack (with Compose), and the plugin is also a container.

When I inspect the container, it looks like it mounted the Gluster volume. However, when I inspect the volume, I see that the mountpoint doesn't exist.

docker volume inspect portainer_portainer-data 
[
    {
        "CreatedAt": "0001-01-01T00:00:00Z",
        "Driver": "glusterfs:latest",
        "Labels": {
            "com.docker.stack.namespace": "portainer"
        },
        "Mountpoint": "/var/lib/docker-volumes/gluster/portainer_portainer-data",
        "Name": "portainer_portainer-data",
        "Options": {
            "voluri": "localhost:gv-portainer"
        },
        "Scope": "local"
    }
]

Docker Container:

                "Type": "volume",
                "Name": "portainer_portainer-data",
                "Source": "/var/lib/docker/plugins/eb54b605aac01e577ebad9e73439f0a228a1bbcc2789a7f8dc4a69149fbce971/rootfs/var/lib/docker-volumes/gluster/portainer_portainer-data",
                "Destination": "/data",
                "Driver": "glusterfs:latest",
                "Mode": "",
                "RW": true,
                "Propagation": ""
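
Since it was mentioned earlier in the thread that a failed volume creation can be followed by Docker silently creating an empty local volume with the same name, one cheap sanity check is to look at the `Driver` field of the `docker volume inspect` output. The Go sketch below assumes the JSON has the shape shown above; `volumeInfo` and `checkDriver` are hypothetical names introduced only for illustration, and the check says nothing about whether the Gluster mount itself succeeded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// volumeInfo mirrors only the fields of `docker volume inspect` output
// that this sketch needs; all other fields are ignored when decoding.
type volumeInfo struct {
	Name   string
	Driver string
}

// checkDriver (hypothetical helper) parses `docker volume inspect` JSON and
// returns an error if any listed volume is handled by a driver whose name
// does not start with wantDriver, e.g. a fallback to the "local" driver.
func checkDriver(inspectJSON []byte, wantDriver string) error {
	var vols []volumeInfo
	if err := json.Unmarshal(inspectJSON, &vols); err != nil {
		return fmt.Errorf("parsing inspect output: %w", err)
	}
	if len(vols) == 0 {
		return fmt.Errorf("inspect output lists no volumes")
	}
	for _, v := range vols {
		if !strings.HasPrefix(v.Driver, wantDriver) {
			return fmt.Errorf("volume %s uses driver %q, expected %q",
				v.Name, v.Driver, wantDriver)
		}
	}
	return nil
}
```

For the output above, `checkDriver(out, "glusterfs")` would pass, because the driver is reported as `glusterfs:latest`; a volume that had fallen back to `local` would be rejected.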

Vad1mo avatar Dec 11 '17 17:12 Vad1mo

As for #18, it is mostly about trying to resolve the hostname of the server (and bricks) before creation, plus various small pre-checks to limit configuration-related errors later at mount time.

sapk avatar Dec 11 '17 17:12 sapk

Otherwise, it seems to be using the plugin (I suppose you use a custom alias, glusterfs). I will look at the code near https://github.com/sapk/docker-volume-gluster/blob/82079baff00622cdfd637ddb7e8a7a344497900e/gluster/driver/driver.go#L241 to catch more errors. (Maybe keep the process in the foreground and attach it to the mount.)

sapk avatar Dec 11 '17 17:12 sapk

Do you need more info to track down the issue? Currently we can't use the plugin because of it.

config.json, as printed by cat /var/lib/docker/plugins/eb54b605aac01e577ebad9e73439f0a228a1bbcc2789a7f8dc4a69149fbce971/config.json:

{
  "plugin": {
    "Config": {
      "Args": {
        "Description": "Arguments to be passed to the plugin",
        "Name": "args",
        "Settable": ["value"],
        "Value": []
      },
      "Description": "GlusterFS plugin for Docker",
      "DockerVersion": "17.10.0-ce",
      "Documentation": "https://docs.docker.com/engine/extend/plugins/",
      "Entrypoint": ["/usr/bin/docker-volume-gluster", "daemon"],
      "Env": [{ "Description": "", "Name": "DEBUG", "Settable": ["value"], "Value": "0" }],
      "Interface": { "Socket": "gluster.sock", "Types": ["docker.volumedriver/1.0"] },
      "IpcHost": false,
      "Linux": {
        "AllowAllDevices": false,
        "Capabilities": ["CAP_SYS_ADMIN"],
        "Devices": [{ "Description": "", "Name": "", "Path": "/dev/fuse", "Settable": null }]
      },
      "Mounts": null,
      "Network": { "Type": "host" },
      "PidHost": false,
      "PropagatedMount": "/var/lib/docker-volumes/gluster",
      "User": {},
      "WorkDir": "",
      "rootfs": {
        "diff_ids": ["sha256:8a21bfe4b75043c94d7f9a9201ea73932c3e489bd7fd2881021ceedcec3b19d5"],
        "type": "layers"
      }
    },
    "Enabled": true,
    "Id": "eb54b605aac01e577ebad9e73439f0a228a1bbcc2789a7f8dc4a69149fbce971",
    "Name": "glusterfs:latest",
    "PluginReference": "docker.io/sapk/plugin-gluster:latest",
    "Settings": {
      "Args": [],
      "Devices": [{ "Description": "", "Name": "", "Path": "/dev/fuse", "Settable": null }],
      "Env": ["DEBUG=0"],
      "Mounts": []
    }
  },
  "PropagatedMount":
    "/var/lib/docker/plugins/eb54b605aac01e577ebad9e73439f0a228a1bbcc2789a7f8dc4a69149fbce971/rootfs/var/lib/docker-volumes/gluster",
  "Rootfs": "/var/lib/docker/plugins/eb54b605aac01e577ebad9e73439f0a228a1bbcc2789a7f8dc4a69149fbce971/rootfs",
  "Config": "sha256:a1fbb7a93f194f604d1532d75913e4b89aa8c75c8dd4896baace96a41363ec6b",
  "Blobsums": ["sha256:c367876793b34af008c4e1f6d19ed7a584352092021e9ef03259c97bf94566f0"],
  "SwarmServiceID": ""
}

ls -la /var/lib/docker-volumes/gluster/portainer_portainer-data/ fails because this directory doesn't exist.

Vad1mo avatar Dec 12 '17 21:12 Vad1mo

I am trying to set up tests in Travis to deliver a new image based on the glusterfs CLI. That might fix the error, since the command could fail directly, but I will also try to improve CLI handling by keeping the process monitored by the driver and logging its output directly via the plugin.

sapk avatar Dec 13 '17 09:12 sapk

A new version should be released soon. It uses the gluster CLI (the plugin wasn't updated without a tag), which may return more error cases at start-up, where the mount command would have detached itself. I will definitely improve process handling, but this could already fix your problem.

sapk avatar Dec 13 '17 15:12 sapk

Alright, I'll try that out later today and let you know the outcome.

Vad1mo avatar Dec 13 '17 15:12 Vad1mo