docker-volume-netshare

NFS broken on docker 1.11.2 - Failed to resolve server <volume_name>

Open moensch opened this issue 9 years ago • 12 comments

OS: CentOS 7.2
Docker version: docker-engine-1.11.2-1.el7.centos.x86_64
Plugin version: docker-volume-netshare_0.19_linux_amd64-bin

I invoke the plugin as root like so:

./docker-volume-netshare_0.19_linux_amd64-bin nfs

I start a container with a volume definition like so:

docker run --rm -it --name <somename> --volume-driver=nfs -v 10.47.65.10/some/export:/mount --entrypoint /bin/bash <myimage>

On the console with docker-volume-netshare I see the following error. It's as if it wants to use the volume name as the hostname for the NFS server:

INFO[0000] == docker-volume-netshare :: Version: 0.19 - Built: 2016-07-10T08:08:53-07:00 ==
INFO[0000] Starting NFS Version 4 :: options: ''
INFO[0018] Mounting NFS volume 4ad1e90b6af8dd20a01e09c1257df3497014a676bedaed0f7a219236a96ab509: on /var/lib/docker-volumes/netshare/nfs/4ad1e90b6af8dd20a01e09c1257df3497014a676bedaed0f7a219236a96ab509
2016/07/13 13:33:14 mount.nfs4: Failed to resolve server 4ad1e90b6af8dd20a01e09c1257df3497014a676bedaed0f7a219236a96ab509: Name or service not known

And here is what I see in /var/lib/docker/volumes on my box:

#> find /var/lib/docker/volumes/
/var/lib/docker/volumes/
/var/lib/docker/volumes/metadata.db
/var/lib/docker/volumes/6333dfba95fa957f3b4a814ff53b7a457abd6c6329b5b029ba01b175e5cfc65f
/var/lib/docker/volumes/6333dfba95fa957f3b4a814ff53b7a457abd6c6329b5b029ba01b175e5cfc65f/_data
/var/lib/docker/volumes/6333dfba95fa957f3b4a814ff53b7a457abd6c6329b5b029ba01b175e5cfc65f/_data/.bash_logout
/var/lib/docker/volumes/6333dfba95fa957f3b4a814ff53b7a457abd6c6329b5b029ba01b175e5cfc65f/_data/.bashrc
/var/lib/docker/volumes/6333dfba95fa957f3b4a814ff53b7a457abd6c6329b5b029ba01b175e5cfc65f/_data/.profile
/var/lib/docker/volumes/cbee150732aa7b42c16a3c20440e152aa6fe3b44fbc75f660f60672a7e21866d
/var/lib/docker/volumes/cbee150732aa7b42c16a3c20440e152aa6fe3b44fbc75f660f60672a7e21866d/_data
/var/lib/docker/volumes/5286c12e059dcd4c46a31d01cbcbf255d861a6e3a9871807a1c07e805818f04b
/var/lib/docker/volumes/5286c12e059dcd4c46a31d01cbcbf255d861a6e3a9871807a1c07e805818f04b/_data
/var/lib/docker/volumes/9c5df72623e17b3da1709b6c7448b75368d064aa2dda064b762e0435e1a0e12b
/var/lib/docker/volumes/9c5df72623e17b3da1709b6c7448b75368d064aa2dda064b762e0435e1a0e12b/_data

moensch avatar Jul 13 '16 19:07 moensch

"docker volume ls" output:

[root@sam7 ~]# docker volume ls
DRIVER              VOLUME NAME
local               5286c12e059dcd4c46a31d01cbcbf255d861a6e3a9871807a1c07e805818f04b
local               6333dfba95fa957f3b4a814ff53b7a457abd6c6329b5b029ba01b175e5cfc65f
local               9c5df72623e17b3da1709b6c7448b75368d064aa2dda064b762e0435e1a0e12b
local               cbee150732aa7b42c16a3c20440e152aa6fe3b44fbc75f660f60672a7e21866d
[root@sam7 ~]#

moensch avatar Jul 13 '16 19:07 moensch

Debug log:

INFO[0000] == docker-volume-netshare :: Version: 0.19 - Built: 2016-07-10T08:08:53-07:00 ==
INFO[0000] Starting NFS Version 4 :: options: ''
DEBU[0008] Host path for 10.47.65.10/mxl/msg_archive is at /var/lib/docker-volumes/netshare/nfs/10.47.65.10/mxl/msg_archive
DEBU[0008] Entering Get: {a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36 map[]}
DEBU[0008] Entering Create: name: a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36, options map[]
DEBU[0008] Create volume -> name: a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36, map[]
DEBU[0008] Host path for a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36 is at /var/lib/docker-volumes/netshare/nfs/a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36
DEBU[0008] Entering Get: {a5f6721755d027792e74793fe547d6689e3782d95b9bd378aaaa600b17479811 map[]}
DEBU[0008] Entering Create: name: a5f6721755d027792e74793fe547d6689e3782d95b9bd378aaaa600b17479811, options map[]
DEBU[0008] Create volume -> name: a5f6721755d027792e74793fe547d6689e3782d95b9bd378aaaa600b17479811, map[]
DEBU[0008] Host path for a5f6721755d027792e74793fe547d6689e3782d95b9bd378aaaa600b17479811 is at /var/lib/docker-volumes/netshare/nfs/a5f6721755d027792e74793fe547d6689e3782d95b9bd378aaaa600b17479811
DEBU[0008] Entering Get: {9e822ee1e4fe525fe384ebdebfd7742cd28cb8eba8c7642d5e90ba4e05cf510e map[]}
DEBU[0008] Entering Create: name: 9e822ee1e4fe525fe384ebdebfd7742cd28cb8eba8c7642d5e90ba4e05cf510e, options map[]
DEBU[0008] Create volume -> name: 9e822ee1e4fe525fe384ebdebfd7742cd28cb8eba8c7642d5e90ba4e05cf510e, map[]
DEBU[0008] Host path for 9e822ee1e4fe525fe384ebdebfd7742cd28cb8eba8c7642d5e90ba4e05cf510e is at /var/lib/docker-volumes/netshare/nfs/9e822ee1e4fe525fe384ebdebfd7742cd28cb8eba8c7642d5e90ba4e05cf510e
DEBU[0008] Entering Get: {5f2963ae4edc6964ee06db93f244913ceee114367cfe117580eb2a1f3eea016d map[]}
DEBU[0008] Entering Create: name: 5f2963ae4edc6964ee06db93f244913ceee114367cfe117580eb2a1f3eea016d, options map[]
DEBU[0008] Create volume -> name: 5f2963ae4edc6964ee06db93f244913ceee114367cfe117580eb2a1f3eea016d, map[]
DEBU[0008] Host path for 5f2963ae4edc6964ee06db93f244913ceee114367cfe117580eb2a1f3eea016d is at /var/lib/docker-volumes/netshare/nfs/5f2963ae4edc6964ee06db93f244913ceee114367cfe117580eb2a1f3eea016d
DEBU[0008] Entering Mount: {a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36 map[]}
INFO[0008] Mounting NFS volume a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36: on /var/lib/docker-volumes/netshare/nfs/a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36
DEBU[0008] Mounting with NFSv4 - src: a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36:, dest: /var/lib/docker-volumes/netshare/nfs/a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36
DEBU[0008] exec: mount -v -t nfs4 a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36: /var/lib/docker-volumes/netshare/nfs/a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36

2016/07/14 14:34:42 mount.nfs4: Failed to resolve server a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36: Name or service not known

DEBU[0009] Entering Remove: name: a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36, options map[]
DEBU[0009] Entering Remove: name: a5f6721755d027792e74793fe547d6689e3782d95b9bd378aaaa600b17479811, options map[]
DEBU[0009] Entering Remove: name: 9e822ee1e4fe525fe384ebdebfd7742cd28cb8eba8c7642d5e90ba4e05cf510e, options map[]
DEBU[0009] Entering Remove: name: 5f2963ae4edc6964ee06db93f244913ceee114367cfe117580eb2a1f3eea016d, options map[]

moensch avatar Jul 14 '16 20:07 moensch

It looks like docker uses a new scheme to generate the volume name: it uses a random ID rather than the host:/export syntax. The volume plugin used the name to extract the hostname and share.

Anyway, you could use the option --opt share=10.47.65.10/some/export:/mount to solve the problem. Possibly this is even the cleanest way, since I haven't found any formal specification of how docker constructs the volume name...
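To make the guess above concrete, here is a rough sketch of what name-based parsing might look like. It is only an illustration: `name_to_source` is a made-up helper, not the plugin's actual code, and the splitting rule (everything before the first "/" is the server) is an assumption inferred from the logs in this thread.

```shell
# Hypothetical sketch, NOT the plugin's real implementation.
# Treats the part of the volume name before the first "/" as the NFS
# server and the remainder as the export path.
name_to_source() {
    name=$1
    case $name in
        *:*) printf '%s\n' "$name" ;;                        # already host:path, leave untouched
        */*) printf '%s:/%s\n' "${name%%/*}" "${name#*/}" ;; # host/path -> host:/path
        *)   printf '%s:\n' "$name" ;;                       # no path component at all
    esac
}

name_to_source 10.47.65.10/some/export
name_to_source 4ad1e90b6af8dd20a01e09c1257df3497014a676bedaed0f7a219236a96ab509
```

Under these assumptions the first call prints 10.47.65.10:/some/export, while the random ID comes back as the ID followed by a bare colon. That is the same shape as the "src:" value in the debug log, which mount.nfs4 then tries to resolve as a server name.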

holgerreif avatar Jul 15 '16 07:07 holgerreif

I'll give this a try for now, but I guess this ought to be fixed in the netshare driver, correct? I tried compiling it myself but the build failed (need to log a separate issue about that).

I assume --opt is a docker option and not a netshare option?

moensch avatar Jul 15 '16 07:07 moensch

but I guess this ought to be fixed in the netshare driver

Actually, I have no idea what really happens at the interface between docker and the plugin.

DEBU[0008] Host path for 10.47.65.10/mxl/msg_archive is at /var/lib/docker-volumes/netshare/nfs/10.47.65.10/mxl/msg_archive
DEBU[0008] Entering Get: {a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36 map[]}
<...>
DEBU[0008] Host path for a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36 is at /var/lib/docker-volumes/netshare/nfs/a21777fabd86625d8fab1dd3608bdd64b4ec4b1816d811437bd3cb4b4742de36
DEBU[0008] Entering Get: {a5f6721755d027792e74793fe547d6689e3782d95b9bd378aaaa600b17479811 map[]}

This looks really weird.

@gondor: could you add a netshare option --dump (to be given at startup) that dumps the complete content of requests received from, and responses sent back to, the docker daemon?

I assume --opt is a docker option and not a netshare option?

Correct. --opt is an option for docker volume create and passes options to the volume driver, see the reference.

There is no way to provide this option with docker run.

holgerreif avatar Jul 15 '16 08:07 holgerreif

The workaround with "docker volume create" works. One note though: You now need a colon to separate host and directory as this no longer goes through nfsDriver.fixSource().

So it's actually:

#> docker volume create --driver=nfs --opt share=10.47.65.10:mxl/msg_archive --name mynfs
#> docker run --rm -it --name <somename> mynfs:/mount --entrypoint /bin/bash <myimage>
#> # no longer specifying --volume-driver in docker run command

I'm still hoping for a fix. Would love to take a stab at this fix myself but am still getting build errors on my box.

The workaround confirms that in theory, this works on docker 1.11.2, it's just the erroneous use of the volume name to deduce the NFS mount point that causes it to break. As I'd like to use this through docker-compose and Rancher, the extra step via volume create won't really work for me, I believe.

Thank you for your help, @holgerreif

moensch avatar Jul 18 '16 22:07 moensch

Any update on this? I'm running into the same issue. Oddly enough, the problem is with a container that has no nfs volumes defined. When the docker-volume-netshare service is stopped, this container starts fine, but when the service is running, the container fails to start. I have other containers that use nfs volumes, and they start and mount their volumes without error.

Here are my testing notes.

Stop netshare

systemctl stop docker-volume-netshare.service

Start it manually with verbose in foreground to watch for messages

/usr/bin/docker-volume-netshare nfs --verbose

Suspended mesos-zk-graphite-collector ( problem container without nfs configurations )

Suspended docker-nginx-nfs-kp ( working container with nfs configuration )

Add constraint to lock it to testing slave host

hostname:CLUSTER:myhost-platform-mesos-slave01.mydomain.com

Scale docker-nginx-nfs-kp to 1 ( Works Fine )

INFO[0304] Mounting NFS volume lif1-myfiler.mydomain.com:/myhost_docker_netshare01/nginx on /var/lib/docker-volumes/netshare/nfs/lif1-myfiler.mydomain.com/myhost_docker_netshare01/nginx
DEBU[0304] exec: mount -t nfs4 lif1-myfiler.mydomain.com:/myhost_docker_netshare01/nginx /var/lib/docker-volumes/netshare/nfs/lif1-myfiler.mydomain.com/myhost_docker_netshare01/nginx

INFO[0304] Unmounting volume lif1-myfiler.mydomain.com:/myhost_docker_netshare01/nginx from /var/lib/docker-volumes/netshare/nfs/lif1-myfiler.mydomain.com/myhost_docker_netshare01/nginx
INFO[0304] Mounting NFS volume lif1-myfiler.mydomain.com:/myhost_docker_netshare01/nginx on /var/lib/docker-volumes/netshare/nfs/lif1-myfiler.mydomain.com/myhost_docker_netshare01/nginx
DEBU[0304] exec: mount -t nfs4 lif1-myfiler.mydomain.com:/myhost_docker_netshare01/nginx /var/lib/docker-volumes/netshare/nfs/lif1-myfiler.mydomain.com/myhost_docker_netshare01/nginx

Suspend docker-nginx-nfs-kp

INFO[0352] Unmounting volume lif1-myfiler.mydomain.com:/myhost_docker_netshare01/nginx from /var/lib/docker-volumes/netshare/nfs/lif1-myfiler.mydomain.com/myhost_docker_netshare01/nginx

Scale mesos-zk-graphite-collector to 1 ( also constrained to same test host )

DEBU[0406] Get for 65fd25ace71de26a2cb468445a4b7717415d559be5d725185f1adae4f7d0f03c is at /var/lib/docker-volumes/netshare/nfs/65fd25ace71de26a2cb468445a4b7717415d559be5d725185f1adae4f7d0f03c
DEBU[0406] Get for 27728ddffba8fd19250b04e405e2d95a77c34e64031a8007ac3b21f85bc27b60 is at /var/lib/docker-volumes/netshare/nfs/27728ddffba8fd19250b04e405e2d95a77c34e64031a8007ac3b21f85bc27b60
DEBU[0407] Get for 8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14 is at /var/lib/docker-volumes/netshare/nfs/8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14
INFO[0407] Mounting NFS volume 8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14: on /var/lib/docker-volumes/netshare/nfs/8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14
DEBU[0407] exec: mount -t nfs4 8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14: /var/lib/docker-volumes/netshare/nfs/8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14

2016/08/22 11:15:06 mount.nfs4: Failed to resolve server 8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14: Name or service not known

DEBU[0407] Removing volume 8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14
DEBU[0407] Removing volume 65fd25ace71de26a2cb468445a4b7717415d559be5d725185f1adae4f7d0f03c
DEBU[0407] Removing volume 27728ddffba8fd19250b04e405e2d95a77c34e64031a8007ac3b21f85bc27b60

Deployment fails with "docker: Error response from daemon: VolumeDriver.Mount: exit status 32."

Stop docker-volumes-netshare

CTRL-C manually started version

Ensure service didn't somehow get started

systemctl -a | grep -i netshare

Scale mesos-zk-graphite-collector to 1 ( also constrained to same test host )

Container deploys successfully when netshare service is not running.

Thanks, Kenny

k3nnyP avatar Aug 22 '16 12:08 k3nnyP

I apologize for the crazy font in my previous post.

k3nnyP avatar Aug 23 '16 12:08 k3nnyP

Please send me your OS version, Docker version, and the exact use case that's failing in an email to my first name found on my profile @ containx.io. I will try to correlate the root cause and address this. I need test cases within Docker and not mesos so I can have a common environment. Thank you everyone :)

gondor avatar Sep 03 '16 04:09 gondor

Just my 2 cents: I got a similar issue recently, with the same error messages and other symptoms. The 'docker volume create' workaround was working fine.

Then I stumbled across 'Create Volume' calls in our logs which were suspiciously different across a couple of the services we use. That made me think it was somehow related to the underlying docker images. Sure enough, in one of our docker images we defined 4 volumes. If we execute docker run with just one nfs mount and the 3 others not mounted, it fails with error 32; if we execute it with all four nfs mounts, it runs without a glitch. Removing the volumes from the image altogether allows us to do docker run with any arbitrary mounts we want.

I have no idea what is happening under the hood, how it is aligned with the most recent Docker changes, or why it was not showing before. So far it has been tested a few times with docker 1.12.3 and works without a glitch.

drbolsen avatar Jan 20 '17 13:01 drbolsen

I have seen a similar issue on Ubuntu server:

ubuntu 14.04.5
docker-engine: 1.11.0-0~trusty
docker-volume-netshare: 0.32

Steps to reproduce:

  1. Create an image from the following Dockerfile:
FROM busybox
RUN mkdir /myvol
RUN echo "hello world" > /myvol/greeting
VOLUME /myvol
  2. docker build -t volume:yes .
  3. docker run --rm -it --name crash --volume-driver=nfs -v 10.0.2.15/shome:/mount --entrypoint /bin/bash volume:yes
  4. The previous command will fail, and the logs mentioned in this post can be seen.
  5. Now run the following command for it to succeed. You can also see a /myvol created on the host.
     docker run --rm -it --name crash --volume-driver=nfs -v 10.0.2.15/shome:/mount -v /myvol:/myvol --entrypoint /bin/bash volume:yes

I think that since the image wants to mount a volume and docker run has not specified the host mount point, there is some ambiguity. When docker-volume-netshare is not running, the default action taken by docker is to create a host mount point /myvol. This is explained here: https://github.com/docker/docker/issues/16055#issuecomment-138132556

When docker-volume-netshare is running and the host directory of a bind-mounted volume doesn't exist, the docker-volume-netshare plugin should let docker handle it. Or docker should not let the docker-volume-netshare plugin handle it. Not sure which way it is. Investigating it further.
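One conceivable way to tell the two cases apart is to check whether the requested name looks like a generated ID rather than a host/export pair. The sketch below is purely hypothetical: `is_anonymous_volume_id` is an invented helper, nothing like it is known to exist in docker-volume-netshare, and the "64 lowercase hex characters" rule is an assumption based only on the IDs visible in the logs in this thread.

```shell
# Hypothetical guard, not part of docker-volume-netshare.
# Returns 0 when the name is exactly 64 lowercase hex characters,
# i.e. it looks like the generated IDs seen in the logs; 1 otherwise.
is_anonymous_volume_id() {
    [ "${#1}" -eq 64 ] || return 1
    case $1 in
        *[!0-9a-f]*) return 1 ;;  # contains a non-hex character
        *)           return 0 ;;
    esac
}

for name in 10.0.2.15/shome \
            8d91633bea4371223d72f3d39e4f8674a07612b6eaef2c5500877c39dfe5eb14; do
    if is_anonymous_volume_id "$name"; then
        echo "$name: looks like a generated ID, not an NFS share"
    else
        echo "$name: could be a host/export name"
    fi
done
```

A real NFS host name could in principle also be 64 hex characters, so this is a heuristic at best.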

blackgold avatar Feb 22 '17 06:02 blackgold

So it's actually:

#> docker volume create --driver=nfs --opt share=10.47.65.10:mxl/msg_archive --name mynfs
#> docker run --rm -it --name <somename> mynfs:/mount --entrypoint /bin/bash <myimage>

No. It's:

#> docker volume create --driver=nfs --opt share=10.47.65.10:mxl/msg_archive --name mynfs
#> docker run --rm -it --name <somename> -v mynfs:/mount --entrypoint /bin/bash <myimage>

(See the "-v".)
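For anyone scripting this, the two steps could be wrapped roughly as follows. This is a sketch only: `run_with_nfs_volume` and the `DOCKER` override variable are names invented here, and the docker flags are copied from the commands in this thread rather than checked against any particular docker version.

```shell
# Sketch of the two-step workaround as a reusable function.
# DOCKER is a hypothetical override so the commands can be previewed
# (e.g. DOCKER=echo) instead of executed.
DOCKER=${DOCKER:-docker}

# Arguments: share (host:path), volume name, mount point in the container, image.
run_with_nfs_volume() {
    share=$1
    volume=$2
    mountpoint=$3
    image=$4
    # Step 1: create a named volume so the driver gets the share via --opt.
    "$DOCKER" volume create --driver=nfs --opt share="$share" --name "$volume" &&
    # Step 2: mount it by name with -v; no --volume-driver here.
    "$DOCKER" run --rm -it -v "$volume:$mountpoint" --entrypoint /bin/bash "$image"
}

# Example (commented out so nothing runs by accident):
# run_with_nfs_volume 10.47.65.10:mxl/msg_archive mynfs /mount <myimage>
```

Setting DOCKER=echo before calling the function prints the two command lines without touching the daemon, which is handy for checking the arguments first.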

FRNK-BK avatar Sep 16 '19 18:09 FRNK-BK