"Input/output error" when accessing an NFS share via bindfs
I am working on an Ubuntu install that is a client of an NFS share mounted at /mnt/nas. The share is mounted via an /etc/fstab entry of type nfs.
Due to the NFS server's MAPALL configuration, chown fails with "Invalid argument" whenever it is run against any file or folder within /mnt/nas. This causes issues with Docker, because many prebuilt containers run chown on startup and terminate if it fails.
I am aiming to work around this by using a FUSE/bindfs mount to make the share available at /mnt/nasfix with the --chown-ignore option.
From what I can tell, this use of bindfs isn't entirely novel: vagrant-bindfs seems to be designed with something similar in mind (though I'm not using Vagrant).
But I'm running into an issue.
# bindfs /mnt/nas /mnt/nasfix
# sudo ls -al /mnt/nasfix
ls: cannot access '/mnt/nasfix': Input/output error
I'm having a lot of trouble finding anything online about this error. I looked at #78, but if my understanding is correct this is about running bindfs on the NFS server (and exposing the bindfs mount via an NFS share). I want to run bindfs on the NFS client.
When running bindfs with the debug option and then attempting to do a directory listing I get the following output:
# bindfs -f -s -d /mnt/nas /mnt/nasfix
FUSE library version: 2.9.9
nullpath_ok: 0
nopath: 0
utime_omit_ok: 0
unique: 2, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0
INIT: 7.34
flags=0x33fffffb
max_readahead=0x00020000
INIT: 7.19
flags=0x00000011
max_readahead=0x00020000
max_write=0x00020000
max_background=0
congestion_threshold=0
unique: 2, success, outsize: 40
unique: 4, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 3377359
getattr /
unique: 4, error: -5 (Input/output error), outsize: 16
(I am currently using bindfs v1.14.7)
EDIT: Reproduced in bindfs 1.18.0 (built from source)
# /usr/local/bin/bindfs -f -s -d /mnt/nas /mnt/nasfix
FUSE library version: 3.10.3
nullpath_ok: 0
unique: 2, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0
INIT: 7.34
flags=0x33fffffb
max_readahead=0x00020000
INIT: 7.31
flags=0x0040f039
max_readahead=0x00020000
max_write=0x00100000
max_background=0
congestion_threshold=0
time_gran=1
unique: 2, success, outsize: 80
unique: 4, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 3476211
getattr[NULL] /
unique: 4, error: -5 (Input/output error), outsize: 16
Strange, since bindfs should be reading the source directory like any other process.
To debug this, you could try running bindfs under strace -f to see exactly which syscall returns that -5.
Command: strace -f /usr/local/bin/bindfs -f -s -d /mnt/nas /mnt/nasfix
Relevant part of the output (emitted when attempting to ls /mnt/nasfix):
read(4, "8\0\0\0\3\0\0\0\16\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\350\3\0\0\350\3\0\0"..., 1052672) = 56
write(2, "unique: 14, opcode: GETATTR (3),"..., 69unique: 14, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 1102492
) = 69
write(2, "getattr[NULL] /\n", 16getattr[NULL] /
) = 16
newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0755, st_size=11, ...}, AT_SYMLINK_NOFOLLOW) = 0
write(2, " unique: 14, error: -5 (Input/"..., 59 unique: 14, error: -5 (Input/output error), outsize: 16
) = 59
writev(4, [{iov_base="\20\0\0\0\373\377\377\377\16\0\0\0\0\0\0\0", iov_len=16}], 1) = 16
I'm not familiar with strace (or programming in lower-level languages generally), but it looks to me like nothing is returning -5?
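For readers unfamiliar with strace: the -5 does appear in the trace, just not as a syscall return value. The writev() on the last line is bindfs's reply to the kernel, and its bytes follow the layout of struct fuse_out_header from <linux/fuse.h> (a 32-bit length, a 32-bit signed error, a 64-bit request ID). Below is a minimal sketch in C that decodes those 16 bytes, assuming the trace came from a little-endian machine. The names reply_header, decode_reply and traced_reply are made up for illustration.

```c
#include <stdint.h>

/* Mirrors the field layout of struct fuse_out_header in <linux/fuse.h>. */
struct reply_header {
    uint32_t len;     /* total reply length in bytes */
    int32_t  error;   /* 0 on success, otherwise a negated errno */
    uint64_t unique;  /* ID of the request this reply answers */
};

/* Read a 32-bit little-endian value (ASSUMPTION: little-endian trace). */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the 16-byte header found at the start of a FUSE reply. */
static struct reply_header decode_reply(const unsigned char *buf)
{
    struct reply_header h;
    h.len = read_le32(buf);
    h.error = (int32_t)read_le32(buf + 4);
    h.unique = (uint64_t)read_le32(buf + 8) |
               ((uint64_t)read_le32(buf + 12) << 32);
    return h;
}

/* The bytes from the writev() call above; strace prints them in octal
 * ("\20" is 0x10, "\373" is 0xfb, "\377" is 0xff, "\16" is 0x0e). */
static const unsigned char traced_reply[16] = {
    0x10, 0x00, 0x00, 0x00,
    0xfb, 0xff, 0xff, 0xff,
    0x0e, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
```

Decoding traced_reply gives len 16, error -5 and unique 14, which matches the "unique: 14, error: -5" debug line. Since newfstatat() succeeded just before, this suggests the EIO is produced inside bindfs itself rather than passed through from a failed syscall.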
Very strange. I don't have any great ideas then. You could try adding debug prints to various places in bindfs.c's functions bindfs_getattr, bindfs_fgetattr and getattr_common to see where it fails. Example:
if (lstat(real_path, stbuf) == -1) {
    free(real_path);
    DPRINTF("lstat failed: %s (%d)", strerror(errno), errno);  /* <-- you could add this one */
    return -errno;
}
To enable the debug prints, you need to run ./configure --enable-debug-output before recompiling and reinstalling.
After adding some debug prints and running with debug output:
sudo /usr/local/bin/bindfs -f -s -d /mnt/nas/working /mnt/nasfix
FUSE library version: 3.10.3
nullpath_ok: 0
unique: 2, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0
INIT: 7.34
flags=0x33fffffb
max_readahead=0x00020000
INIT: 7.31
flags=0x0040f039
max_readahead=0x00020000
max_write=0x00100000
max_background=0
congestion_threshold=0
time_gran=1
unique: 2, success, outsize: 80
unique: 4, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 3598412
getattr[NULL] /
DEBUG: GID 4294967294 out of bounds after applying offset
DEBUG: apply_gid_offset failed in getattr_common: Invalid argument (22)
unique: 4, error: -5 (Input/output error), outsize: 16
But I'm not specifying a GID offset in my command...
EDIT: Ah, if I explicitly set gid-offset to 0 I get the same behaviour. The issue is that my share reports file ownership as nobody, and so even applying an offset of 0 still results in nobody, which is apparently not valid.
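The value in the debug line, 4294967294, is 2^32 - 2, i.e. -2 stored in an unsigned 32-bit ID. The sketch below illustrates how a range check applied after adding an offset can reject such an ID even when the offset is 0. It is not bindfs's actual code: apply_offset_checked and ASSUMED_MAX_ID are hypothetical, and the bound is an assumption picked only to reproduce the symptom.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: an illustrative upper bound that is narrower than the full
 * unsigned 32-bit range. The bound bindfs really used is not shown in
 * this thread. */
#define ASSUMED_MAX_ID ((int64_t)INT32_MAX)

/* Hypothetical helper: add `offset` to `*id` and reject any result
 * outside [0, ASSUMED_MAX_ID]. Returns true on success. On failure it
 * sets errno to EINVAL and leaves `*id` unchanged. */
static bool apply_offset_checked(uint32_t *id, int64_t offset)
{
    int64_t shifted = (int64_t)*id + offset;

    if (shifted < 0 || shifted > ASSUMED_MAX_ID) {
        errno = EINVAL;
        return false;
    }
    *id = (uint32_t)shifted;
    return true;
}
```

With this assumed bound, an ordinary ID such as 1000 passes with offset 0, while 4294967294 fails with EINVAL ("Invalid argument (22)"), just like the debug output above.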
Sorry for the very slow response. Does branch uid-size-fix fix it for you?
I ended up changing some settings on my NAS so that it no longer reports ownership as nobody, and that fixed the issue. Sorry, I don't have a good way to test this anymore.
I ran into the same issue with my Ganesha server running NFSv4. I think it was caused by NFSv4 ID mapping not being set up properly: all files ended up owned by 4294967294, which resulted in the EIO error with bindfs.
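One way to check whether a mount is affected is to look at the numeric owner of the source directory before pointing bindfs at it. Below is a minimal sketch, assuming the unmapped ID is the 4294967294 seen in this thread (other servers may report a different value). has_unmapped_owner and UNMAPPED_ID are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L  /* for lstat() under strict -std= modes */
#include <stdint.h>
#include <sys/stat.h>

/* ASSUMPTION: the ID reported for unmapped owners, as seen in this thread. */
#define UNMAPPED_ID 4294967294u

/* Returns 1 if the UID or GID of `path` equals UNMAPPED_ID, 0 if neither
 * does, and -1 if lstat() fails (errno is set by lstat in that case). */
static int has_unmapped_owner(const char *path)
{
    struct stat st;

    if (lstat(path, &st) == -1)
        return -1;
    return (uint32_t)st.st_uid == UNMAPPED_ID ||
           (uint32_t)st.st_gid == UNMAPPED_ID;
}
```

Calling has_unmapped_owner("/mnt/nas") returns 1 if either ID is 4294967294. From a shell, stat -c '%u %g' /mnt/nas prints the same two numbers.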