
Fish shell: Cannot cd in the pool dir

Open lowlow--- opened this issue 3 years ago • 14 comments

Describe the bug 'cd' to the pool directory from the fish shell raises an error: Directory doesn't exist. The same command succeeds from the bash shell.

To Reproduce

  1. fish
  2. cd /to/the/pool

Expected behavior Change the current directory to pool dir

System information:

  • OS, kernel version: Linux 5.10.36-2-ARCH #1 SMP Fri May 14 14:12:56 UTC 2021 armv7l GNU/Linux
  • mergerfs version: 2.32.4
  • mergerfs settings: /mnt/drive_1:/mnt/drive_2 /mnt/pool fuse.mergerfs allow_other,use_ino,cache.files=off,dropcacheonclose=true,allow_other,category.create=mfs 0 0
  • List of drives, filesystems, & sizes:
dev                264M       0  264M   0% /dev
run                460M    8.7M  451M   2% /run
/dev/mmcblk0p2      59G     17G   40G  29% /
tmpfs              460M       0  460M   0% /dev/shm
tmpfs              460M    8.0K  460M   1% /tmp
/dev/mmcblk0p1     100M     37M   63M  37% /boot
1:2                 15T    6.8T  7.1T  49% /mnt/pool
tmpfs               92M       0   92M   0% /run/user/975
/dev/sda1          7.3T     93M  6.9T   1% /mnt/drive_2
/dev/sdb           7.3T    6.8T  156G  98% /mnt/drive_1
tmpfs               92M       0   92M   0% /run/user/1001
sda           8:0    0  7.3T  0 disk
└─sda1        8:1    0  7.3T  0 part /mnt/drive_2
sdb           8:16   0  7.3T  0 disk /mnt/drive_1
mmcblk0     179:0    0 59.6G  0 disk
├─mmcblk0p1 179:1    0  100M  0 part /boot
└─mmcblk0p2 179:2    0 59.5G  0 part /

Additional context None

lowlow--- avatar May 18 '21 11:05 lowlow---

I'm not seeing evidence of what you claim in the traces (the mergerfs trace has almost nothing in it) and fish on my system works exactly as expected.

How did you trace fish and mergerfs?

trapexit avatar May 18 '21 12:05 trapexit

Basically I've done the following:

  1. Get the pid of "fish shell",
  2. In another shell run strace -fvTtt -s 256 -p <pid of fish> -o somefile
  3. In the "first" shell run cd /to/pool

Similarly for mergerfs, I launched the "strace" command in one shell and the "cd" command in another.

lowlow--- avatar May 18 '21 17:05 lowlow---

Hm. Something clearly wasn't right given the mergerfs trace has less than a page worth of content.

Perhaps something easier to follow would be

strace -fvTtt -s 256 -o /tmp/fish.trace fish -c "cd /mnt/pool"

trapexit avatar May 18 '21 19:05 trapexit

Here is the new trace, taken with: strace -fvTtt -s 256 -o /tmp/fish.trace fish -c cd /mnt/pool

fish.trace.txt

Just to mention, other commands like ls, cp, mv, ... work as expected.

lowlow--- avatar May 18 '21 20:05 lowlow---

Not

strace -fvTtt -s 256 -o /tmp/fish.trace fish -c cd /mnt/pool

It needs to be

strace -fvTtt -s 256 -o /tmp/fish.trace fish -c "cd /mnt/pool"

The quotes around "cd /mnt/pool" are important.
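To illustrate why the quotes matter, here is a small Python sketch that uses `shlex.split` to mimic how a POSIX-style shell splits each command line into arguments. It is an illustration only: the helper name `fish_command_string` is made up for this example, and nothing here runs strace or fish.

```python
import shlex

def fish_command_string(command_line):
    """Return the argument that immediately follows -c.

    That single argument is what fish receives as the command to run.
    (Hypothetical helper, for illustration only.)
    """
    argv = shlex.split(command_line)
    return argv[argv.index("-c") + 1]

unquoted = 'strace -fvTtt -s 256 -o /tmp/fish.trace fish -c cd /mnt/pool'
quoted = 'strace -fvTtt -s 256 -o /tmp/fish.trace fish -c "cd /mnt/pool"'

print(fish_command_string(unquoted))  # cd
print(fish_command_string(quoted))    # cd /mnt/pool
```

Without the quotes, /mnt/pool becomes a separate argument rather than part of the command, so the trace records a bare cd instead of the cd into the pool.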

trapexit avatar May 18 '21 20:05 trapexit

Ooops ...

Here is the correct trace: fish.trace.txt

lowlow--- avatar May 18 '21 20:05 lowlow---

I don't know what fish is doing, but it's not chdir'ing like it does on my system, nor do I see any particular error. You can see it stat the path, and the stat returns successfully. It looks to me like fish is deciding to do this and that it isn't an error in the normal sense.

13255 22:36:32.713574 stat64("/mnt/pool", <unfinished ...>
13255 22:36:32.715981 <... stat64 resumed>{st_dev=makedev(0, 0x22), st_ino=5424983562661234939, st_mode=S_IFDIR|0755, st_nlink=11, st_uid=975, st_gid=977, st_blksize=4096, st_blocks=8, st_size=4096, st_atime=1621368025 /* 2021-05-18T22:00:25.828919202+0200 */, st_atime_nsec=828919202, st_mtime=1621368025 /* 2021-05-18T22:00:25.738919407+0200 */, st_mtime_nsec=738919407, st_ctime=1621368025 /* 2021-05-18T22:00:25.738919407+0200 */, st_ctime_nsec=738919407}) = 0

What version of fish?

On my system, with fish 2.7.1, I see

10517 17:03:00.621705 stat("/media/tmp", {st_dev=makedev(0, 49), st_ino=279947601200259710, st_mode=S_IFDIR|[0/11698]777, st_nlink=33, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=4096, st_atime=1621316134 /* 2021-05-18T01:35:34.747370353-0400 */, st_atime_nsec=747370353, st_mtime=1616294012 /* 2021-03-20T22:33:32.333296432-0400 */, st_mtime_nsec=333296432, st_ctime=1616294012 /* 2021-03-20T22:33:32.333296432-0400 */, st_ctime_nsec=333296432}) = 0 <0.000061>
10517 17:03:00.621854 chdir("/media/tmp") = 0 <0.000015>
10517 17:03:00.621903 getcwd("/media/tmp", 4096) = 11 <0.000012>
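A quick way to compare two traces like these is to list the syscalls each one contains and check whether a chdir or fchdir ever appears. The sketch below assumes the line layout produced by strace with -f and -tt (pid, timestamp, then the call); the function name `syscalls_in` is hypothetical and the stat structure in the sample is abbreviated.

```python
import re

# Start of a strace line that begins a call: "<pid> <time> <syscall>(".
# "<... name resumed>" continuation lines are deliberately not matched.
SYSCALL_RE = re.compile(r"^\d+\s+[\d:.]+\s+(\w+)\(")

def syscalls_in(trace_text):
    """Return syscall names in order of appearance, one per matching line."""
    names = []
    for line in trace_text.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names

# Abbreviated sample in the same layout as the trace lines quoted above.
sample = '''10517 17:03:00.621705 stat("/media/tmp", {st_mode=S_IFDIR|0777}) = 0 <0.000061>
10517 17:03:00.621854 chdir("/media/tmp") = 0 <0.000015>
10517 17:03:00.621903 getcwd("/media/tmp", 4096) = 11 <0.000012>'''

calls = syscalls_in(sample)
print(calls)
print("changes directory:", any(name in ("chdir", "fchdir") for name in calls))
```

Run against a full fish trace, a result with no chdir or fchdir would confirm that the shell gave up before trying to change directory.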

trapexit avatar May 18 '21 21:05 trapexit

My system has fish version 3.2.2

I've downgraded fish step by step. It works with versions <= 3.0.2; with newer versions the "cd" command fails.

lowlow--- avatar May 18 '21 21:05 lowlow---

It's something more subtle. 3.2.2 works fine for me.

Can you provide a full trace of mergerfs when doing "fish -c 'cd /mnt/pool'" ?

trapexit avatar May 18 '21 22:05 trapexit

Here are 2 traces, one with fish version 3.0.2, the other with version 3.2.2

I got the trace files with: strace -fvTtt -s 256 -p <pidofmergerfs> -o /tmp/file.txt and launching in another shell: fish -c 'cd /mnt/pool'

mergerfs_with_fish-3.0.2.trace.txt mergerfs_with_fish-3.2.2.trace.txt

lowlow--- avatar May 19 '21 13:05 lowlow---

These are again not full traces. mergerfs is just waiting for a response from the kernel.

How about rather than tracing you run mergerfs in debug mode?

sudo mergerfs -d -o allow_other,use_ino,cache.files=off,dropcacheonclose=true,category.create=mfs /mnt/drive_1 /mnt/pool

Run that in one terminal and in another run fish and "cd /mnt/pool" then provide me the log.

trapexit avatar May 20 '21 18:05 trapexit

Output of mergerfs -d -o allow_other,use_ino,cache.files=off,dropcacheonclose=true,category.create=mfs /mnt/drive_1:/mnt/drive_2 /mnt/pool/ while cd /mnt/pool

FUSE library version: 2.9.7-mergerfs_2.30.0
unique: 2, opcode: INIT (26), nodeid: 0, insize: 56, pid: 0
INIT: 7.32
flags=0x03fffffb
max_readahead=0x00020000
   INIT: 7.31
   flags=0x00448079
   max_readahead=0x00020000
   max_write=0x00100000
   max_background=0
   congestion_threshold=0
   max_pages=256
   unique: 2, success, outsize: 80
unique: 4, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 25513
getattr /
   unique: 4, success, outsize: 120
unique: 6, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 25516
getattr /
   unique: 6, success, outsize: 120
unique: 8, opcode: LOOKUP (1), nodeid: 1, insize: 47, pid: 25519
LOOKUP /movies
getattr /movies
   NODEID: 2
   GEN: 1049015239426187667
   unique: 8, success, outsize: 144
unique: 10, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 25521
getattr /
   unique: 10, success, outsize: 120
unique: 12, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 25525
getattr /
   unique: 12, success, outsize: 120
unique: 14, opcode: LOOKUP (1), nodeid: 1, insize: 47, pid: 25525
LOOKUP /movies
getattr /movies
   NODEID: 2
   GEN: 1049015239426187667
   unique: 14, success, outsize: 144
unique: 16, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 25529
getattr /
   unique: 16, success, outsize: 12

lowlow--- avatar May 21 '21 08:05 lowlow---

I had wanted a single branch, but regardless, as you can see from the log, multiple apps were successful in querying the filesystem. Without isolated logs from only fish there isn't anything I can do, since I also can't replicate it. There are a decent number of changes between 3.0.2 and 3.1.0, and the one spot where chdir shows up looks totally fine (and works on my systems).
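One way to isolate a single process in a debug log like the one above is to group the request lines by pid. This is only a sketch: it assumes the request-line layout shown in the log ("unique: N, opcode: NAME (N), nodeid: N, insize: N, pid: N"), and `opcodes_by_pid` is a made-up helper name.

```python
import re
from collections import Counter

# Request lines carry an opcode and a pid; reply lines ("unique: N, success, ...") do not match.
REQUEST_RE = re.compile(
    r"^unique: (\d+), opcode: (\w+) \(\d+\), nodeid: \d+, insize: \d+, pid: (\d+)"
)

def opcodes_by_pid(log_text):
    """Map each requesting pid to a Counter of the FUSE opcodes it issued."""
    result = {}
    for line in log_text.splitlines():
        match = REQUEST_RE.match(line.strip())
        if match:
            _unique, opcode, pid = match.groups()
            result.setdefault(int(pid), Counter())[opcode] += 1
    return result

# A few lines copied from the log above.
sample = """unique: 4, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 25513
getattr /
   unique: 4, success, outsize: 120
unique: 12, opcode: GETATTR (3), nodeid: 1, insize: 56, pid: 25525
getattr /
   unique: 12, success, outsize: 120
unique: 14, opcode: LOOKUP (1), nodeid: 1, insize: 47, pid: 25525
LOOKUP /movies
"""

for pid, counts in sorted(opcodes_by_pid(sample).items()):
    print(pid, dict(counts))
```

Matching the pids against the pid of the fish process would show which requests, if any, came from fish itself.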

trapexit avatar May 21 '21 11:05 trapexit

    if (!success) {
        struct stat buffer;
        int status;

        status = wstat(dir, &buffer);
        if (!status && S_ISDIR(buffer.st_mode)) {
            streams.err.append_format(_(L"%ls: Permission denied: '%ls'\n"), cmd, dir_in.c_str());
        } else {
            streams.err.append_format(_(L"%ls: '%ls' is not a directory\n"), cmd, dir_in.c_str());
        }

        if (!parser.is_interactive()) {
            streams.err.append(parser.current_line());
        }

        return STATUS_CMD_ERROR;
    }

That's the code that runs after a failed fchdir, where it stats the path to explain the error. 1) The error handling is bad. 2) It doesn't match what's in the traces you gave. The code opens the path before the fchdir, and no such open appears in the trace. The first thing seen is a stat, which looks totally fine. Without knowing exactly where the error is coming from I can't comment further on why it happens.
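For readers who want to experiment with the sequence described here, the following Python sketch mirrors it on a Unix system: open the path, fchdir to the descriptor, and only on failure stat the path to pick a message. It is not fish's implementation; `try_cd` is a hypothetical name, and the messages are simplified from the snippet above.

```python
import os
import stat
import tempfile

def try_cd(path):
    """Attempt open + fchdir; on failure, stat the path to choose a message.

    Returns None on success, or an error string on failure.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            os.fchdir(fd)
        finally:
            os.close(fd)
        return None
    except OSError:
        # Failure path: the stat is used only to explain the error.
        try:
            is_dir = stat.S_ISDIR(os.stat(path).st_mode)
        except OSError:
            is_dir = False
        if is_dir:
            return f"cd: Permission denied: '{path}'"
        return f"cd: '{path}' is not a directory"

original = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    print(try_cd(tmp))            # existing directory
    os.chdir(original)            # leave tmp before it is removed
    print(try_cd(os.path.join(tmp, "missing")))
```

Note that in this flow any failure on a path that stats as a directory is reported as a permission problem, whatever the underlying errno was, which may be part of why the error handling is called bad.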

trapexit avatar May 21 '21 11:05 trapexit