Keep getting "Transport endpoint is not connected" on armv5 machine.

Open maroessler opened this issue 4 years ago • 12 comments

Describe the bug

I am trying to create a mergerfs filesystem on an armv5 machine using two folders. Mounting works; however, I cannot write to it. A simple touch results in:

touch: failed to close 'merge/test': Transport endpoint is not connected

The file is created but cannot be accessed.

Using nano or other editors results in the same error. Same with echo "foo" > merge/test:

echo: write error: transport endpoint is not connected

Reading with ls or with cat works, however.

To Reproduce

mkdir /home/michael/test/disk1
mkdir /home/michael/test/disk2
mkdir /home/michael/test/merge

I am mounting the filesystem with the following /etc/fstab entry:

/home/michael/test/disk1:/home/michael/test/disk2  /home/michael/test/merge      fuse.mergerfs  defaults,allow_other,direct_io,use_ino   0       0

Then I use touch to create a new file on /home/michael/test/merge.

I already tried the same thing on an x86 machine and it works there, so I assume that the configuration is correct.
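
For reproducing this without touch or echo, here is a minimal C++ sketch of the same create, write, close sequence that reports which step fails. It assumes, as is generally the case on Linux, that the kernel sends a FUSE flush request whenever a file descriptor on the mount is closed, which would fit touch reporting the failure at close. The helper name create_and_close is hypothetical and not part of mergerfs.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper: creates `path`, writes `data`, then closes the file.
// Returns 0 on success, or the errno value of the first step that failed.
int create_and_close(const std::string &path, const std::string &data)
{
  assert(!path.empty());

  int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd == -1)
    return errno;

  int rv = 0;
  const char *p = data.data();
  std::size_t left = data.size();
  while (left > 0) {
    ssize_t n = ::write(fd, p, left);
    if (n == -1) {
      if (errno == EINTR)
        continue;          // interrupted before writing anything, retry
      rv = errno;          // remember the write error, still close below
      break;
    }
    p += n;
    left -= static_cast<std::size_t>(n);
  }

  // close() can fail as well, so its result is checked instead of ignored.
  if (::close(fd) == -1 && rv == 0)
    rv = errno;

  return rv;
}
```

Calling create_and_close("/home/michael/test/merge/test", "foo\n") on a healthy mount should return 0; on the broken mount it would presumably return ENOTCONN, the errno behind "Transport endpoint is not connected".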

Expected behavior

I am expecting to be able to write to the mergerfs filesystem.

System information:

  • OS, kernel version: Linux nsa325v2 5.8.3-kirkwood-tld-1 #1.0 PREEMPT Sat Aug 22 16:10:01 PDT 2020 armv5tel GNU/Linux
  • mergerfs version: 2.32.4
  • List of drives, filesystems, & sizes: df -h
# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
LABEL=rootfs    /               ext3    noatime,nodiratime,errors=remount-ro 0 1
tmpfs          /tmp            tmpfs   defaults          0       0
# >>> [openmediavault]
/dev/disk/by-id/md-name-nsa325v2:0              /srv/dev-disk-by-id-md-name-nsa325v2-0  ext4    defaults,nofail,user_xattr,jqfmt=vfsv0,acl      0 2
/dev/disk/by-label/backup               /srv/dev-disk-by-label-backup   ext4    defaults,nofail,user_xattr,jqfmt=vfsv0,acl      0 2
/srv/dev-disk-by-id-md-name-nsa325v2-0/Storage/Multimedia/              /export/Multimedia      none    bind,nofail     0 0
//192.168.7.8/collection                /srv/115ff000-1048-479a-aca7-b6aec4c40f6a       cifs    _netdev,iocharset=utf8,vers=3.0,nofail,credentials=/root/.cifscredentials-d3d9856e-cd39-4037-a13c-e82f7b1bd0a7  0 0
# <<< [openmediavault]

#/srv/dev-disk-by-id-md-name-nsa325v2-0/Storage/Multimedia=RO:/srv/dev-disk-by-id-md-name-nsa325v2-0/metadata-overlay  /srv/mergerfs-test      fuse.mergerfs  allow_other,use_ino   0       0
/home/michael/test/disk1:/home/michael/test/disk2  /home/michael/test/merge      fuse.mergerfs  defaults,allow_other,direct_io,use_ino   0       0

maroessler avatar Feb 26 '21 18:02 maroessler

Did you build the binary yourself? ARM systems are all a little different and I've seen some people have really unstable systems due to kernel versions or bad updates in ways I've not seen on other platforms.

Are you the person who had the chroot Debian system which had issues recently?

I'd really need a stack trace to really be able to tell what's happening.

trapexit avatar Feb 26 '21 19:02 trapexit

> Did you build the binary yourself? ARM systems are all a little different and I've seen some people have really unstable systems due to kernel versions or bad updates in ways I've not seen on other platforms.

No, I have used the mergerfs_2.32.4.debian-buster_armel.deb from this repo.

> Are you the person who had the chroot Debian system which had issues recently?

No, did he have a similar problem?

> I'd really need a stack trace to really be able to tell what's happening.

Do you mean something else than the strace output that is attached in the top post? I am happy to help and supply further data if you want. This type of problem is really out of my league so I do not really know where to start debugging.

maroessler avatar Feb 26 '21 20:02 maroessler

> No, did he have a similar problem?

Yes. In the end it appeared that the chrooted Debian install was bad somehow. He re-debootstrapped and it worked fine afterwards.

> Do you mean something else than the strace output that is attached in the top post? I am happy to help and supply further data if you want. This type of problem is really out of my league so I do not really know where to start debugging.

Yes. I mean getting a core dump or at least a stack trace. mergerfs is crashing, and strace only shows which system calls it makes. You need to use something like gdb to catch the crash. If you're having the same issue as the other person (this is a NAS?) then the stack trace will be garbage.

My suggestion is to first try building it yourself. What OS are you running?

trapexit avatar Feb 26 '21 22:02 trapexit

I'm running Debian Buster. However, it is customized for my NAS, a Zyxel NSA325v2. I did not set anything up myself; I am just using what is provided in this forum.

I tried building mergerfs myself but the problem did not disappear, so I tried using gdb to get some more info as you suggested. Simply running the program under gdb does not show anything when the crash happens; I have to quit gdb manually.

root@nsa325v2:~# gdb --args mergerfs -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2  /home/michael/test/merge
(gdb) run
Starting program: /usr/bin/mergerfs -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2 /home/michael/test/merge
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".
[Detaching after fork from child process 19315]
[Inferior 1 (process 19313) exited normally]
(gdb) quit

However, with set follow-fork-mode child I see the following:

root@nsa325v2:~# gdb --args mergerfs -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2  /home/michael/test/merge
Reading symbols from mergerfs...Reading symbols from /usr/lib/debug/.build-id/11/d3ead6c6cdf21f8ade098531e45b70c46db967.debug...done.
done.
(gdb) set follow-fork-mode child
(gdb) run
Starting program: /usr/bin/mergerfs -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2 /home/michael/test/merge
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".
[Attaching after Thread 0x76c02010 (LWP 17828) fork to child process 17832]
[New inferior 2 (process 17832)]
[Detaching after fork from parent process 17828]
[Inferior 1 (process 17828) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".
[New Thread 0x76af6c40 (LWP 17833)]

Thread 2.2 "mergerfs" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x76af6c40 (LWP 17833)]
0x00417e6c in FUSE::flush (ffi_=0x76af64c8) at src/fs_dup.hpp:31
31      src/fs_dup.hpp: No such file or directory.
(gdb) backtrace
#0  0x00417e6c in FUSE::flush (ffi_=0x76af64c8) at src/fs_dup.hpp:31
#1  0x004525ac in fuse_flush_common (f=0x481968, ino=<optimized out>, fi=fi@entry=0x76af64c8, req=0x76100f60) at lib/fuse.c:4179
#2  0x004526e8 in fuse_lib_flush (req=0x76100f60, ino=<optimized out>, fi=0x76af64c8, fi@entry=0x76af64c0) at lib/fuse.c:4229
#3  0x00457d5c in do_flush (req=<optimized out>, nodeid=<optimized out>, inarg=0x76b00030) at lib/fuse_lowlevel.c:1595
#4  0x00458aec in fuse_ll_process_buf (data=0x481af8, buf=0x76af6608, ch=<optimized out>) at lib/fuse_lowlevel.c:3004
#5  0x00455288 in fuse_do_work (data=0x491d68) at lib/fuse_loop_mt.c:101
#6  0x76d68734 in start_thread () from /lib/arm-linux-gnueabi/libpthread.so.0
#7  0x76ce8878 in ?? () from /lib/arm-linux-gnueabi/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) quit
A debugging session is active.

        Inferior 2 [process 17832] will be killed.

Quit anyway? (y or n) y

Does that help in any way?

maroessler avatar Feb 27 '21 01:02 maroessler

You want to build with "make DEBUG=1" and then when running include "-f" to run mergerfs in the foreground.

trapexit avatar Feb 27 '21 05:02 trapexit

Here is the gdb output.

root@nsa325v2:~# gdb --args /home/michael/mergerfs/build/mergerfs -f -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2  /home/michael/test/merge
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/michael/mergerfs/build/mergerfs...rundone.
(gdb) run
Starting program: /home/michael/mergerfs/build/mergerfs -f -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2 /home/michael/test/merge
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".
src/rnd.cpp:37:24: runtime error: shift exponent 32 is too large for 32-bit type 'long int'
[New Thread 0x76623c20 (LWP 28417)]
lib/fuse_lowlevel.c:388:25: runtime error: member access within misaligned address 0x7662331c for type 'struct fuse_entry_out', which requires 8 byte alignment
0x7662331c: note: pointer points here
  74 33 62 76 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:389:25: runtime error: member access within misaligned address 0x7662331c for type 'struct fuse_entry_out', which requires 8 byte alignment
0x7662331c: note: pointer points here
  04 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:390:25: runtime error: member access within misaligned address 0x7662331c for type 'struct fuse_entry_out', which requires 8 byte alignment
0x7662331c: note: pointer points here
  04 00 00 00 00 00 00 00  bd 77 79 2c 10 25 9a 19  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:391:25: runtime error: member access within misaligned address 0x7662331c for type 'struct fuse_entry_out', which requires 8 byte alignment
0x7662331c: note: pointer points here
  04 00 00 00 00 00 00 00  bd 77 79 2c 10 25 9a 19  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:392:25: runtime error: member access within misaligned address 0x7662331c for type 'struct fuse_entry_out', which requires 8 byte alignment
0x7662331c: note: pointer points here
  04 00 00 00 00 00 00 00  bd 77 79 2c 10 25 9a 19  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:393:25: runtime error: member access within misaligned address 0x7662331c for type 'struct fuse_entry_out', which requires 8 byte alignment
0x7662331c: note: pointer points here
  04 00 00 00 00 00 00 00  bd 77 79 2c 10 25 9a 19  01 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:394:3: runtime error: member access within misaligned address 0x7662331c for type 'struct fuse_entry_out', which requires 8 byte alignment
0x7662331c: note: pointer points here
  04 00 00 00 00 00 00 00  bd 77 79 2c 10 25 9a 19  01 00 00 00 00 00 00 00  01 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:61:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:62:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:63:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:64:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:65:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:66:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:67:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:68:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:69:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:70:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
              ^
lib/fuse_lowlevel.c:71:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  28 cd 3a 60 00 00 00 00
              ^
lib/fuse_lowlevel.c:72:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  28 cd 3a 60 00 00 00 00
              ^
lib/fuse_lowlevel.c:73:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  28 cd 3a 60 00 00 00 00
              ^
lib/fuse_lowlevel.c:74:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  28 cd 3a 60 00 00 00 00
              ^
lib/fuse_lowlevel.c:75:20: runtime error: member access within misaligned address 0x76623344 for type 'struct fuse_attr', which requires 8 byte alignment
0x76623344: note: pointer points here
  a6 74 90 57 56 5a 36 df  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  28 cd 3a 60 00 00 00 00
              ^
lib/fuse_lowlevel.c:402:11: runtime error: member access within misaligned address 0x7662339c for type 'struct fuse_open_out', which requires 8 byte alignment
0x7662339c: note: pointer points here
  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 9c 33 62 76  1c 33 62 76 80 00 00 00
              ^
lib/fuse_lowlevel.c:404:21: runtime error: member access within misaligned address 0x7662339c for type 'struct fuse_open_out', which requires 8 byte alignment
0x7662339c: note: pointer points here
  08 0e d0 75 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 9c 33 62 76  1c 33 62 76 80 00 00 00
              ^
lib/fuse_lowlevel.c:404:21: runtime error: member access within misaligned address 0x7662339c for type 'struct fuse_open_out', which requires 8 byte alignment
0x7662339c: note: pointer points here
  08 0e d0 75 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 9c 33 62 76  1c 33 62 76 80 00 00 00
              ^
src/fuse_flush.cpp:50:20: runtime error: member access within null pointer of type 'struct FileInfo'

Thread 2 "mergerfs" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x76623c20 (LWP 28417)]
0x004a28c0 in FUSE::flush (ffi_=0x766234b0) at src/fuse_flush.cpp:50
50          return l::flush(fi->fd);
(gdb) backtrace
#0  0x004a28c0 in FUSE::flush (ffi_=0x766234b0) at src/fuse_flush.cpp:50
#1  0x0059871c in fuse_fs_flush (fs=0x6b3a78, fi=0x766234b0) at lib/fuse.c:1991
#2  0x005a8f5c in fuse_flush_common (f=0x6b39c0, req=0x75d01290, ino=3204724263483342848, fi=0x766234b0) at lib/fuse.c:4179
#3  0x005a92ec in fuse_lib_flush (req=0x75d01290, ino=3204724263483342848, fi=0x766234b0) at lib/fuse.c:4229
#4  0x005ba570 in do_flush (req=0x75d01290, nodeid=3204724263483342848, inarg=0x7662d030) at lib/fuse_lowlevel.c:1595
#5  0x005c7ddc in fuse_ll_process_buf (data=0x6b3b50, buf=0x766235f8, ch=0x6b3398) at lib/fuse_lowlevel.c:3004
#6  0x005cebd4 in fuse_session_process_buf (se=0x6b35a0, buf=0x766235f8, ch=0x6b3398) at lib/fuse_session.c:80
#7  0x005ad3d0 in fuse_do_work (data=0x6b3210) at lib/fuse_loop_mt.c:101
#8  0x768a8734 in start_thread () from /lib/arm-linux-gnueabi/libpthread.so.0
#9  0x76828878 in ?? () from /lib/arm-linux-gnueabi/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) continue
Continuing.
[Thread 0x76623c20 (LWP 28417) exited]

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.

maroessler avatar Feb 27 '21 23:02 maroessler

Those errors are very strange. The first one makes sense and is easy to fix, but the misaligned address ones are odd. The memory it complains about is allocated on the stack and there shouldn't be any misalignment. It's possible, though, that there is corruption of some sort and that's leading to bogus reporting.
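
For context on the first report (src/rnd.cpp:37, "shift exponent 32 is too large for 32-bit type 'long int'"): this kind of error appears when a shift by 32 is applied to a type that is only 32 bits wide, as long is on this platform according to the message. What that line actually computes is not shown in this thread, so the following is only a generic sketch of the usual remedy, which is to widen to a fixed 64-bit type before shifting. The names join_halves, high_half, and check_round_trip are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: joins two 32-bit halves into one 64-bit value.
// Casting `hi` to uint64_t first makes the shifted operand 64 bits wide, so a
// shift count of 32 is well defined even where `long` has only 32 bits.
constexpr std::uint64_t join_halves(std::uint32_t hi, std::uint32_t lo)
{
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Hypothetical inverse: extracts the upper 32 bits again.
constexpr std::uint32_t high_half(std::uint64_t v)
{
  return static_cast<std::uint32_t>(v >> 32);
}

// Compile-time check that bit 32 really gets set.
static_assert(join_halves(1u, 0u) == 0x100000000ULL, "bit 32 must be set");

// Runtime round-trip check: both halves must survive joining and splitting.
inline void check_round_trip(std::uint32_t hi, std::uint32_t lo)
{
  assert(high_half(join_halves(hi, lo)) == hi);
  assert(static_cast<std::uint32_t>(join_halves(hi, lo)) == lo);
}
```

Using the fixed-width types from <cstdint> keeps the behavior identical on 32-bit ARM and 64-bit x86, which is the point of the change.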

Can you try using the master branch? I added the use of UBSAN which might help. While not ideal I might need access to the system if possible. I'd prefer not but if I can't replicate this it will probably be difficult for me to track down. I'm not seeing any armv5tel QEMU machines though I need to go through the systems more closely.

trapexit avatar Feb 28 '21 04:02 trapexit

This actually was the master branch. Sorry for the mixup. I also tried it again with the 2.32.4 tag. Here I get this output:

root@nsa325v2:~# gdb --args /home/michael/mergerfs/build/mergerfs -f -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2  /home/michael/test/merge
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabi".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/michael/mergerfs/build/mergerfs...done.
(gdb) run
Starting program: /home/michael/mergerfs/build/mergerfs -f -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2 /home/michael/test/merge
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".
[New Thread 0x76af6c40 (LWP 24501)]

Thread 2 "mergerfs" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x76af6c40 (LWP 24501)]
0x00448f0c in FUSE::flush (ffi_=0x76af64f0) at src/fuse_flush.cpp:49
49          return l::flush(fi->fd);
(gdb) backtrace
#0  0x00448f0c in FUSE::flush (ffi_=0x76af64f0) at src/fuse_flush.cpp:49
#1  0x00450a14 in fuse_fs_flush (fs=0x48ace8, fi=0x76af64f0) at lib/fuse.c:1986
#2  0x0045597c in fuse_flush_common (f=0x48ac30, req=0x76100f60, ino=4797533075846201344, fi=0x76af64f0) at lib/fuse.c:4161
#3  0x00455b00 in fuse_lib_flush (req=0x76100f60, ino=4797533075846201344, fi=0x76af64f0) at lib/fuse.c:4211
#4  0x0045aac0 in do_flush (req=0x76100f60, nodeid=4797533075846201344, inarg=0x76b00030) at lib/fuse_lowlevel.c:1595
#5  0x0045dbf8 in fuse_ll_process_buf (data=0x48adc0, buf=0x76af6618, ch=0x48a5a0) at lib/fuse_lowlevel.c:3004
#6  0x0045fe60 in fuse_session_process_buf (se=0x48a810, buf=0x76af6618, ch=0x48a5a0) at lib/fuse_session.c:80
#7  0x00457194 in fuse_do_work (data=0x48a480) at lib/fuse_loop_mt.c:101
#8  0x76d68734 in start_thread () from /lib/arm-linux-gnueabi/libpthread.so.0
#9  0x76ce8878 in ?? () from /lib/arm-linux-gnueabi/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) continue
Continuing.
[Thread 0x76af6c40 (LWP 24501) exited]

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb) quit
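
Both debug traces stop at the line return l::flush(fi->fd); and the UBSan run reports a null pointer of type 'struct FileInfo' there. The sketch below illustrates the general pattern of storing a pointer in a 64-bit file handle field (the fh field of FUSE's fuse_file_info is a 64-bit integer) and checking it before use. The types FileInfo and FuseFileInfo here are hypothetical stand-ins, not the mergerfs definitions, and the thread does not establish why the handle is null, so a guard like this would only turn the crash into an error code rather than fix the cause.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <unistd.h>

struct FileInfo { int fd; };                // hypothetical stand-in
struct FuseFileInfo { std::uint64_t fh; };  // stand-in for the 64-bit handle field

// Stores the pointer in the 64-bit handle, going through uintptr_t so the
// conversion is well defined on both 32-bit and 64-bit targets.
inline void set_handle(FuseFileInfo &ffi, FileInfo *fi)
{
  assert(fi != nullptr);
  ffi.fh = reinterpret_cast<std::uintptr_t>(fi);
}

// Recovers the pointer from the handle.
inline FileInfo *get_handle(const FuseFileInfo &ffi)
{
  return reinterpret_cast<FileInfo *>(static_cast<std::uintptr_t>(ffi.fh));
}

// Hypothetical guarded flush: returns a negative errno value instead of
// dereferencing a null handle. Flushing is modeled as dup() plus close().
inline int guarded_flush(const FuseFileInfo &ffi)
{
  FileInfo *fi = get_handle(ffi);
  if (fi == nullptr)
    return -EBADF;

  int dupfd = ::dup(fi->fd);
  if (dupfd == -1)
    return -errno;

  return (::close(dupfd) == -1) ? -errno : 0;
}
```

With a guard of this kind the application would see an error such as EBADF from close() while the mount stays up, which can make the underlying fault easier to investigate.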

I think I can give you access to my machine. I just have to think about how best to do this. In the meantime, do you have other suggestions for what I could try?

maroessler avatar Feb 28 '21 17:02 maroessler

I don't really know. Those errors imply some alignment issue, but I don't know why it would be doing that unless the compiler is messed up. Maybe try clang? Install clang and clang++ and try building with CC=clang CXX=clang++?
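
To make the UBSan reports concrete: the reported address 0x7662331c is a multiple of 4 but not of 8, while the reports say the structs involved require 8-byte alignment. The following is a generic sketch, not mergerfs or libfuse code, of checking a pointer's alignment and of copying a struct to an arbitrary buffer offset with memcpy instead of writing through a cast pointer. EntryOut, is_aligned, store_entry, and load_entry are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a reply struct containing 64-bit fields.
struct EntryOut
{
  std::uint64_t nodeid;
  std::uint64_t generation;
};

// True if `p` is a multiple of `alignment` (which must be non-zero).
inline bool is_aligned(const void *p, std::size_t alignment)
{
  assert(alignment != 0);
  return (reinterpret_cast<std::uintptr_t>(p) % alignment) == 0;
}

// Writes the struct byte-wise, so `dst` may have any alignment.
inline void store_entry(unsigned char *dst, const EntryOut &e)
{
  std::memcpy(dst, &e, sizeof(e));
}

// Reads the struct byte-wise into a properly aligned local copy.
inline EntryOut load_entry(const unsigned char *src)
{
  EntryOut e;
  std::memcpy(&e, src, sizeof(e));
  return e;
}
```

Accessing members through a pointer that was cast from a misaligned buffer address is undefined behavior in C and C++, and older ARM cores may not handle such accesses the way x86 does, whereas memcpy is valid for any address.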

trapexit avatar Feb 28 '21 19:02 trapexit

When doing that I get:

(gdb) run
Starting program: /home/michael/mergerfs/build/mergerfs -f -o defaults,allow_other,direct_io,use_ino /home/michael/test/disk1:/home/michael/test/disk2 /home/michael/test/merge
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabi/libthread_db.so.1".
[New Thread 0x76af6c40 (LWP 28126)]
fuse: copy from pipe: Bad file descriptor

and nothing after that. ls ~/test/merge hangs with no further output from gdb.

I read in the forum that someone uses mergerfs on the same NAS that I have, so it must work somehow. I will set up a fresh system and try it then. I don't know when I will get around to doing that, but I will update when I do. If that does not work, I could try giving you access to the new system.

Thank you for your help!

maroessler avatar Mar 01 '21 17:03 maroessler

OK. Thanks for testing these things. Sorry I've not been more helpful. It's certainly possible there is something funny with mergerfs' build or code given two different systems with the same chipset had seemingly random crashes. Though as I mentioned the previous person installed a fresh debian setup and everything started working so we stopped investigating.

trapexit avatar Mar 01 '21 17:03 trapexit

Hi, just to let you know. I tested mergerfs on a fresh system. However, it still gave the same error. I'm trying to contact the developer of the rootfs for my device and see if they got it to work or if they have any pointers.

maroessler avatar Mar 13 '21 20:03 maroessler

> Yes. In the end it appeared that the chrooted Debian install was bad somehow. He re-debootstrapped and it worked fine afterwards.

Do you know what he changed to make it work?

I'm facing the same issue on an armv5 NAS (DLink DNS-320).
I debootstrapped Debian on my own; here is my script. Any clue as to what could be wrong?

ggirou avatar Dec 05 '22 21:12 ggirou

@ggirou What exactly is the issue? mergerfs is crashing or something about the general setup? I don't have an armv5 system off hand to test. Maybe there is some qemu based setup I could try but I'd need to look. I did tweak some code that worked fine but gave some warnings on my ODroid XU4 and someone said that resolved crashes for them. A data alignment issue. That's in master branch but not yet released.

trapexit avatar Dec 05 '22 21:12 trapexit

@trapexit Exactly the same issue mentioned in the first post by @maroessler: the first write crashes the mount:

dlink@nas2:~$ ls /mnt/*
/mnt/all:
aaa  d6  lost+found

/mnt/hd1:
lost+found

/mnt/hd2:
aaa  d6  lost+found
dlink@nas2:~$ touch /mnt/all/test
touch: failed to close '/mnt/all/test': Transport endpoint is not connected

More info about my configuration:

dlink@nas2:~$ uname -a
Linux nas2 5.10.0-18-marvell #1 Debian 5.10.140-1 (2022-09-02) armv5tel GNU/Linux
dlink@nas2:~$ mergerfs --version
mergerfs version: 2.31.0
dlink@nas2:~$ cat /etc/fstab 
# ...
/dev/disk/by-path/platform-f1080000.sata-ata-1-part1   /mnt/hd1/  ext4    nodev,nosuid,nofail,auto,defaults,relatime     0 0
/dev/disk/by-path/platform-f1080000.sata-ata-2-part1   /mnt/hd2/  ext4    nodev,nosuid,nofail,auto,defaults,relatime     0 0
  
/mnt/hd1/:/mnt/hd2/  /mnt/all/   fuse.mergerfs   defaults,allow_other,use_ino,cache.files=off,dropcacheonclose=true,category.create=mfs,nonempty     0 0

ggirou avatar Dec 06 '22 08:12 ggirou

You are using a version of mergerfs from years ago. Please test the current release. It's the first thing suggested/requested in the support docs. If the current release still exhibits the issue then try the master branch.

trapexit avatar Dec 06 '22 12:12 trapexit

Same problem with latest version:

dlink@nas2:~$ mergerfs --version
mergerfs version: 2.33.5
dlink@nas2:~$ touch /mnt/all/bbb
touch: failed to close '/mnt/all/bbb': Transport endpoint is not connected

I'll try to build from master branch.

ggirou avatar Dec 06 '22 15:12 ggirou

No more bug with the build from master branch! Thanks :)

dlink@nas2:~/mergerfs$ touch /mnt/all/ccc
dlink@nas2:~/mergerfs$ touch /mnt/all/ddd
dlink@nas2:~/mergerfs$ ls /mnt/*
/mnt/all:
aaa  ccc  d6  ddd  lost+found  test

/mnt/hd1:
lost+found

/mnt/hd2:
aaa  ccc  d6  ddd  lost+found  test

When do you think you'll publish a new release with this fix?

ggirou avatar Dec 06 '22 16:12 ggirou

Soon. I've been preparing a new release... just need to finish some things.

trapexit avatar Dec 06 '22 16:12 trapexit