
Nextcloud on GlusterFS+ZFS (via TrueNAS) = sticky bit and empty files

Open skvarovski opened this issue 2 months ago • 11 comments

Hello everyone. I'm using TrueNAS, where I've deployed ZFS. I then had to containerize GlusterFS to work with the file system. However, after connecting via glusterfs-client, when I try to write files via Nextcloud, I get an error message saying that the file is empty. A closer look (listing the files) shows two entries with the same name: one is valid, and the other is empty and has the sticky bit set. Researching this issue revealed that it doesn't happen on ext4 (no sticky bit appears), but on ZFS a sibling file with zero size and the sticky bit sometimes shows up.

Versions used:

  • TrueNAS 25.04
  • GlusterFS 11.1 on Docker (two servers in distribute mode)
  • Nextcloud 31

This is what the listing on ZFS looks like (ls -la):

-rw-r--r-- 1 www-data www-data 61759 Jun 4 2024 alice2.txt
-rw-r--r-T 1 www-data www-data 0 Oct 3 15:07 alice2.txt

Can anyone tell me how to fix this?

The screenshot shows the file listing on the different servers Nextcloud writes to (both examples use two GlusterFS servers in distribute mode). Image

For example, here is what the file looks like on ZFS:

Image Image

skvarovski avatar Oct 06 '25 09:10 skvarovski

The T files are called linkto files. They are created when the brick where the file is supposed to reside, based on the hash of the file name, doesn't have enough space: an empty linkto file is created there, and the actual file with the data is created on another brick.
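On the brick, a linkto file can be recognized by its zero size, the sticky bit, and a trusted.glusterfs.dht.linkto xattr naming the subvolume that actually holds the data. A minimal check (the brick path and file name here are examples; trusted.* xattrs are only readable as root):

# permissions and size of the brick-side copy
ls -l /data/brick1/gv0/alice2.txt

# the linkto xattr points to the DHT subvolume with the real data
getfattr -n trusted.glusterfs.dht.linkto -e text /data/brick1/gv0/alice2.txt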

pranithk avatar Oct 06 '25 10:10 pranithk

But why isn't this behavior present on ext4? There's plenty of space on the disks. Is it possible to disable this behavior for ZFS?

skvarovski avatar Oct 06 '25 10:10 skvarovski

The T files are called linkto files. They are created when the brick where the file is supposed to reside, based on the hash of the file name, doesn't have enough space: an empty linkto file is created there, and the actual file with the data is created on another brick.

I think I understand what you're saying: the brick itself does contain files with the T attribute. But I'm talking about a different situation. When I mount the volume via the gluster client, I only see these T attributes when the server runs ZFS; when the server runs ext4, this effect isn't observed.

P.S. Note that the screenshots above show only what is visible on the client side.

skvarovski avatar Oct 06 '25 13:10 skvarovski

The T files are called linkto files. They are created when the brick where the file is supposed to reside, based on the hash of the file name, doesn't have enough space: an empty linkto file is created there, and the actual file with the data is created on another brick.

I think I understand what you're saying: the brick itself does contain files with the T attribute. But I'm talking about a different situation. When I mount the volume via the gluster client, I only see these T attributes when the server runs ZFS; when the server runs ext4, this effect isn't observed.

P.S. Note that the screenshots above show only what is visible on the client side.

Sorry, I didn't understand. What do you mean by mounting the disk via the gluster client? Are you mounting the volume? Providing the output of gluster volume info and df -Th on the server nodes would help clarify the setup.
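Something along these lines on each server node would do:

# volume type, brick list and configured options
gluster volume info

# file system type and free space for each brick mount
df -Th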

pranithk avatar Oct 07 '25 05:10 pranithk

Sorry, I didn't understand. What do you mean by mounting the disk via the gluster client? Are you mounting the volume? Providing the output of gluster volume info and df -Th on the server nodes would help clarify the setup.

Sorry for the confusion. I mounted the volume using the command from the documentation: mount.glusterfs host:volume. I'm attaching a screenshot of the two nodes and the client configuration, which shows the information.
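For clarity, the mount step is roughly this (the host name, volume name and mount point here are examples):

# FUSE mount of the gluster volume on the client
mount -t glusterfs gluster-node-1:/gv0 /mnt/gv0
# equivalent helper form
mount.glusterfs gluster-node-1:/gv0 /mnt/gv0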

Image

skvarovski avatar Oct 07 '25 08:10 skvarovski

@skvarovski Could you also provide the zpool information of the bricks on the two nodes?

@rafikc30 This looks similar to the issue you mentioned, where duplicate names show up in the ls -l output: one with the sticky bit and one without.

pranithk avatar Oct 08 '25 05:10 pranithk

@skvarovski Could you please share the stat output of the file from both bricks (node1 and node2)?

Also, were you able to reproduce the issue? It would be helpful if you could provide some insights into the file system operations performed on the file — such as rename, hardlink, or any other relevant operations.

Additionally, please let me know if any cluster management actions like rebalance, add-brick, or remove-brick were performed recently. I’m trying to gather enough context to reproduce the issue.
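Something like this from both node1 and node2 would help; the brick path and file name below are just placeholders, so substitute your actual brick path and one of the affected files (trusted.* xattrs need root):

# on each node, from inside the gluster container
stat /data/brick1/gv0/alice2.txt

# full xattr dump is also useful for DHT debugging
getfattr -d -m . -e hex /data/brick1/gv0/alice2.txt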

rafikc30 avatar Oct 08 '25 06:10 rafikc30

@skvarovski Could you please share the stat output of the file from both bricks (node1 and node2)?

Also, were you able to reproduce the issue? It would be helpful if you could provide some insights into the file system operations performed on the file — such as rename, hardlink, or any other relevant operations.

Additionally, please let me know if any cluster management actions like rebalance, add-brick, or remove-brick were performed recently. I’m trying to gather enough context to reproduce the issue.

Hello Rafikc30, I built a cluster on virtual machines and studied the documentation. I didn't encounter any problems during creation or initial testing – everything worked fine.

Now I've deployed it on two physical servers running TrueNAS 25.04, where I created a zpool using the standard GUI with default parameters. Because TrueNAS doesn't allow apt install, I had to containerize GlusterFS. I first built and tested the Docker version in my home lab, then transferred it to the physical servers. Copying files via the terminal didn't cause any problems or errors until Nextcloud started doing it. I'm using the nextcloud:apache Docker image, which runs under the www-data user.

I configured DNS on the servers (added the entries manually). I specified the required IP address for the network interface in /etc/glusterfs/glusterd.vol (see the sketch after the commands). Then I used the standard commands:

gluster volume create gv0 node1:/data/brick1/gv0 node2:/data/brick1/gv0
gluster volume start gv0
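What I mean by specifying the IP in glusterd.vol is roughly the following; the option name is from my notes and the address is an example, so treat it as an assumption:

# /etc/glusterfs/glusterd.vol (excerpt)
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket
    option transport.socket.bind-address 192.168.1.10
end-volume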

What's unusual about my setup, and what I've noticed:

  • I'm using Docker; copying files as root doesn't produce any errors (no sticky bits).
  • Sometimes, when repeating uploads via Nextcloud, the same files appear with a size of 0 KB. Roughly every third to fifth file in a batch upload ends up failing.
  • When running a script from the console that creates files as the www-data user in a single thread (or even multithreaded), there are no errors.

My ZFS stats:

Image

zpool list -v:

Image

stat of the broken file:

Image

My Docker configuration for TrueNAS:

FROM ubuntu:25.10

# glusterfs-server
RUN apt-get update && \
    apt-get install -y \
    glusterfs-server \
    apt-transport-https \
    software-properties-common \
    iproute2 \
    iputils-ping \
    dnsutils \
    net-tools && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Ports
EXPOSE 24007 24008 49152 49153 49154 49155 49156

#ENTRYPOINT ["glusterd"]
CMD ["glusterd", "-N", "-p", "/var/run/glusterd.pid", "--log-level", "INFO", "--log-file=/dev/stdout"]
# docker-compose.yml: node of the GlusterFS cluster
---
services:
  gluster-node-1:
    build:
      context: .
      dockerfile: Dockerfile
    image: gluster-server-11-dev:latest
    container_name: gluster-1
    hostname: gluster-node-1
    privileged: true
    stdin_open: true
    tty: true
    network_mode: host
    volumes:
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
      - ./log:/var/log/glusterfs
      - ./brick1:/data/brick1:rw
      - ./gluster_etc:/etc/glusterfs
      - ./gluster_lib:/var/lib/glusterd

    cap_add:
      - SYS_ADMIN

P.S. When I use the multi-threaded file-creation script below, even under the www-data user, there are no files anywhere (not even on the nodes) that have the T attribute.

#!/bin/bash

NUM_FILES=200   # informational only; the loops below hard-code their own ranges
THREADS=5
TARGET_DIR="/mnt/gv0"

echo "Prepare..."
sudo mkdir -p "$TARGET_DIR"
sudo chown www-data:www-data "$TARGET_DIR"

# Run five background loops as www-data, each writing files of random size
sudo -u www-data bash -c "
cd '$TARGET_DIR'
echo 'thread 1'
for i in {1..400}; do
    size=\$(( RANDOM % 950 + 50 ))K  #  50K - 1000K
    dd if=/dev/zero of=thread1_file\$i.dat bs=\$size count=1 status=none
done &

echo 'thread 2'
for i in {401..800}; do
    size=\$(( RANDOM % 950 + 50 ))K  #  50K - 1000K
    dd if=/dev/zero of=thread2_file\$i.dat bs=\$size count=1 status=none
done &

echo 'thread 3'
for i in {801..1200}; do
    size=\$(( RANDOM % 1900 + 100 ))K  # 100K - 2000K
    dd if=/dev/zero of=thread3_file\$i.dat bs=\$size count=1 status=none
done &

echo 'thread 4'
for i in {1201..1600}; do
    size=\$(( RANDOM % 2900 + 100 ))K  # 100K - 3000K
    dd if=/dev/zero of=thread4_file\$i.dat bs=\$size count=1 status=none
done &

echo 'thread 5'
for i in {1601..2000}; do
    size=\$(( RANDOM % 4900 + 100 ))K  # 100K - 5000K
    dd if=/dev/zero of=thread5_file\$i.dat bs=\$size count=1 status=none
done &

wait
echo 'done'
echo 'stats:'
ls -la | head -10
echo '...'
echo 'count files:'
ls -1 | wc -l
echo 'total size:'
du -sh '$TARGET_DIR'
"

Image

skvarovski avatar Oct 08 '25 09:10 skvarovski

I think I found the cause of this error. After forcing a rebalance and resetting the volume settings, the T-attribute issue disappeared. Looking at the logs, I can see that some files moved between nodes. Frankly, I'm surprised it was resolved this way, but I'm even more surprised that it only showed up in Nextcloud. I'm continuing to study the logs and the commands that were recorded; maybe I'll find something else.
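The commands I ran were along these lines (from memory, so treat the exact forms as an assumption):

# clear all reconfigured volume options back to defaults
gluster volume reset gv0

# force a rebalance and watch its progress
gluster volume rebalance gv0 start force
gluster volume rebalance gv0 status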

UPD. Perhaps the answer lies in these logs: after adding a node and creating the cluster, I hit an error while creating a directory. On further investigation I removed the node from the peer list and then recreated everything successfully. Did an imbalance occur somewhere along the way? Image

Am I correct in understanding that if a file's hash is calculated incorrectly, the file could end up on the wrong node, leading to a situation where the client sees the T attribute?

skvarovski avatar Oct 10 '25 10:10 skvarovski

@skvarovski I'm glad to hear that the rebalance fixed the issue. But under no circumstances should the empty linkto files appear on the mounted path. We still have to find the root cause.

rafikc30 avatar Oct 14 '25 07:10 rafikc30

@skvarovski I'm glad to hear that the rebalance fixed the issue. But under no circumstances should the empty linkto files appear on the mounted path. We still have to find the root cause.

As far as I understand it, the problem may have started when I tried to build the distribute volume while one of the nodes was unavailable; that's why there's an error in the CLI logs. The volume was then rebuilt with the same name, and apparently that's where the error lies. If I understand GlusterFS correctly, file placement depends on the hash of the file name and on the layout assigned to each node. When I created volume gv0 with the error, a volume entry had already been created in the settings, which may have produced an incorrect layout. This is my guess. So I have a question: if both nodes are working properly, should a rebalance move no files at all? In my case files did move, which means the cluster was already built incorrectly?
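To check this, I'm planning to look at the layout each brick claims and at the rebalance statistics; a sketch based on my reading of the docs (run as root, since trusted.* xattrs are root-only; brick path as in my setup):

# on each node: the DHT layout range this brick holds for the directory
getfattr -n trusted.glusterfs.dht -e hex /data/brick1/gv0

# rebalance status shows how many files were actually moved per node
gluster volume rebalance gv0 status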

skvarovski avatar Oct 14 '25 09:10 skvarovski