
File exists but listxattr() returns error "No data available".

Open lvtao-sec opened this issue 2 years ago • 6 comments

Description of problem: Creating a file with O_DIRECTORY and then setting or listing extended attributes on that file fails.

The exact command to reproduce the issue:

  1. A volume with three servers, each with a single brick.
  2. Volume configuration: gluster volume create test-volume disperse 3 redundancy 1 ...
  3. Mount configuration: mount -t glusterfs $server_ip:/test-volume mount-point
  4. Execute open("./file1", O_RDONLY|O_CREAT|O_TRUNC|O_DIRECTORY, 0330); on one client. A file is created, and its metadata can be obtained successfully through stat.
  5. Wait for a while and execute listxattr("./file1", 0, 0); the call fails with the error string "No data available" (ENODATA).

The full output of the command that failed: listxattr returns -1, and the error string is "No data available" (ENODATA).

Expected results: listxattr, setxattr, and getxattr should succeed.

Mandatory info: - The output of the gluster volume info command:

Volume Name: test-volume
Type: Disperse
Volume ID: 9327b8a4-1a13-48af-a9f3-b7e596c81577
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.0.30:/root/glusterfs-server
Brick2: 192.168.0.31:/root/glusterfs-server
Brick3: 192.168.0.32:/root/glusterfs-server
Options Reconfigured:
storage.fips-mode-rchecksum: on
transport.address-family: inet
nfs.disable: on

- The output of the gluster volume status command:

Status of volume: test-volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.0.30:/root/glusterfs-server   50486     0          Y       381  
Brick 192.168.0.31:/root/glusterfs-server   52565     0          Y       298  
Brick 192.168.0.32:/root/glusterfs-server   57576     0          Y       296  
Self-heal Daemon on localhost               N/A       N/A        N       N/A  
Self-heal Daemon on 192.168.0.31            N/A       N/A        Y       314  
Self-heal Daemon on 192.168.0.32            N/A       N/A        Y       312  
 
Task Status of Volume test-volume
------------------------------------------------------------------------------
There are no active volume tasks

- The output of the gluster volume heal command:

Launching heal operation to perform index self heal on volume test-volume has been unsuccessful:
Self-heal daemon is not running. Check self-heal daemon log file.

- Provide logs present on following locations of client and server nodes - /var/log/glusterfs/

cat /var/log/glusterfs/bricks/root-glusterfs-server.log

[2022-07-02 16:34:03.159446 +0000] E [MSGID: 113069] [posix-entry-ops.c:2439:posix_create] 0-test-volume-posix: open on /root/glusterfs-server/dir1/file1 failed [Not a directory]
[2022-07-02 16:34:03.159642 +0000] I [MSGID: 115071] [server-rpc-fops_v2.c:1569:server4_create_cbk] 0-test-volume-server: CREATE info [{frame=108}, {path=/dir1/file1}, {uuid_utoa=15d733ae-601a-4284-8eaa-39cbeb508a5b}, {bname=file1}, {client=CTX_ID:55e54b6c-0577-405a-a2f2-ac7fe9914c95-GRAPH_ID:0-PID:303-HOST:dfs-fuzzing-PC_NAME:test-volume-client-0-RECON_NO:-0}, {error-xlator=test-volume-posix}, {errno=20}, {error=Not a directory}] 
[2022-07-02 16:43:24.054220 +0000] W [socket.c:751:__socket_rwv] 0-tcp.test-volume-server: readv on 192.168.0.30:49148 failed (No data available)

- Is there any crash? Provide the backtrace and coredump: No crash.

- The operating system / glusterfs version: Git version: 79154ae538f4539ccf69272b2b4736d891262781

lvtao-sec avatar Jul 02 '22 16:07 lvtao-sec

@lvtao-sec as per the logs, the file is not created. Could you post the return value of the open() call once again? Even better would be an strace of the application.

pranithk avatar Jul 04 '22 15:07 pranithk

Hi @pranithk, thanks for your reply. The strace is attached. Although open returns ENOTDIR, a regular file file1 is created due to the O_CREAT flag. Another interesting point is that listxattr succeeds immediately after the open, but fails when tried again.

root@dfs:~/glusterfs-client/dir1# strace ./poc 
execve("./poc", ["./poc"], 0x7ffd57bb4140 /* 25 vars */) = 0
brk(NULL)                               = 0x55eabc087000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffd96942320) = -1 EINVAL (Invalid argument)
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=27999, ...}) = 0
mmap(NULL, 27999, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f278f0ee000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360q\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32, 848) = 32
pread64(3, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0cBR\340\305\370\2609W\242\345)q\235A\1"..., 68, 880) = 68
fstat(3, {st_mode=S_IFREG|0755, st_size=2029224, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f278f0ec000
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
pread64(3, "\4\0\0\0\20\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0", 32, 848) = 32
pread64(3, "\4\0\0\0\24\0\0\0\3\0\0\0GNU\0cBR\340\305\370\2609W\242\345)q\235A\1"..., 68, 880) = 68
mmap(NULL, 2036952, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f278eefa000
mprotect(0x7f278ef1f000, 1847296, PROT_NONE) = 0
mmap(0x7f278ef1f000, 1540096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x25000) = 0x7f278ef1f000
mmap(0x7f278f097000, 303104, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x19d000) = 0x7f278f097000
mmap(0x7f278f0e2000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f278f0e2000
mmap(0x7f278f0e8000, 13528, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f278f0e8000
close(3)                                = 0
arch_prctl(ARCH_SET_FS, 0x7f278f0ed540) = 0
mprotect(0x7f278f0e2000, 12288, PROT_READ) = 0
mprotect(0x55eaba2ed000, 4096, PROT_READ) = 0
mprotect(0x7f278f122000, 4096, PROT_READ) = 0
munmap(0x7f278f0ee000, 27999)           = 0
openat(AT_FDCWD, "./file1", O_RDONLY|O_CREAT|O_TRUNC|O_DIRECTORY, 0330) = -1 ENOTDIR (Not a directory)
exit_group(-1)                          = ?
+++ exited with 255 +++
root@dfs:~/glusterfs-client/dir1# ls
file1  poc  xattr-check
root@dfs:~/glusterfs-client/dir1# ./xattr-check 
listxattr 0
Success
root@dfs:~/glusterfs-client/dir1# ./xattr-check 
listxattr -1
No data available

lvtao-sec avatar Jul 05 '22 08:07 lvtao-sec

The problem here is that O_DIRECTORY cannot be used to create a file. openat() correctly returns ENOTDIR in this case.

From man page:

    ENOTDIR
           A component used as a directory in pathname is not, in fact, a directory, or O_DIRECTORY was specified and
           pathname was not a directory.

The issue here is probably that Gluster doesn't do the proper check and the file is partially created in the posix xlator (I'm just guessing). I'll try to check that when I have some time.

Why do you want to use O_DIRECTORY while creating a file ?

xhernandez avatar Jul 08 '22 07:07 xhernandez

Hi @xhernandez, thanks for your reply. I totally agree with your opinion.

> The problem here is that O_DIRECTORY cannot be used to create a file. openat() correctly returns ENOTDIR in this case. The issue here is that probably Gluster doesn't do the proper check and the file is partially created in posix (I'm just guessing). I'll try to check that when I have some time.

According to the POSIX specification quoted below, the behavior of open(file, O_CREAT|O_DIRECTORY|O_RDONLY) is unspecified. Currently, GlusterFS creates a regular file even though open returns ENOTDIR. Given this implementation of open, GlusterFS should either ensure that subsequent file operations on this file succeed, or refuse to create a regular file at all when opening with these flags. I tried this PoC on ext4, and it behaves as the former case.

> If O_CREAT and O_DIRECTORY are set and the requested access mode is neither O_WRONLY nor O_RDWR, the result is unspecified. (POSIX)

> Why do you want to use O_DIRECTORY while creating a file?

I'm doing file system testing, and the test cases are randomly generated.

lvtao-sec avatar Jul 08 '22 08:07 lvtao-sec

> Hi @xhernandez, thanks for your reply. I totally agree with your opinion.
>
> The problem here is that O_DIRECTORY cannot be used to create a file. openat() correctly returns ENOTDIR in this case. The issue here is that probably Gluster doesn't do the proper check and the file is partially created in posix (I'm just guessing). I'll try to check that when I have some time.
>
> According to the POSIX specification, the behavior of open(file, O_CREAT|O_DIRECTORY|O_RDONLY) is unspecified. Currently, GlusterFS creates a regular file even though open returns ENOTDIR. Given this implementation of open, GlusterFS should either ensure that subsequent file operations on this file succeed, or refuse to create a regular file at all when opening with these flags. I tried this PoC on ext4, and it behaves as the former case.

That's the bug. GlusterFS relies on the characteristics of the underlying filesystem, so it cannot do anything that the filesystem doesn't support. However, there might be a bug in how this failure is handled: something may not be cleaned up correctly, allowing future accesses to a file that actually failed to be fully created.

xhernandez avatar Jul 08 '22 08:07 xhernandez

@xhernandez, yep, looking forward to seeing the patches. Thanks!

lvtao-sec avatar Jul 08 '22 08:07 lvtao-sec

Thank you for your contributions. We noticed that this issue has not had any activity in the last ~6 months. We are marking this issue as stale because it has not had recent activity. It will be closed in 2 weeks if no one responds with a comment here.

stale[bot] avatar Mar 19 '23 22:03 stale[bot]