
s3fs 1.91: can't touch file using NFS

shiwu515 opened this issue on Apr 18, 2022 · 9 comments

Version of s3fs being used: 1.91

Version of fuse being used: 2.9.2

Kernel information: 3.10.0-327.el7.x86_64

GNU/Linux Distribution, if applicable

NAME="CentOS Linux" VERSION="7 (Core)" ID="centos" ID_LIKE="rhel fedora" VERSION_ID="7" PRETTY_NAME="CentOS Linux 7 (Core)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:centos:centos:7" HOME_URL="https://www.centos.org/" BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7" CENTOS_MANTISBT_PROJECT_VERSION="7" REDHAT_SUPPORT_PRODUCT="centos" REDHAT_SUPPORT_PRODUCT_VERSION="7"

s3fs command line used, if applicable

s3fs test /mnt/s3fs -o passwd_file=/etc/.passwd-s3fs -o url=http://... -o use_path_request_style -o noxmlns -o dbglevel=debug -f -o curldbg | tee -a /root/log.txt

s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)

If you execute s3fs with the dbglevel and curldbg options, you can get detailed debug messages:

2022-04-18T01:41:39.697Z [INF] s3fs.cpp:s3fs_create(905): [path=/7.c][mode=100644][flags=0xc1]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:check_parent_object_access(627): [path=/7.c]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:check_object_access(519): [path=/]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:check_object_access(524): [pid=32142,uid=0,gid=0]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:get_object_attribute(350): [path=/]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:check_object_access(519): [path=/7.c]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:check_object_access(524): [pid=32142,uid=0,gid=0]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:get_object_attribute(350): [path=/7.c]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:check_parent_object_access(627): [path=/7.c]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:check_object_access(519): [path=/]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:check_object_access(524): [pid=32142,uid=0,gid=0]
2022-04-18T01:41:39.697Z [DBG] s3fs.cpp:get_object_attribute(350): [path=/]
2022-04-18T01:41:39.697Z [INF] s3fs.cpp:s3fs_create(951): !!!!!test111
2022-04-18T01:41:39.697Z [INF] cache.cpp:AddStat(343): add stat cache entry[path=/7.c]
2022-04-18T01:41:39.697Z [INF] cache.cpp:DelStat(591): delete stat cache entry[path=/7.c]
2022-04-18T01:41:39.697Z [DBG] fdcache.cpp:Open(538): [path=/7.c][size=0][time=-1][flags=0xc1][force_tmpfile=no][create=yes][ignore_modify=no]
2022-04-18T01:41:39.698Z [DBG] fdcache_entity.cpp:Open(411): [path=/7.c][physical_fd=-1][size=0][time=-1][flags=0xc1]
2022-04-18T01:41:39.700Z [INF] s3fs.cpp:s3fs_create(967): !!!!!test333
2022-04-18T01:41:39.700Z [INF] s3fs.cpp:s3fs_getattr(763): [path=/7.c]
2022-04-18T01:41:39.700Z [DBG] s3fs.cpp:check_parent_object_access(627): [path=/7.c]
2022-04-18T01:41:39.700Z [DBG] s3fs.cpp:check_object_access(519): [path=/]
2022-04-18T01:41:39.700Z [DBG] s3fs.cpp:check_object_access(524): [pid=32142,uid=0,gid=0]
2022-04-18T01:41:39.700Z [DBG] s3fs.cpp:get_object_attribute(350): [path=/]
2022-04-18T01:41:39.700Z [DBG] s3fs.cpp:check_object_access(519): [path=/7.c]
2022-04-18T01:41:39.700Z [DBG] s3fs.cpp:check_object_access(524): [pid=32142,uid=0,gid=0]
2022-04-18T01:41:39.700Z [DBG] s3fs.cpp:get_object_attribute(350): [path=/7.c]
2022-04-18T01:41:39.700Z [DBG] cache.cpp:GetStat(266): stat cache hit [path=/7.c][time=240938.657257555][hit count=0]
2022-04-18T01:41:39.700Z [DBG] fdcache.cpp:OpenExistFdEntity(646): [path=/7.c][flags=0x0]
2022-04-18T01:41:39.700Z [DBG] fdcache.cpp:Open(538): [path=/7.c][size=-1][time=-1][flags=0x0][force_tmpfile=no][create=no][ignore_modify=no]
2022-04-18T01:41:39.700Z [DBG] fdcache_entity.cpp:Open(411): [path=/7.c][physical_fd=17][size=-1][time=-1][flags=0x0]
2022-04-18T01:41:39.700Z [DBG] s3fs.cpp:s3fs_getattr(786): [path=/7.c] uid=0, gid=0, mode=100644
2022-04-18T01:41:39.700Z [DBG] fdcache.cpp:Close(721): [ent->file=/7.c][pseudo_fd=3]
2022-04-18T01:41:39.700Z [DBG] fdcache_entity.cpp:Close(207): [path=/7.c][pseudo_fd=3][physical_fd=17]
2022-04-18T01:41:39.700Z [INF] s3fs.cpp:s3fs_release(2485): [path=/7.c][pseudo_fd=2]
2022-04-18T01:41:39.700Z [INF] cache.cpp:DelStat(591): delete stat cache entry[path=/7.c]
2022-04-18T01:41:39.700Z [INF] fdcache.cpp:GetFdEntity(485): [path=/7.c][pseudo_fd=2]
2022-04-18T01:41:39.700Z [DBG] fdcache.cpp:Close(721): [ent->file=/7.c][pseudo_fd=2]
2022-04-18T01:41:39.700Z [DBG] fdcache_entity.cpp:Close(207): [path=/7.c][pseudo_fd=2][physical_fd=17]
2022-04-18T01:41:39.700Z [INF] fdcache.cpp:GetFdEntity(485): [path=/7.c][pseudo_fd=-1]

Details about issue

s3fs mounts the storage on a server, and clients connect to it over NFS (v3 or v4); the clients cannot create file objects. Example:

touch 7.c
touch: setting times of ‘7.c’: No such file or directory

shiwu515 commented Apr 18, 2022

@shiwu515 I checked your s3fs log, and it seems that the /7.c file was successfully created (by the touch 7.c command).

If I understand correctly, your client accesses the file through an NFS mount, and the NFS server shares the s3fs mount point, right? What does this file (7.c) look like on the host running s3fs when this issue occurs? (You can check with the ls command, etc.) Also, if possible, first try running the touch command directly on the host running s3fs, without going through NFS. The reason is that we should first know whether the behavior differs between using NFS and not.

Thanks in advance for your assistance.

ggtakec commented May 17, 2022

@ggtakec Your understanding is right: this file (7.c) does not exist in the s3fs mount on the host when this problem occurs. touch works fine if you run it directly on the host running s3fs.

I used tcpdump to capture the NFS packets and found that the NFS client sent three requests. The first request checked whether the file existed, the second created the file, and the third changed the file's access time. The first two requests succeeded, but after NFS reported that the file was created successfully, the s3fs log showed s3fs_release being called, which released the file (I don't know why) without PUTting the object to storage. As a result, the third request received a "file does not exist" response.
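For illustration, the three requests correspond to the syscall sequence that touch performs on the client. Below is a rough C++ sketch of that sequence (the file name 7.c is taken from this report; this is an illustration, not s3fs code):

// Rough sketch of the client-side sequence behind `touch 7.c`. Over NFS
// these steps become the three RPCs observed with tcpdump:
//   step 1: existence check -> the lookup request
//   step 2: file creation   -> the create request
//   step 3: time update     -> the set-times request that failed here
#include <fcntl.h>      // open, O_CREAT, AT_FDCWD
#include <sys/stat.h>   // stat, utimensat
#include <unistd.h>     // close
#include <cstdio>       // perror

int main() {
    struct stat st;
    if (stat("7.c", &st) != 0) {                        // step 1
        int fd = open("7.c", O_CREAT | O_WRONLY, 0644); // step 2
        if (fd < 0) { perror("open"); return 1; }
        close(fd);
    }
    if (utimensat(AT_FDCWD, "7.c", nullptr, 0) != 0) {  // step 3
        perror("touch: setting times of '7.c'");        // the error seen above
        return 1;
    }
    return 0;
}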

shiwu515 commented May 24, 2022

@shiwu515 Sorry for the late reply. I was able to reproduce this problem. As you pointed out, s3fs_flush is never called, but s3fs_release is called and the file is not uploaded. This only happens when the mount is accessed through NFS.

I will try to find out more about this.

ggtakec commented Jun 2, 2022

@ggtakec Thank you very much for your help.

shiwu515 commented Jun 6, 2022

@shiwu515 I have identified the cause of this and fixed it (see #1957).

When NFS receives a touch command, it tries to create a regular file using mknod. For the mknod system call, FUSE invokes the same hook (create) as a normal file open to create a regular file, and then closes the file handle. In this case flush is never called for the file, because the operation was triggered by the mknod call.

The current s3fs defers creating new objects until the file is flushed. As a result, the object was never created and the touch command failed.
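To make the failing path concrete, here is a minimal C++ sketch that creates a regular file via mknod(2), the way the NFS server does, rather than via open(O_CREAT) as a local touch would. mknod returns no file descriptor, so the open/write/flush/close lifecycle that would normally trigger the deferred upload never happens. (The mount path /mnt/s3fs is hypothetical; this is an illustration, not s3fs code.)

// Create a regular file the way NFS does (mknod) instead of the way a
// local `touch` does (open with O_CREAT).
#include <sys/types.h>
#include <sys/stat.h>   // mknod, S_IFREG
#include <cstdio>       // perror, printf

int main() {
    const char* path = "/mnt/s3fs/7.c";   // hypothetical s3fs mount path
    if (mknod(path, S_IFREG | 0644, 0) != 0) {
        perror("mknod");
        return 1;
    }
    // With the bug described above, the PUT was deferred to flush, so at
    // this point no object had actually been created in the bucket.
    printf("created %s via mknod\n", path);
    return 0;
}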

If you can, please test the pre-merge code for #1957 (https://github.com/ggtakec/s3fs-fuse/tree/fix_release_without_flush).

ggtakec commented Jun 10, 2022

kahing/goofys#175 has some discussion about NFS and mknod. Is this sufficient to support NFS or will s3fs need more changes?

gaul commented Jun 19, 2022

With the fix for the mknod issue (#1957), s3fs can be read and written from NFS. However, additional modifications are still needed.

If a user creates or deletes a file directly through the s3fs process, the NFS client does not notice the change (#1961).

The fix is simply to update the mtime/ctime of the parent directory whenever a file is created or deleted in it. (This matches a regular local disk, where the parent directory's timestamp is updated when an inode is created or removed.) When the NFS client checks a file or directory, it consults the parent directory's timestamp, so this fix will make change detection work properly.
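As a rough illustration of that mechanism (not s3fs code; the mount path is hypothetical), an NFS client's revalidation of a cached directory boils down to a comparison like this:

// Detect directory changes the way an NFS client effectively does:
// by watching the parent directory's modification time.
#include <sys/stat.h>
#include <cstdio>

// Returns true if the directory's mtime differs from the last one seen.
bool dir_changed(const char* dir, timespec& last) {
    struct stat st;
    if (stat(dir, &st) != 0) return false;
    bool changed = (st.st_mtim.tv_sec != last.tv_sec ||
                    st.st_mtim.tv_nsec != last.tv_nsec);
    last = st.st_mtim;   // remember for the next check
    return changed;
}

int main() {
    timespec last = {};
    dir_changed("/mnt/s3fs", last);   // prime with the current mtime
    // ... a file is created or deleted behind the client's back ...
    if (dir_changed("/mnt/s3fs", last))
        printf("parent mtime changed: re-read the directory\n");
    // If s3fs never bumps the parent's mtime, the check above stays
    // false and the client keeps serving its stale cached listing.
    return 0;
}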

I'm fixing the code now and testing the case of accessing the s3fs mount from NFS, and it almost works (I will create a PR soon). However, in addition to updating mtime/ctime in the parent directory, another modification is required, and I'm working on that as well. (Please wait a little.)

ggtakec commented Jun 19, 2022

Thank you @ggtakec. Let me know if I can help by running some tests with an NFS client mounting the s3fs export.

giulianoc commented Jun 20, 2022

@giulianoc I am updating the s3fs code to address this NFS issue, and it will probably be submitted in two separate PRs. I have submitted one of them (first PR: #1964). Once it's merged, I'll submit the next PR. It would be great if you could test once both PRs are submitted. Thanks in advance for your help.

ggtakec commented Jun 20, 2022

Added the update_parent_dir_stat option, which updates the parent directory's stat information when file operations are performed. With this, you should be able to catch file and directory updates when s3fs is used with NFS. See #2016 for details.

This issue will be closed, but if you still have problems please reopen it or post a new issue.

ggtakec commented Feb 12, 2023

Hi @ggtakec,
Sorry to use this closed issue, but I think more work is needed to support NFS.

It seems that we do not take care of file inode numbers (ino) at the moment.

If we want to support NFSD, I think we need to add a "noforget" flag; otherwise we may run into the "stale file handle" problem. Because NFS uses file handles to operate on files, we cannot guarantee that a file's ino stays unchanged after a FUSE forget.

Or should we use the FUSE low-level API and calculate the ino from the file path?
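To make the concern concrete, this small sketch (path hypothetical) prints the inode number that NFS embeds in its file handles; if a later run, after FUSE has forgotten the inode, prints a different number, existing client handles become stale:

// Print the inode number for a path on the s3fs mount. NFS encodes this
// number in the file handles it hands to clients, so it must stay stable
// across FUSE forget/lookup cycles.
#include <sys/stat.h>
#include <cstdio>

int main() {
    const char* path = "/mnt/s3fs/7.c";   // hypothetical path
    struct stat st;
    if (stat(path, &st) != 0) {
        perror("stat");
        return 1;
    }
    printf("%s: ino=%llu\n", path, (unsigned long long)st.st_ino);
    return 0;
}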

VVoidV commented Mar 27, 2023

@VVoidV Thanks for your kindness.

The inode that s3fs currently returns for FUSE calls is the inode number of either the cache file or a temporary file. If the cache file exists and is kept, the inode can be maintained, but this is not guaranteed. To support an option like noforget, we would need either a separate table or some other mechanism (and it would affect memory usage). It may be difficult to address this issue immediately.
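For example, the separate table mentioned above could be as simple as this sketch (purely illustrative, not s3fs code): a process-lifetime map from path to a synthetic inode number. Note that the map only grows, which is the memory cost referred to above.

// Hand out a stable, synthetic inode number per path for the lifetime
// of the process, independent of any backing cache or temporary file.
#include <cstdint>
#include <mutex>
#include <string>
#include <unordered_map>

class InodeTable {
    std::unordered_map<std::string, uint64_t> table_;
    uint64_t next_ = 2;   // 1 is conventionally the root inode
    std::mutex mtx_;
public:
    uint64_t Get(const std::string& path) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto it = table_.find(path);
        if (it != table_.end()) return it->second;
        return table_[path] = next_++;   // never freed: the memory cost
    }
};

int main() {
    InodeTable inodes;
    // The same path always yields the same inode number.
    return inodes.Get("/7.c") == inodes.Get("/7.c") ? 0 : 1;
}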

This matter will be continued as a separate issue, #2145.

ggtakec commented Apr 3, 2023