mpifileutils
Problems with dcp/dsync with ACLs
Using version 0.10.1
Simple test case:
b-cn0123 [ake]$ ls -lR dcp-test
dcp-test:
total 4
dr-xr-x---+ 3 ake folk 4096 Oct 2 21:02 q/
dcp-test/q:
total 4
dr-xr-x---+ 2 ake folk 4096 Oct 2 20:30 r/
dcp-test/q/r:
total 0
-r--r-x---+ 1 ake folk 0 Oct 2 20:30 file*
b-cn0123 [ake]$ getfacl -R dcp-test/
# file: dcp-test/
# owner: ake
# group: folk
user::r-x
user:yyy:r-x
group::r-x
mask::r-x
other::---
# file: dcp-test//q
# owner: ake
# group: folk
user::r-x
user:yyy:r-x
group::---
mask::r-x
other::---
# file: dcp-test//q/r
# owner: ake
# group: folk
user::r-x
user:yyy:r-x
group::---
mask::r-x
other::---
# file: dcp-test//q/r/file
# owner: ake
# group: folk
user::r--
user:yyy:r-x
group::---
mask::r-x
other::---
And this is the result:
b-cn0123 [stor10]$ mpirun -n 1 dcp --preserve /pfs/nobackup/home/a/ake/dcp-test /pfs/stor10/proj-test
[2020-10-02T21:13:03] Preserving file attributes.
[2020-10-02T21:13:03] Walking /pfs/nobackup/home/a/ake/dcp-test
[2020-10-02T21:13:03] Walked 4 items in 0.008278 secs (483.180546 items/sec) ...
[2020-10-02T21:13:03] Walked 4 items in 0.008357 seconds (478.636938 items/sec)
[2020-10-02T21:13:03] Copying to /pfs/stor10/proj-test
[2020-10-02T21:13:03] Items: 4
[2020-10-02T21:13:03] Directories: 3
[2020-10-02T21:13:03] Files: 1
[2020-10-02T21:13:03] Links: 0
[2020-10-02T21:13:03] Data: 0.000 B (0.000 B per file)
[2020-10-02T21:13:03] Creating directories.
[2020-10-02T21:13:03] level=6 min=1 max=1 sum=1 rate=118.951962/sec secs=0.008407
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:866] ERROR: Create `/pfs/stor10/proj-test/dcp-test/q' mkdir() failed (errno=13 Permission denied)
[2020-10-02T21:13:03] level=7 min=1 max=1 sum=1 rate=1242.445929/sec secs=0.000805
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:866] ERROR: Create `/pfs/stor10/proj-test/dcp-test/q/r' mkdir() failed (errno=2 No such file or directory)
[2020-10-02T21:13:03] level=8 min=1 max=1 sum=1 rate=26178.695777/sec secs=0.000038
[2020-10-02T21:13:03] level=9 min=0 max=0 sum=0 rate=0.000000/sec secs=0.000000
[2020-10-02T21:13:03] Created 3 directories in 0.009315 seconds (322.064200 items/sec)
[2020-10-02T21:13:03] Creating files.
[2020-10-02T21:13:03] level=6 min=0 max=0 sum=0 rate=0.000000 secs=0.000002
[2020-10-02T21:13:03] level=7 min=0 max=0 sum=0 rate=0.000000 secs=0.000000
[2020-10-02T21:13:03] level=8 min=0 max=0 sum=0 rate=0.000000 secs=0.000000
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:1113] ERROR: File `/pfs/stor10/proj-test/dcp-test/q/r/file' mknod() failed (errno=2 No such file or directory)
[2020-10-02T21:13:03] level=9 min=1 max=1 sum=1 rate=21728.266302 secs=0.000046
[2020-10-02T21:13:03] Created 1 items in 0.000194 seconds (5160.331500 items/sec)
[2020-10-02T21:13:03] Copying data.
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:1926] ERROR: Failed to open output file `/pfs/stor10/proj-test/dcp-test/q/r/file' (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:2059] ERROR: Failed to copy `/pfs/nobackup/home/a/ake/dcp-test/q/r/file' to `/pfs/stor10/proj-test/dcp-test/q/r/file'
[2020-10-02T21:13:03] Copy data: 0.000 B (0 bytes)
[2020-10-02T21:13:03] Copy rate: 0.000 B/s (0 bytes in 0.002185 seconds)
[2020-10-02T21:13:03] Syncing data to disk.
[2020-10-02T21:13:03] Sync completed in 0.000652 seconds.
[2020-10-02T21:13:03] Setting ownership, permissions, and timestamps.
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:328] ERROR: Failed to change ownership on `/pfs/stor10/proj-test/dcp-test/q/r/file' lchown() (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:354] ERROR: Failed to change permissions on `/pfs/stor10/proj-test/dcp-test/q/r/file' chmod() (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:491] ERROR: Failed to change timestamps on `/pfs/stor10/proj-test/dcp-test/q/r/file' utime() (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:328] ERROR: Failed to change ownership on `/pfs/stor10/proj-test/dcp-test/q/r' lchown() (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:354] ERROR: Failed to change permissions on `/pfs/stor10/proj-test/dcp-test/q/r' chmod() (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:491] ERROR: Failed to change timestamps on `/pfs/stor10/proj-test/dcp-test/q/r' utime() (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:328] ERROR: Failed to change ownership on `/pfs/stor10/proj-test/dcp-test/q' lchown() (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:354] ERROR: Failed to change permissions on `/pfs/stor10/proj-test/dcp-test/q' chmod() (errno=2 No such file or directory)
[2020-10-02T21:13:03] [0] [/scratch/eb-buildpath/mpifileutils/0.10.1/gompi-2020a/mpifileutils-0.10.1/src/common/mfu_flist_copy.c:491] ERROR: Failed to change timestamps on `/pfs/stor10/proj-test/dcp-test/q' utime() (errno=2 No such file or directory)
[2020-10-02T21:13:03] Updated 4 items in 0.008529 seconds (468.992447 items/sec)
[2020-10-02T21:13:03] Syncing directory updates to disk.
[2020-10-02T21:13:03] Sync completed in 0.000095 seconds.
[2020-10-02T21:13:03] Started: Oct-02-2020,21:13:03
[2020-10-02T21:13:03] Completed: Oct-02-2020,21:13:03
[2020-10-02T21:13:03] Seconds: 0.021
[2020-10-02T21:13:03] Items: 1
[2020-10-02T21:13:03] Directories: 1
[2020-10-02T21:13:03] Files: 0
[2020-10-02T21:13:03] Links: 0
[2020-10-02T21:13:03] Data: 0.000 B (0 bytes)
[2020-10-02T21:13:03] Rate: 0.000 B/s (000 bytes in 0.021 seconds)
The same test with a directory without ACLs works perfectly.
The problem seems to be in mfu_copy_xattrs when called from mfu_create_directory. The xattrs contain system.posix_acl_default and system.posix_acl_access. When chmod -w has been done on the source directory, the system.posix_acl_access xattr contains user::r-x, which leaves the target directory write-protected before its children are created.
I suggest filtering out any system.posix_acl xattrs in mfu_copy_xattrs, like in #401.
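A minimal sketch of the suggested filtering, assuming the names arrive in the layout that listxattr() returns (NUL-terminated names stored back to back). The function names is_posix_acl_xattr and drop_posix_acl_names are hypothetical and are not part of mpifileutils. Note that this drops the ACL xattrs rather than deferring them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 for system.posix_acl_access and
 * system.posix_acl_default, 0 for every other name. */
int is_posix_acl_xattr(const char *name)
{
    static const char prefix[] = "system.posix_acl_";
    assert(name != NULL);
    return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}

/* Hypothetical helper: copies every name from a listxattr()-style buffer
 * into out, skipping the POSIX ACL names. The input must end with a NUL
 * byte, as listxattr() output does.
 * Returns the number of bytes written, or (size_t)-1 if out is too small. */
size_t drop_posix_acl_names(const char *list, size_t list_len,
                            char *out, size_t out_cap)
{
    size_t used = 0;
    size_t pos = 0;

    while (pos < list_len) {
        const char *name = list + pos;
        size_t len = strlen(name) + 1; /* include the terminating NUL */

        if (!is_posix_acl_xattr(name)) {
            if (used + len > out_cap) {
                return (size_t)-1;
            }
            memcpy(out + used, name, len);
            used += len;
        }
        pos += len;
    }
    return used;
}
```

A caller would run the name list through drop_posix_acl_names before looping over the names with getxattr/setxattr, so the destination directory keeps the write permission it was created with.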
I think part of the issue here is that mfu_copy_xattrs is called immediately after creating each directory/file, to accommodate Lustre striping params:
https://github.com/daltonbohning/mpifileutils/blob/4ec784108d066abe5060ebb197c1dba83e88bd7d/src/common/mfu_flist_copy.c#L933-L942
In contrast, ownership, permissions, and timestamps are copied after copying the files, and starting from the deepest level, to handle similar issues with standard permissions.
I wonder if there is a solution that will allow the system.posix_acl* xattrs to still be copied, while also circumventing this permission issue during directory creation.
Well, if they can be edited so that you have u+w, then it would be OK to copy them at that point.
Ideally, a full solution might look something like:
// copy lustre xattrs before copying files
setxattr(dst_dir) for each "lustre.*" xattr
// copy the files
// copy all other xattrs
setxattr(dst_dir) for all other non-lustre xattr
Or, we could filter out system.* attrs in the first pass, and then the second pass would add those in.
Either way, this would require two passes over the xattrs and some filtering, which would affect performance. Perhaps the performance hit could be mitigated by using flags to determine whether one or both passes are necessary.
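The two-pass idea comes down to deciding, per xattr name, which pass should copy it. A rough sketch of that decision, assuming only lustre.* names need to be set before file data is written. The enum and function names are hypothetical and do not exist in mpifileutils.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical pass identifiers for a two-pass xattr copy. */
enum xattr_pass {
    XATTR_PASS_BEFORE_DATA, /* right after the file or directory is created */
    XATTR_PASS_AFTER_DATA   /* after file data is copied, with the other metadata */
};

/* Picks the pass for one xattr name. Assumption: only lustre.* names must
 * be set early; everything else, including system.*, can wait. */
enum xattr_pass xattr_pass_for(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "lustre.", 7) == 0) {
        return XATTR_PASS_BEFORE_DATA;
    }
    return XATTR_PASS_AFTER_DATA;
}

/* Returns 1 if name should be copied during the given pass, 0 otherwise. */
int xattr_copy_in_pass(const char *name, enum xattr_pass pass)
{
    return xattr_pass_for(name) == pass;
}
```

Each pass would walk the source xattr list and call setxattr only for names where xattr_copy_in_pass returns 1. A per-item flag recording whether any late-pass name was seen could let the second pass skip items that have nothing left to copy.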
@adammoody Do you have any thoughts on this? DAOS also utilizes the system.posix_acl* attributes, so just not copying those at all would not be desirable.
@daltonbohning typically, system xattrs should not be copied directly, since they represent internal filesystem state. ACLs and such should be copied via the appropriate APIs, see GNU tar, for example.
@adilger On my system at least (CentOS Linux release 7.8.2003), cp --preserve=xattr src dst does seem to copy xattrs set with setfacl.
Running strace cp --preserve src dst:
...
fgetxattr(3, "system.posix_acl_access", ...)
fsetxattr(4, "system.posix_acl_access", ...)
...
So cp does explicitly copy over the ACLs through xattr access.
I think the underlying issue with dcp isn't that the system.* xattrs are being copied incorrectly (i.e. with an incorrect API), but that they are being copied at the wrong time. I think it could be possible to implement a sort of "two-pass" xattr copy, which would be compatible with both Lustre (needs lustre.* xattrs before copying files) and other setups such as yours (need system.* xattrs after copying files).
Also, something that isn't yet supported is --preserve=[ATTR_LIST], which would allow specifically copying only some things (mode, ownership, timestamps, xattrs).
Yes, I get the impression that full xattr support needs to be generalized in a couple ways. It seems like some xattrs must be copied before and some settings should be copied after. We've also talked about ways to allow a user to filter out some xattr settings completely.
Related issues: https://github.com/hpc/mpifileutils/issues/324 https://github.com/hpc/mpifileutils/issues/49
Ah, yes! That would be a more complete solution.
I think there are a couple of options in this case:
- add --xattr-include=<regexp> and --xattr-exclude=<regexp> options (allow multiple) to filter xattrs
- use /etc/xattr.conf to filter xattrs, which is what libattr.so checks
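For the first option, a sketch of how one include pattern and one exclude pattern could be applied to an xattr name with POSIX regcomp/regexec. The options themselves are only proposed here, so the semantics are assumptions: a NULL include means "include everything", exclusion wins over inclusion, and a pattern that fails to compile causes the name to be skipped. The function names are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if name matches the POSIX extended regex, 0 if it does not,
 * and -1 if the pattern fails to compile. */
int xattr_name_matches(const char *pattern, const char *name)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && name != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return -1;
    }
    rc = regexec(&re, name, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}

/* Returns 1 if the xattr should be copied, 0 otherwise.
 * include == NULL: every name is included.
 * exclude == NULL: nothing is excluded.
 * Assumption: exclusion wins, and an invalid pattern means "do not copy". */
int xattr_selected(const char *name, const char *include, const char *exclude)
{
    if (include != NULL && xattr_name_matches(include, name) != 1) {
        return 0;
    }
    if (exclude != NULL && xattr_name_matches(exclude, name) != 0) {
        return 0;
    }
    return 1;
}
```

Supporting multiple patterns, as the option list suggests, would mean looping over arrays of patterns and compiling each one once up front instead of on every call.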
I can't find any documentation for /etc/xattr.conf in a man page, but the Git repo has an example file like:
# /etc/xattr.conf
#
# Format:
# <pattern> <action>
#
# Actions:
# permissions - copy when trying to preserve permissions.
# skip - do not copy.
system.nfs4_acl permissions
system.nfs4acl permissions
system.posix_acl_access permissions
system.posix_acl_default permissions
trusted.SGI_ACL_DEFAULT skip # xfs specific
trusted.SGI_ACL_FILE skip # xfs specific
trusted.SGI_CAP_FILE skip # xfs specific
trusted.SGI_DMI_* skip # xfs specific
trusted.SGI_MAC_FILE skip # xfs specific
xfsroot.* skip # xfs specific; obsolete
user.Beagle.* skip # ignore Beagle index data
security.evm skip # may only be written by kernel
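To illustrate how such a rule table could be consulted, here is a sketch that hard-codes the example entries above and looks up the action for one xattr name. It assumes glob-style matching, as the * patterns suggest, and uses POSIX fnmatch for it. The type and function names are hypothetical, this is not libattr's implementation, and what a caller does when no rule matches is left open.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical action codes mirroring the two actions in the example file. */
enum xattr_action {
    XATTR_ACTION_NONE,        /* no rule matched */
    XATTR_ACTION_PERMISSIONS, /* copy when preserving permissions */
    XATTR_ACTION_SKIP         /* do not copy */
};

struct xattr_rule {
    const char *pattern;
    enum xattr_action action;
};

/* Entries taken from the example xattr.conf shown above. */
static const struct xattr_rule xattr_rules[] = {
    { "system.nfs4_acl",          XATTR_ACTION_PERMISSIONS },
    { "system.nfs4acl",           XATTR_ACTION_PERMISSIONS },
    { "system.posix_acl_access",  XATTR_ACTION_PERMISSIONS },
    { "system.posix_acl_default", XATTR_ACTION_PERMISSIONS },
    { "trusted.SGI_ACL_DEFAULT",  XATTR_ACTION_SKIP },
    { "trusted.SGI_ACL_FILE",     XATTR_ACTION_SKIP },
    { "trusted.SGI_CAP_FILE",     XATTR_ACTION_SKIP },
    { "trusted.SGI_DMI_*",        XATTR_ACTION_SKIP },
    { "trusted.SGI_MAC_FILE",     XATTR_ACTION_SKIP },
    { "xfsroot.*",                XATTR_ACTION_SKIP },
    { "user.Beagle.*",            XATTR_ACTION_SKIP },
    { "security.evm",             XATTR_ACTION_SKIP },
};

/* Returns the action of the first rule whose pattern matches name. */
enum xattr_action xattr_action_for(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(xattr_rules) / sizeof(xattr_rules[0]); i++) {
        if (fnmatch(xattr_rules[i].pattern, name, 0) == 0) {
            return xattr_rules[i].action;
        }
    }
    return XATTR_ACTION_NONE;
}
```

With a table like this, the ACL xattrs fall into the "permissions" group, so a copy tool could defer them to the step that sets ownership, permissions, and timestamps instead of applying them at directory creation.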