HDDS-10881. Recursive delete on special volume /tmp fails
What changes were proposed in this pull request?
Recursive delete on the /tmp special volume, as well as its FSO/LEGACY buckets, fails using the ozone shell delete -r -y command with "java.lang.RuntimeException: Failed to clean bucket."
With this patch, whenever we encounter the /tmp special volume, we delete the individual keys of each bucket by iterating over the key list with the bucket.deleteKeys() API and remove the bucket through the volume.deleteBucket() API, instead of calling the current fs.delete() API on the path.
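The key-by-key cleanup described above can be sketched as follows. This is a minimal in-memory illustration, not the patch itself: `Bucket` and `Volume` are hypothetical stand-ins for the Ozone client objects, `deleteBucketRecursively` and its `batchSize` parameter are invented for the example, and the BUCKET_NOT_EMPTY check is an assumption that only models why keys have to go before the bucket.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** In-memory sketch; the nested classes are hypothetical stand-ins, not Ozone client classes. */
final class KeyByKeyBucketCleanup {

  /** Stand-in for a bucket that only tracks key names. */
  static final class Bucket {
    private final Set<String> keys = new TreeSet<>();

    void putKey(String name) { keys.add(name); }

    /** Returns an iterator over a snapshot, so deleting while iterating is safe. */
    Iterator<String> listKeys() { return new ArrayList<>(keys).iterator(); }

    void deleteKeys(List<String> names) { keys.removeAll(names); }

    boolean isEmpty() { return keys.isEmpty(); }
  }

  /** Stand-in for a volume that refuses to delete a non-empty bucket. */
  static final class Volume {
    private final Map<String, Bucket> buckets = new LinkedHashMap<>();

    Bucket createBucket(String name) {
      Bucket bucket = new Bucket();
      buckets.put(name, bucket);
      return bucket;
    }

    Bucket getBucket(String name) {
      Bucket bucket = buckets.get(name);
      if (bucket == null) {
        throw new IllegalStateException("BUCKET_NOT_FOUND: " + name);
      }
      return bucket;
    }

    boolean hasBucket(String name) { return buckets.containsKey(name); }

    void deleteBucket(String name) {
      if (!getBucket(name).isEmpty()) {
        throw new IllegalStateException("BUCKET_NOT_EMPTY: " + name);
      }
      buckets.remove(name);
    }
  }

  /** Deletes all keys in batches, then removes the now-empty bucket. */
  static void deleteBucketRecursively(Volume volume, String bucketName, int batchSize) {
    Bucket bucket = volume.getBucket(bucketName);
    Iterator<String> keys = bucket.listKeys();
    List<String> batch = new ArrayList<>();
    while (keys.hasNext()) {
      batch.add(keys.next());
      if (batch.size() == batchSize) {
        bucket.deleteKeys(batch);
        batch = new ArrayList<>();
      }
    }
    if (!batch.isEmpty()) {
      bucket.deleteKeys(batch); // flush the last partial batch
    }
    volume.deleteBucket(bucketName);
  }
}
```

Because this path addresses keys by name inside a known bucket, it does not depend on how an ofs path under /tmp is resolved.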
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-10881
How was this patch tested?
Tested on a local docker-compose cluster.
Volume creation using ozone sh:
bash-4.2$ ozone sh volume create tmp
bash-4.2$ ozone sh bucket create tmp/dir1
bash-4.2$ ozone sh bucket info tmp/dir1
{
"metadata" : { },
"volumeName" : "tmp",
"name" : "dir1",
"storageType" : "DISK",
"versioning" : false,
"listCacheSize" : 1000,
"usedBytes" : 0,
"usedNamespace" : 0,
"creationTime" : "2024-06-03T11:58:22.160Z",
"modificationTime" : "2024-06-03T11:58:22.160Z",
"sourcePathExist" : true,
"quotaInBytes" : -1,
"quotaInNamespace" : -1,
"bucketLayout" : "FILE_SYSTEM_OPTIMIZED",
"owner" : "om",
"link" : false
}
Before Changes:
bash-4.2$ ozone sh bucket delete -r -y tmp/dir1
Could not delete bucket dir1.
After Changes:
bash-4.2$ ozone sh bucket delete -r -y tmp/dir1
Bucket dir1 is deleted
bash-4.2$ ozone sh bucket list tmp
[ ]
Before Changes:
bash-4.2$ ozone sh volume delete -r -y tmp
Exception in thread "pool-2-thread-1" java.lang.RuntimeException: Failed to clean bucket
at org.apache.hadoop.ozone.shell.volume.DeleteVolumeHandler$BucketCleaner.run(DeleteVolumeHandler.java:202)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
VOLUME_NOT_EMPTY
After Changes:
bash-4.2$ ozone sh volume delete -r -y tmp
Volume tmp is deleted
bash-4.2$ ozone sh volume list | grep tmp
bash-4.2$
Volume creation using ozone fs:
bash-4.2$ ozone fs -mkdir ofs://omservice/tmp/
bash-4.2$ ozone sh volume info tmp
{
"metadata" : { },
"name" : "tmp",
"admin" : "om",
"owner" : "om",
"quotaInBytes" : -1,
"quotaInNamespace" : -1,
"usedNamespace" : 1,
"creationTime" : "2024-06-03T12:31:24.334Z",
"modificationTime" : "2024-06-03T12:31:24.334Z",
"acls" : [ {
"type" : "USER",
"name" : "om",
"aclScope" : "ACCESS",
"aclList" : [ "ALL" ]
}, {
"type" : "GROUP",
"name" : "om",
"aclScope" : "ACCESS",
"aclList" : [ "ALL" ]
} ],
"refCount" : 0
}
bash-4.2$ ozone sh bucket list tmp
[ {
"metadata" : { },
"volumeName" : "tmp",
"name" : "d58da82289939d8c4ec4f40689c2847e",
"storageType" : "DISK",
"versioning" : false,
"listCacheSize" : 1000,
"usedBytes" : 0,
"usedNamespace" : 0,
"creationTime" : "2024-06-03T12:31:24.416Z",
"modificationTime" : "2024-06-03T12:31:24.416Z",
"sourcePathExist" : true,
"quotaInBytes" : -1,
"quotaInNamespace" : -1,
"bucketLayout" : "FILE_SYSTEM_OPTIMIZED",
"owner" : "om",
"link" : false
} ]
bash-4.2$ ozone fs -touch ofs://omservice/tmp/sample.txt
bash-4.2$ ozone fs -ls ofs://omservice/tmp/
Found 1 items
-rw-rw-rw- 3 om om 0 2024-06-03 12:33 ofs://omservice/tmp/sample.txt
Before Changes:
bash-4.2$ ozone sh volume delete -r -y o3://omservice/tmp/
Exception in thread "pool-2-thread-1" java.lang.RuntimeException: Failed to clean bucket
at org.apache.hadoop.ozone.shell.volume.DeleteVolumeHandler$BucketCleaner.run(DeleteVolumeHandler.java:202)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
VOLUME_NOT_EMPTY
After Changes:
bash-4.2$ ozone sh volume delete -r -y tmp
Volume tmp is deleted
bash-4.2$ ozone fs -ls ofs://omservice/
Found 1 items
drwxrwxrwx - om om 0 2024-06-03 12:30 ofs://omservice/s3v
@sadanand48, @ashishkumar50 could you please review the changes? Thanks.
@tanvipenumudy could you push an empty commit? CI is not triggered, probably because it was converted from a draft.
Thank you @sadanand48 for the review. It appears that the CI has been triggered now, thanks @nandakumar131.
@tanvipenumudy, why are we running into an error when we use fs.delete on an FSO bucket?
This is because, as we know:
- When user:A performs mkdir ofs://<ozone-service-id>/tmp -> this auto-creates a bucket whose name is MD5-hash(user:A)
- When user:B performs mkdir ofs://<ozone-service-id>/tmp -> this auto-creates a bucket whose name is MD5-hash(user:B)
Let us assume the scenario wherein:
- user:A -> mkdir ofs://<ozone-service-id>/tmp -> this creates a bucket MD5-hash(user:A)
- If user:A tries to fs.delete() /tmp -> bucket MD5-hash(user:A) is identified.
- If user:B tries to fs.delete() /tmp -> bucket MD5-hash(user:B) is reported as non-existent, which results in the 'Failed to clean bucket' RuntimeException.
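The scenario above can be illustrated with a small sketch. It assumes the per-user bucket name is the lowercase hex MD5 digest of the username, which matches the 32-character bucket name in the test output but is not verified against the Ozone source; `tmpBucketFor` and `canResolveTmp` are hypothetical helpers introduced only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

/** Sketch of per-user /tmp bucket resolution; helper names are hypothetical. */
final class TmpMountBucketSketch {

  /** Lowercase hex MD5 of the input string. */
  static String md5Hex(String input) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 not available", e);
    }
  }

  /** Assumption: /tmp maps to a bucket named after the hash of the calling user. */
  static String tmpBucketFor(String user) {
    return md5Hex(user);
  }

  /** True only if the caller's own hashed bucket already exists in the volume. */
  static boolean canResolveTmp(Set<String> existingBuckets, String user) {
    return existingBuckets.contains(tmpBucketFor(user));
  }

  public static void main(String[] args) {
    Set<String> buckets = new HashSet<>();
    buckets.add(tmpBucketFor("userA")); // userA ran mkdir on /tmp

    System.out.println(canResolveTmp(buckets, "userA")); // true
    System.out.println(canResolveTmp(buckets, "userB")); // false: userB's bucket was never created
  }
}
```

Under this assumption, a delete issued by user:B looks up a bucket that was never created, which is consistent with the failure reported in the thread.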
Marking the PR as draft for now until we reach a consensus.
cc: @nandakumar131, @sadanand48, @ashishkumar50
It seems that through ofs we can't access buckets inside the tmp volume, so no keys are visible apart from those inside the auto-created hashed bucket for that user. Also, ofs doesn't allow creating a bucket manually inside the tmp volume. However, sh allows bucket creation (FSO/LEGACY/OBS) inside the tmp volume, and those buckets are not accessible through fs commands. tmp is intended to be a special file system volume, but sh allows all operations inside the tmp volume, which leads to discrepancies.
@ashishkumar50 Are you suggesting that we don't allow "ozone sh" to delete buckets inside tmp?
If bucket creation is allowed using sh, then deletion should also be allowed inside tmp. I am okay with the change in this PR to delete keys individually, since some of them can't be accessed using fs commands and hence can't be deleted through them.
The only concern here is that user1 can't delete user2's files inside tmp using the fs command, but the same user1 can delete user2's files through the sh command. If we want to allow tmp volume deletion, could we at least show a warning message here?
I have not gone through the code, but I remember that sticky bit behaviour was implemented in this feature, i.e. the hashed buckets should have ACLs stating that only the owner can write to them. We should validate whether that holds true here; if it does, ozone sh volume delete tmp performed by user1 should fail due to insufficient permissions to delete the hashed bucket created by user2.
I think key ACLs are now persisted with the key ownership feature (HDDS-7791); we should also check whether bucket ACLs are respected.
If user1 runs the command below:
ozone fs -rm -r -skipTrash ofs://om/tmp/
it will delete only user1's files inside user1's hashed bucket.
But if the same user1 runs the command below:
ozone sh volume delete -r -y tmp
it will delete all users' files.
Both commands look the same but go through different interfaces and do different work. The user will be unaware that this will happen.
I have not gone through the code, but I remember that sticky bit behaviour was implemented in this feature, i.e. the hashed buckets should have ACLs stating that only the owner can write to them. We should validate whether that holds true here; if it does, ozone sh volume delete tmp performed by user1 should fail due to insufficient permissions to delete the hashed bucket created by user2.
I think key ACLs are now persisted with the key ownership feature (HDDS-7791); we should also check whether bucket ACLs are respected.
If this is the case, then I think it is fine. It should fail for the other user's hashed bucket.
Both commands look the same but go through different interfaces and do different work. The user will be unaware that this will happen.
Right, I agree with your point. I think we can look into why user1 is allowed to delete user2's bucket if user2's bucket has the sticky bit ACL; ideally that shouldn't be allowed. For this PR, I'm okay if we just log/print a message saying that tmp has buckets from other users which cannot be deleted, so log in as that user and delete them manually.
If user1 runs the command below:
ozone fs -rm -r -skipTrash ofs://om/tmp/
it will delete only user1's files inside user1's hashed bucket.
But if the same user1 runs the command below:
ozone sh volume delete -r -y tmp
it will delete all users' files.
We should not have such inconsistent behaviour; we can throw an error if the user performing the delete doesn't have permission. Here the user is explicitly performing a delete operation on the whole tmp directory (ozone fs -rm -r -skipTrash ofs://om/tmp/), but we are returning success without deleting all the data.
If we allow the same operation via both the Object Store API and the Filesystem API, we should be consistent.
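One way to read the "throw an error" suggestion is an all-or-nothing pre-check before any data is removed. The sketch below is purely illustrative: it models permissions as simple bucket ownership, which is a simplification of Ozone ACLs, and `bucketsNotOwnedBy` and `deleteTmpVolume` are hypothetical names, not existing APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch of a fail-fast volume delete; ownership stands in for real ACL checks. */
final class ConsistentTmpDelete {

  /** Returns the buckets (name -> owner map keys) that the given user does not own. */
  static List<String> bucketsNotOwnedBy(Map<String, String> bucketOwners, String user) {
    List<String> denied = new ArrayList<>();
    for (Map.Entry<String, String> entry : bucketOwners.entrySet()) {
      if (!entry.getValue().equals(user)) {
        denied.add(entry.getKey());
      }
    }
    return denied;
  }

  /** Deletes every bucket only if the user owns all of them; otherwise changes nothing. */
  static void deleteTmpVolume(Map<String, String> bucketOwners, String user) {
    List<String> denied = bucketsNotOwnedBy(bucketOwners, user);
    if (!denied.isEmpty()) {
      throw new SecurityException(
          "PERMISSION_DENIED: " + user + " cannot delete buckets " + denied);
    }
    bucketOwners.clear(); // stands in for the per-bucket cleanup and volume delete
  }
}
```

With a check of this shape, both the shell and the file system entry points could report the same error instead of one silently deleting less than the other.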
/pending consensus
Thank you very much for the patch. I am closing this PR temporarily as there was no activity recently and it is waiting for response from its author.
It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time.
It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs.
If you need ANY help to finish this PR, please contact the community on the mailing list or the slack channel.