HDDS-7290. provide a config to increase the list batch size in OzoneFileSystem
What changes were proposed in this pull request?
Currently, the list page size used when listing items in the output of ozone fs sub-commands is hardcoded to 1024.
A new configuration, "ozone.client.fs.listing.page.size", is introduced to make this value configurable.
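For illustration only, a minimal usage sketch (not part of the patch) of setting the new key programmatically on an OzoneConfiguration before creating the file system; the key string comes from this PR, while the value 2048 and the class name are arbitrary assumptions.

// Hypothetical usage sketch; the config key is from this PR, the value 2048 is arbitrary.
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdds.conf.OzoneConfiguration;

public class ListingPageSizeExample {
  public static void main(String[] args) throws Exception {
    OzoneConfiguration conf = new OzoneConfiguration();
    // Ask the ofs/o3fs client to request up to 2048 entries per listing call
    // instead of the previously hardcoded 1024.
    conf.setInt("ozone.client.fs.listing.page.size", 2048);
    FileSystem fs = FileSystem.get(FileSystem.getDefaultUri(conf), conf);
    for (FileStatus status : fs.listStatus(new Path("/"))) {
      System.out.println(status.getPath());
    }
  }
}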
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7290
How was this patch tested?
Manual testing; the configured value is reflected, as shown in the log line below:
2022-10-18 09:28:49,534 [IPC Server handler 51 on default port 9862] WARN om.OzoneConfigUtil: ozone.client.fs.listing.page.size config value is greater than server config ozone.server.fs.listing.page.size value currently set at : 1024, so limiting the config value to be used at server side to max value supported at server - 1024
@adoroszlai @nandakumar131, please review.
@devmadhuu this might not be in scope for this change, but we need a limit applied server side as well. The client should not be able to send an arbitrarily large count to the server.
Thanks @kerneltime for reviewing the PR; understood the point. I have added the config on the server side as well, and the behavior is as follows:
If the client sends a listing-entry count higher than the server-side config, the server caps it at the server-side value; if the client sends a value lower than the server-side config, the server returns as many entries as the client requested.
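For illustration, a minimal sketch of the capping logic described above; the method and parameter names are assumptions, not the actual OzoneConfigUtil code referenced by the log line quoted earlier.

// Hypothetical sketch of the server-side cap; names are assumptions.
static int resolveListingPageSize(int clientPageSize, int serverMaxPageSize) {
  if (clientPageSize > serverMaxPageSize) {
    // Client requested more entries per page than the server allows: cap it.
    return serverMaxPageSize;
  }
  // Otherwise honour the client's smaller (or equal) value.
  return clientPageSize;
}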
Please re-review.
This looks good, will go over it one more time before approving.
@ayushtkn, thank you for your review and for pointing out the missing case. Here is what happens with the test cases in "TestRootedOzoneFileSystem":
- If I set the client-side value to less than or equal to 1, the tests hang.
- If I set the value between 2 and 3, none of the TestRootedOzoneFileSystem test cases hang, but the "listStatusCheckHelper" method fails assertions in 2-3 test cases.
- If I set the value to more than 3, all test cases pass.
So, let's investigate this in a separate JIRA item, as it is highly unlikely and unrealistic that a client would set a value of 3 or less; it may also be a problem with our test cases rather than the actual code. I have created a new JIRA to track this: https://issues.apache.org/jira/browse/HDDS-7360
Makes sense, we can investigate further in a different Jira.
Just to be clear, it isn't a test-only issue; it is a bug in the production code. listStatusRoot behaves differently than the regular listStatus, which is most probably the reason for the failure of the 2nd case.
Moreover, 3 isn't really a magic number; it comes from the number of volumes. If you create more volumes, the tests will fail with a higher page size as well.
ex:
diff --git a/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java b/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java
index 96ac47f06..1946e60fb 100644
--- a/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java
+++ b/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/fs/ozone/TestRootedOzoneFileSystem.java
@@ -101,6 +101,8 @@
import static org.apache.hadoop.fs.FileSystem.TRASH_PREFIX;
import static org.apache.hadoop.fs.ozone.Constants.LISTING_PAGE_SIZE;
import static org.apache.hadoop.ozone.OzoneAcl.AclScope.ACCESS;
+import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_CLIENT_FS_LISTING_PAGE_SIZE;
+import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_CLIENT_FS_LISTING_PAGE_SIZE_DEFAULT;
import static org.apache.hadoop.ozone.OzoneConfigKeys.OZONE_FS_ITERATE_BATCH_SIZE;
import static org.apache.hadoop.ozone.OzoneConsts.OZONE_URI_DELIMITER;
import static org.apache.hadoop.ozone.om.OMConfigKeys.OZONE_OM_ADDRESS_KEY;
@@ -184,8 +186,6 @@ public static Path getBucketPath() {
return bucketPath;
}
- @Rule
- public Timeout globalTimeout = Timeout.seconds(300);
private static boolean enabledFileSystemPaths;
private static boolean omRatisEnabled;
@@ -217,6 +217,7 @@ public static Path getBucketPath() {
public static void initClusterAndEnv() throws IOException,
InterruptedException, TimeoutException {
conf = new OzoneConfiguration();
+ conf.setInt(OZONE_CLIENT_FS_LISTING_PAGE_SIZE, 50);
conf.setFloat(OMConfigKeys.OZONE_FS_TRASH_INTERVAL_KEY, TRASH_INTERVAL);
conf.setFloat(FS_TRASH_INTERVAL_KEY, TRASH_INTERVAL);
conf.setFloat(FS_TRASH_CHECKPOINT_INTERVAL_KEY, TRASH_INTERVAL / 2);
@@ -1084,6 +1085,9 @@ private void listStatusCheckHelper(Path path) throws IOException {
public void testListStatusRootAndVolumeRecursive() throws IOException {
Path bucketPath1 = createRandomVolumeBucketWithDirs();
Path bucketPath2 = createRandomVolumeBucketWithDirs();
+ for(int i=0;i<55;i++) {
+ createRandomVolumeBucketWithDirs();
+ }
// listStatus("/volume/bucket")
listStatusCheckHelper(bucketPath1);
// listStatus("/volume")
Run this and it will fail with a page size of 50 as well.
Regarding the 1st one, the client hanging: yes, maybe nobody will do that, but when you chase the new Jira it would be better to add a validation that rejects values which look genuine but lead to issues (a rough sketch follows below).
The 2nd issue, I feel, is independent of this PR; the 1st one is exposed by this PR, but it is arguably not a production use case and can be chased separately in the new Jira.
Rest of the changes LGTM. Leaving it for others to have the final look & commit. Thanks everyone!
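One possible shape for the validation suggested above; this is only a sketch under assumed names and an assumed minimum of 2, and the actual fix belongs to HDDS-7360.

// Hypothetical validation sketch for HDDS-7360; the minimum of 2 and the
// method name are assumptions, not the committed fix.
static int validateListingPageSize(int configuredPageSize) {
  final int minPageSize = 2;
  if (configuredPageSize < minPageSize) {
    throw new IllegalArgumentException(
        "ozone.client.fs.listing.page.size must be at least " + minPageSize
            + ", but was " + configuredPageSize);
  }
  return configuredPageSize;
}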
@ayushtkn , as discussed, this has been handled in https://issues.apache.org/jira/browse/HDDS-7360
Please address the logging level change on the server.
@kerneltime, the logging level has been changed to DEBUG now, as discussed. Please re-review and merge the PR.
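For reference, a sketch of what the lowered log level could look like (SLF4J style); the message text is modelled on the WARN line quoted earlier in this thread, and the class and variable names are assumptions.

// Hypothetical sketch; logger, class, and variable names are assumptions.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class ListingPageSizeLogging {
  private static final Logger LOG =
      LoggerFactory.getLogger(ListingPageSizeLogging.class);

  static int capWithDebugLog(int clientPageSize, int serverMaxPageSize) {
    if (clientPageSize > serverMaxPageSize) {
      // Previously logged at WARN (see the log line quoted earlier); now DEBUG.
      LOG.debug("ozone.client.fs.listing.page.size value {} is greater than "
          + "server config ozone.server.fs.listing.page.size value {}; "
          + "limiting to {}", clientPageSize, serverMaxPageSize,
          serverMaxPageSize);
      return serverMaxPageSize;
    }
    return clientPageSize;
  }
}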