ozone
ozone copied to clipboard
HDDS-7121. Support namespace summaries (du, dist & counts) for legacy FS buckets
What changes were proposed in this pull request?
In the previous PR HDDS-5504, we refactored NSSummaryEndpoint and NSSummaryTask so that we can extract the bucket specific code to new separated classes. So far, Recon has only been supporting FSO buckets. All the existing bucket related code exists in handlers/FSOBucketHandler and tasks/NSSummaryTaskWithFSO. In order to add support for Legacy buckets, we created two new classes handlers/LegacyBucketHandler and tasks/NSSummaryTaskWithLegacy.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-7121
How was this patch tested?
This patch was tested with new unit tests. We added two new test classes, TestNSSummaryEndpointWithLegacy and TestNSSummaryTaskWithLegacy. These two classes are as similar as possible to the corresponding FSO classes. We tried to keep the setups exactly the same with TestNSSummaryEndpointWithFSO and TestNSSummaryTaskWithFSO as if they were handling FSO related code.
So there is a problem with clearing the NSSummary table during reprocess().
Both the fso and legacy tasks write to the nssummary table. The reprocess() method for FSO clears the nssummary table here The Legacy task doesn't clear the table.
But both tasks run in separate threads in parallel There is no guarantee that FSO will run first.
In addition failed tasks get rerun here If one fails and the other doesn't the table clear won't work correctly.
Also because the two tasks are writing to the same table from separate threads, it would be good to clarify why no synchronization is needed.
Hi @xBis7 and @GeorgeJahad, thanks for the patch and the problem description. I believe if we already separated the NSSummary handling for legacy and FSO buckets (here) we can maintain two separate NSSummary tables, one for each bucket layout. This would help with the table clear and if one of them fails and there is a retry. Let me know what you think of this idea. cc @smengcl
Hi @dombizita, thanks for the feedback. With two separate tables, what happens when Recon gets statistics for both bucket layouts? Do we iterate both tables and sum the results?
@dombizita @smengcl Another solution to this, would be to have a common NSSummary table for both bucket layouts. Instead of calling both tasks, we could add process and reprocess methods to NSSummaryTask and call that class in place of the subclasses. NSSummaryTask.process will call both NSSummaryTaskWithFSO.process and NSSummaryTaskWithLegacy.process and we will do the same for reprocess. NSSummaryTask will also manage common setups or operations for NSSummary table, such as clearing it. What do you think about this approach?
If we have two separate tables for NSSummarys based on the bucket layout (legacy and FSO) we can iterate through both tables right after each other if there is a task that needs information from both types. I think this approach would be nice to avoid issues as we already separated the classes based on the bucket layout.
But as I looked into it for both bucket types we would maintain a table with the same columns, as we add long value (ID) and a NSSummary in both cases. Based on this it would be better to avoid having two same tables.
Your approach sounds good to me, if I understand correctly the NSSummaryTask.process would call eg. first NSSummaryTaskWithFSO.process and after the NSSummaryTaskWithLegacy.process, right?
@dombizita NSSummary isn't bucket specific. We would have an issue if we were gathering info for volumes but we are dealing with buckets.
NSSummaryTask.process would call eg. first NSSummaryTaskWithFSO.process and after the NSSummaryTaskWithLegacy.process, right?
Yes, you are right. This way we can control the order by which we are calling any bucket specific task. NSSummaryTask.process will return new ImmutablePair<>(getTaskName(), true) after both bucket tasks have finished running and getTaskName() will be NSSummaryTask. Since NSSummary isn't specific to bucket layout and the entries for different layouts won't have an attribute that differentiates them, it seems more clean to have one table instead of two tables with entries of the same type.
@GeorgeJahad I've made the changes based on your recommendations.
@dombizita I've made the changes so that NSSummaryTask gets called in place of NSSummaryTaskWithFSO and NSSummaryTaskWithLegacy, like we discussed. Let me know how it looks. I've also modified the tests to go only over each subclass and not NSSummaryTask. If we decide to move forward with this approach I will cleanup the code and add a new class TestNSSummaryTask that will test the case of having buckets of different layouts.
I just tried running: "ozone admin namespace summary /vol1"
It gets a warning:
[Warning] Namespace CLI is only designed for FSO mode.
Bucket being accessed must be of type FILE_SYSTEM_OPTIMIZED bucket layout.
It should be updated to something like this:
[Warning] Namespace CLI is only designed for FSO buckets or LEGACY buckets with "ozone.om.enable.filesystem.paths" set to true.
LGTM
@dombizita @smengcl Could you take a look at this patch?
@smengcl thanks for the review. I 've addressed all your comments except this one. Do you want to do something about the RunTimeException or do you want to leave it to prolong and crash the thread?
@smengcl I pushed the latest changes for getting FileSystemPaths value from configuration.
@smengcl CI looks good. Can we merge this?
Merged. Thanks @xBis7 for the work! Thanks @GeorgeJahad and @dombizita for the review.