ozone icon indicating copy to clipboard operation
ozone copied to clipboard

HDDS-7121. Support namespace summaries (du, dist & counts) for legacy FS buckets

Open xBis7 opened this issue 3 years ago • 12 comments

What changes were proposed in this pull request?

In the previous PR HDDS-5504, we refactored NSSummaryEndpoint and NSSummaryTask so that we can extract the bucket specific code to new separated classes. So far, Recon has only been supporting FSO buckets. All the existing bucket related code exists in handlers/FSOBucketHandler and tasks/NSSummaryTaskWithFSO. In order to add support for Legacy buckets, we created two new classes handlers/LegacyBucketHandler and tasks/NSSummaryTaskWithLegacy.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7121

How was this patch tested?

This patch was tested with new unit tests. We added two new test classes, TestNSSummaryEndpointWithLegacy and TestNSSummaryTaskWithLegacy. These two classes are as similar as possible to the corresponding FSO classes. We tried to keep the setups exactly the same with TestNSSummaryEndpointWithFSO and TestNSSummaryTaskWithFSO as if they were handling FSO related code.

xBis7 avatar Sep 09 '22 23:09 xBis7

So there is a problem with clearing the NSSummary table during reprocess().

Both the fso and legacy tasks write to the nssummary table. The reprocess() method for FSO clears the nssummary table here The Legacy task doesn't clear the table.

But both tasks run in separate threads in parallel There is no guarantee that FSO will run first.

In addition failed tasks get rerun here If one fails and the other doesn't the table clear won't work correctly.

GeorgeJahad avatar Sep 15 '22 22:09 GeorgeJahad

Also because the two tasks are writing to the same table from separate threads, it would be good to clarify why no synchronization is needed.

GeorgeJahad avatar Sep 15 '22 22:09 GeorgeJahad

Hi @xBis7 and @GeorgeJahad, thanks for the patch and the problem description. I believe if we already separated the NSSummary handling for legacy and FSO buckets (here) we can maintain two separate NSSummary tables, one for each bucket layout. This would help with the table clear and if one of them fails and there is a retry. Let me know what you think of this idea. cc @smengcl

dombizita avatar Sep 20 '22 11:09 dombizita

Hi @dombizita, thanks for the feedback. With two separate tables, what happens when Recon gets statistics for both bucket layouts? Do we iterate both tables and sum the results?

xBis7 avatar Sep 21 '22 15:09 xBis7

@dombizita @smengcl Another solution to this, would be to have a common NSSummary table for both bucket layouts. Instead of calling both tasks, we could add process and reprocess methods to NSSummaryTask and call that class in place of the subclasses. NSSummaryTask.process will call both NSSummaryTaskWithFSO.process and NSSummaryTaskWithLegacy.process and we will do the same for reprocess. NSSummaryTask will also manage common setups or operations for NSSummary table, such as clearing it. What do you think about this approach?

xBis7 avatar Sep 21 '22 17:09 xBis7

If we have two separate tables for NSSummarys based on the bucket layout (legacy and FSO) we can iterate through both tables right after each other if there is a task that needs information from both types. I think this approach would be nice to avoid issues as we already separated the classes based on the bucket layout. But as I looked into it for both bucket types we would maintain a table with the same columns, as we add long value (ID) and a NSSummary in both cases. Based on this it would be better to avoid having two same tables. Your approach sounds good to me, if I understand correctly the NSSummaryTask.process would call eg. first NSSummaryTaskWithFSO.process and after the NSSummaryTaskWithLegacy.process, right?

dombizita avatar Sep 22 '22 18:09 dombizita

@dombizita NSSummary isn't bucket specific. We would have an issue if we were gathering info for volumes but we are dealing with buckets.

NSSummaryTask.process would call eg. first NSSummaryTaskWithFSO.process and after the NSSummaryTaskWithLegacy.process, right?

Yes, you are right. This way we can control the order by which we are calling any bucket specific task. NSSummaryTask.process will return new ImmutablePair<>(getTaskName(), true) after both bucket tasks have finished running and getTaskName() will be NSSummaryTask. Since NSSummary isn't specific to bucket layout and the entries for different layouts won't have an attribute that differentiates them, it seems more clean to have one table instead of two tables with entries of the same type.

xBis7 avatar Sep 23 '22 10:09 xBis7

@GeorgeJahad I've made the changes based on your recommendations.

xBis7 avatar Sep 26 '22 13:09 xBis7

@dombizita I've made the changes so that NSSummaryTask gets called in place of NSSummaryTaskWithFSO and NSSummaryTaskWithLegacy, like we discussed. Let me know how it looks. I've also modified the tests to go only over each subclass and not NSSummaryTask. If we decide to move forward with this approach I will cleanup the code and add a new class TestNSSummaryTask that will test the case of having buckets of different layouts.

xBis7 avatar Sep 26 '22 13:09 xBis7

I just tried running: "ozone admin namespace summary /vol1"

It gets a warning:

[Warning] Namespace CLI is only designed for FSO mode.
Bucket being accessed must be of type FILE_SYSTEM_OPTIMIZED bucket layout.

It should be updated to something like this:

[Warning] Namespace CLI is only designed for FSO buckets or LEGACY buckets with "ozone.om.enable.filesystem.paths" set to true.

GeorgeJahad avatar Oct 04 '22 23:10 GeorgeJahad

LGTM

GeorgeJahad avatar Oct 06 '22 18:10 GeorgeJahad

@dombizita @smengcl Could you take a look at this patch?

xBis7 avatar Oct 11 '22 13:10 xBis7

@smengcl thanks for the review. I 've addressed all your comments except this one. Do you want to do something about the RunTimeException or do you want to leave it to prolong and crash the thread?

xBis7 avatar Oct 19 '22 14:10 xBis7

@smengcl I pushed the latest changes for getting FileSystemPaths value from configuration.

xBis7 avatar Oct 20 '22 17:10 xBis7

@smengcl CI looks good. Can we merge this?

xBis7 avatar Oct 25 '22 17:10 xBis7

Merged. Thanks @xBis7 for the work! Thanks @GeorgeJahad and @dombizita for the review.

smengcl avatar Oct 25 '22 18:10 smengcl