OAK-10748: Improve statistics to collect which type of garbage is sent/deleted

Open Joscorbe opened this issue 1 year ago • 6 comments

Joscorbe avatar Jun 19 '24 07:06 Joscorbe

I will integrate this into tests in a future iteration (as stated in the ticket). These statistics are purely informative, and asserting on them now could introduce flaky tests.

Joscorbe avatar Jun 24 '24 10:06 Joscorbe

actually, there are compilation errors:

[INFO] -------------------------------------------------------------
Error:  COMPILATION ERROR : 
[INFO] -------------------------------------------------------------
Error:  /home/runner/work/jackrabbit-oak/jackrabbit-oak/oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:[367,11] setStatisticsProvider(org.apache.jackrabbit.oak.stats.StatisticsProvider) is not public in org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector; cannot be accessed from outside package
Error:  /home/runner/work/jackrabbit-oak/jackrabbit-oak/oak-run/src/main/java/org/apache/jackrabbit/oak/run/RevisionsCommand.java:[538,11] setStatisticsProvider(org.apache.jackrabbit.oak.stats.StatisticsProvider) is not public in org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector; cannot be accessed from outside package
[INFO] 2 errors 

stefan-egli avatar Jun 24 '24 10:06 stefan-egli

I will double-check this. It sounds odd, since I built it and ran all the integration tests locally.

Joscorbe avatar Jun 24 '24 11:06 Joscorbe

I will squash this PR into a single commit once approved.

Joscorbe avatar Jun 24 '24 11:06 Joscorbe

I understand that this PR is trying to get the fullGC stats broken down by GCPhase, but it collects them in the wrong place.

We are recording deletion stats while collecting potential garbage, but during the actual deletion we might skip some of those documents/properties. The actual number of deletions could therefore differ from what the stats report.

cc @stefan-egli

rishabhdaim avatar Jun 25 '24 12:06 rishabhdaim

Right now we only get the aggregated amount of garbage that gets deleted, but we don't know which phase found it. This PR focuses on the garbage found by each phase, which is why the statistics are collected during the collection phase and not at deletion.
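To make the distinction concrete, here is a minimal sketch of the two ways of counting. It is not the code in this PR: the `Phase` values, `PhaseCounter`, `Candidate` and `run` are all hypothetical names made up for illustration, and the "still garbage" flag only stands in for whatever check causes a candidate to be skipped at deletion time.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class PhaseStatsSketch {

    // Hypothetical phase names, for illustration only (not Oak's actual GCPhase values).
    enum Phase { ORPHANS, DELETED_PROPS, UNMERGED_BRANCH_COMMITS }

    // Keeps one running count per phase.
    static final class PhaseCounter {
        private final Map<Phase, Long> counts = new EnumMap<>(Phase.class);

        void add(Phase phase, long n) {
            counts.merge(phase, n, Long::sum);
        }

        long get(Phase phase) {
            return counts.getOrDefault(phase, 0L);
        }
    }

    // A garbage candidate tagged with the phase that found it.
    // stillGarbage models the re-check at deletion time (an assumption of this sketch).
    static final class Candidate {
        final Phase phase;
        final boolean stillGarbage;

        Candidate(Phase phase, boolean stillGarbage) {
            this.phase = phase;
            this.stillGarbage = stillGarbage;
        }
    }

    // Returns { found, deleted }: "found" is counted at collection time,
    // "deleted" only for candidates that are actually removed.
    static PhaseCounter[] run(List<Candidate> candidates) {
        PhaseCounter found = new PhaseCounter();
        PhaseCounter deleted = new PhaseCounter();
        for (Candidate c : candidates) {
            found.add(c.phase, 1);
            if (c.stillGarbage) {
                deleted.add(c.phase, 1);
            }
        }
        return new PhaseCounter[] { found, deleted };
    }

    public static void main(String[] args) {
        PhaseCounter[] result = run(Arrays.asList(
                new Candidate(Phase.ORPHANS, true),
                new Candidate(Phase.ORPHANS, false),   // skipped at deletion
                new Candidate(Phase.DELETED_PROPS, true)));
        for (Phase p : Phase.values()) {
            System.out.println(p + ": found=" + result[0].get(p)
                    + ", deleted=" + result[1].get(p));
        }
    }
}
```

In this sketch the per-phase "found" counter answers which phase detected the garbage, while the "deleted" counter can be lower whenever a candidate is skipped, which is the discrepancy raised above.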

Joscorbe avatar Jun 25 '24 13:06 Joscorbe

+1, I don't have a full overview of how many metrics this will introduce (i.e. whether that number might be excessive), but I think that's something we'll see and can handle downstream if needed.

I have added a proper description to the PR. Thanks for double-checking; I will merge this at the end of the day.

Joscorbe avatar Aug 13 '24 09:08 Joscorbe