
drop authorizable cache

Open kwin opened this issue 3 years ago • 1 comments

  • takes too much memory
  • looking up authorizable by ID is really fast
  • also cache empty memberships

kwin avatar Sep 27 '22 07:09 kwin

This needs further investigation, as it seems that at least sometimes the lookup of authorizables by ID relies on a query internally. @otarsko Please provide further insights here.

kwin avatar Oct 18 '22 15:10 kwin

I ran a local test on dummy data, with ~20k groups in AEM and ~3k groups affected by the configuration, and got the following results:

  • Authorizables installation:
    • with cache - Finished installation of authorizables without errors in 24.9sec
    • without cache - Finished installation of authorizables without errors in 6.7min
  • Complete installation:
    • with cache - Successfully applied AC Tool configuration in 10.6min
    • without cache - Successfully applied AC Tool configuration in 16.3min
  • biz.netcentric.cq.tools.actool.impl.AcInstallationServiceImpl.installAuthorizables() run time:
    • with cache - 3.9% of execution time
    • without cache - 41.3% of execution time

Top image - with cache, bottom - without (changes from this PR + #649): image

It looks like a query is used to look up the Authorizable, which causes the slowness: image image

otarsko avatar Oct 27 '22 07:10 otarsko

I have 2 ideas (not very original, though):

  • Provide an option to enable/disable the cache, informing the user that disabling it reduces memory consumption but increases execution time.
  • Use a cache based on SoftReferences, so that the GC can collect the entries when memory is really needed.
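
A minimal sketch of the SoftReference idea, for illustration only. `SoftValueCache` and its loader function are hypothetical names and not part of the AC Tool code base; the loader stands in for whatever expensive lookup the cache is meant to avoid.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical cache whose values are held via SoftReference, so the GC
 * may reclaim them under memory pressure. Not taken from the AC Tool.
 */
final class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed expensive lookup

    SoftValueCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if it was never cached or was collected. */
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        V value = (ref != null) ? ref.get() : null;
        if (value == null) {
            value = loader.apply(key);
            if (value != null) {
                entries.put(key, new SoftReference<>(value));
            }
        }
        return value;
    }

    /** Number of keys currently tracked, including keys whose value was already collected. */
    int size() {
        return entries.size();
    }
}
```

Note that in this sketch the keys and the cleared `SoftReference` objects stay in the map after the GC reclaims a value; a more careful version could purge them with a `ReferenceQueue`.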

@kwin wdyt?

otarsko avatar Oct 27 '22 09:10 otarsko

The query being used for authorizable IDs is the following: https://github.com/apache/jackrabbit-oak/blob/5a1a902e7a89fc44cb9e2f59b0c6939efa9c16e4/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/identifier/IdentifierManager.java#L342-L366. Usually that should be fast, but it is obviously not negligible. The most important metric is how big the memory impact of removing the authorizable cache really is. @otarsko Do you have metrics for that as well?

If that impact is considerable, we could just cache the path per authorizable ID, as a lookup by path does not require a search.
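
A rough sketch of what caching only the path per authorizable ID might look like. `AuthorizablePathCache` and its three lookup functions are hypothetical stand-ins; in a real implementation they would presumably be backed by Jackrabbit's `UserManager.getAuthorizable(String)`, `UserManager.getAuthorizableByPath(String)` and `Authorizable.getPath()`, but that wiring is an assumption and not the code in the linked PR.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical cache that stores only the repository path per authorizable ID
 * (a small string) instead of the authorizable object itself.
 */
final class AuthorizablePathCache<A> {
    private final Map<String, String> pathById = new ConcurrentHashMap<>();
    private final Function<String, A> lookupById;   // assumed slow: may run a query
    private final Function<String, A> lookupByPath; // assumed fast: direct path access
    private final Function<A, String> pathOf;

    AuthorizablePathCache(Function<String, A> lookupById,
                          Function<String, A> lookupByPath,
                          Function<A, String> pathOf) {
        this.lookupById = lookupById;
        this.lookupByPath = lookupByPath;
        this.pathOf = pathOf;
    }

    /** Resolves an authorizable, preferring the cached path over the ID lookup. */
    A get(String id) {
        String path = pathById.get(id);
        if (path != null) {
            A byPath = lookupByPath.apply(path);
            if (byPath != null) {
                return byPath;
            }
            pathById.remove(id); // stale entry, e.g. the authorizable was moved or deleted
        }
        A found = lookupById.apply(id);
        if (found != null) {
            pathById.put(id, pathOf.apply(found));
        }
        return found;
    }
}
```

The memory cost is then one path string per ID, at the price of one lookup by path on every access.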

kwin avatar Oct 27 '22 11:10 kwin

@kwin, in the cloud we had nearly 4 GB in the cache with 3.0.4: image

However, locally, with a larger number of authorizables to process (17314), I didn't manage to get even close to that amount with the cache in place: image

So, on the one hand, the OOM in AEMaaCS was most probably caused by the cache. On the other hand, it is not reproducible locally.

otarsko avatar Oct 27 '22 12:10 otarsko

@otarsko Please try https://github.com/Netcentric/accesscontroltool/pull/647/commits/fa36728fdf495880129983a94b8ac53acc45a06a.

kwin avatar Oct 28 '22 13:10 kwin

@kwin the latest changes look good to me:

With 17319 authorizables and without an Oak index:

  • 3.0.6 - 26.9min
  • this branch - 27.1min

Memory consumption is also OK (no visible change compared to the previous test): 2,000 MB -

otarsko avatar Oct 31 '22 10:10 otarsko