jackrabbit-oak icon indicating copy to clipboard operation
jackrabbit-oak copied to clipboard

OAK-10494 - Add cache to reduce number of remote blobstore calls.

Open ahanikel opened this issue 1 year ago • 3 comments

The cache size should be made configurable and I should probably add some tests, too, but we can still start the discussion.

ahanikel avatar Oct 16 '23 11:10 ahanikel

@amit-jain Could you have a look when you have time? Thanks a lot! The test is kinda stupid but I couldn't think of a better one...

ahanikel avatar Oct 17 '23 08:10 ahanikel

@ahanikel the PR looks good but I am not sure if the invalidation also has to happen somewhere. The tests are in CachingDataStoreTest for this class.

Overall, I think the call need not go to the backend in case the the corresponding file is available in the cache.

amit-jain avatar Oct 17 '23 09:10 amit-jain

@amit-jain I think the least recently used entries in the cache automatically fall off the cliff when the maxSize is reached, but I've still added an expiration of 15 minutes, so we don't waste memory unnecessarily.

I've added my test to the tests in CachingDataStoreTest. I've also tried to use Mockito but it does not seem to work in this case (modifying the class variable seems to be ignored somehow in the mocking process).

Overall, I think the call need not go to the backend in case the the corresponding file is available in the cache.

Yes, but my understanding is that we try to avoid loading the blob if we don't need it, and so if we only call backend.getRecord() it never ends up in the existing cache, and therefore getRecordIfStored() always falls back to backend.getRecord(). If my understanding is wrong, then there must be something else at play, because the recordCache has a measurable impact on performance (I've put the details on my internal git.corp account unfortunately, the link is in GRANITE-47685).

ahanikel avatar Oct 18 '23 10:10 ahanikel