beaker icon indicating copy to clipboard operation
beaker copied to clipboard

Value.clear_value is not optimal in case of a distributed cache

Open devilicecream opened this issue 8 years ago • 4 comments

In the file beaker/beaker/container.py:420, the Value class checks that the given key to remove from the cache exists in the namespace before attempting removal. While this is ok by concept, it is not optimal in cases of distributed cache, where reads are made locally and writes/deletes are spread to remote servers. Checking the existence of the key in the namespace before attempting removal in this case is dangerous, cause if the key is not present locally it doesn't get removed from the remote servers either, where it could be existing. Since the del self.namespace[self.key] is in a try: ... except: block anyway, I think the check could be removed without any problems.

devilicecream avatar Feb 10 '16 20:02 devilicecream

The issue is pretty much rooted in beaker design which takes for granted that the locks guarantee serialised access to the data, that was true for the original backends mostly on disk but it's not true on any system that is not strongly consistent.

As distributed systems often trade off consistency to ensure networking partitioning resilience on writes it leads to the same exact behaviour you are explaining.

I suppose in case of key removal the approach of considering deletion to be idempotent properly fixes your issue of keys not being deleted, but generally speaking consider that beaker takes for granted that backend is consistent and access is serialised, so it's not generally properly designed for non consistent systems.

Also before being able to apply such a change all existing backends should probably be checked that they do not throw unhandled exceptions when trying to delete a non existing entry.

In the mean time you can work-around the issue by configuring a region for reads and a region for writes and have updates/deletes happening on the region that is connected to the primary/master node.

amol- avatar Feb 10 '16 22:02 amol-

Interesting, any suggestions about an alternative library that does the same job, but considering not-consistent systems?

Should I run the tests with all the other backends or, given the library design, it is very unlikely that the patch is going to be accepted?

I already have a region-based approach on the cache operations, and since the reads are not consistent across all the nodes, this leads to the missing delete. A fix would be to restore the consistency on reads, but right now I prefer to keep this architectural choice and apply the patch from my own fork of the repo.

devilicecream avatar Feb 10 '16 23:02 devilicecream

Unless @bbangert can see any reason not to do so (I don't remember historical reason why delete wasn't considered idempotent in Beaker) I think we can merge the patch as soon as you are able to check that deletion of cache entries doesn't trigger exceptions different from KeyError for missing keys in in all beaker backends and https://github.com/didip/beaker_extensions backends

That should cope with your current need. My reference to beaker design lacking support for data consistency issues (in terms of CAP) is just that it might be that in the future you will face other similar issues in beaker as it seems to me that the issue wasn't much a problem at the time beaker was designed.

I don't think there are other solutions which will resolve consistency issues automatically, usually that's something you have to face at application level and more recent libraries like dogpile.cache provide an even lower level API leaving much of the work in the hands of the developer itself.

amol- avatar Feb 11 '16 12:02 amol-

I do happen to recall a case where failing to check for the key resulted in deletes issued that shouldn't have been. In some systems writes are quite expensive, so doing the check is cheaper than issuing deletes that aren't actually needed. I'm hesitant at this point to change the behavior as I have no idea what systems in the wild either purposely or inadvertently depend on that.

Any change to this behavior should be pref'd, such that it defaults to existing behavior unless a toggle of some sort is flagged.

bbangert avatar Feb 12 '16 20:02 bbangert