pymemcache HashClient._set_many() fails to report failures

trafficstars

Following the fix of #319 in #320, set_many() does not return a list of failed keys like it used to, when ignore_exc is True.

This was detected by the tests, but instead of fixing the code, #320 includes a change to the tests, to mark the new behavior as valid.

With the old code, when exceptions are ignored, we still get a boolean value reporting the success of each operation, and the keys to be set are properly collected into their proper lists.

For the new code, there is only one set_many() operation. I submit that the correct semantics, when ignore_exc is True and the underlying operation still raised an exception, is to still avoid raising, but return all the keys in the failed collection, not the succeeded one.

Oct 06 '21 15:10 shaib

@martinnj thoughts?

Oct 06 '21 16:10 jogo

It's been a hot minute since I wrote that PR. But I seem to recall the reason for the change, and therefore the tests being "weakened", was related to two things:

Something to do with how/what data gets back from BaseClient.set_many that made it hard to figure out the failed keys.
The order (or lack thereof) of keys in Python dicts prior to 3.6.

There is a little discussion about it in #320, but don't mind giving it a once over, and seeing if I missed something. I'll hopefully have time tomorrow.

Oct 06 '21 18:10 martinnj

@jogo : I've dug into the code again.

The long answer

So two unit-tests where changed in #320, one of the relate to the semantics described above, I'll just go over both to make it clear what I'm talking about:

test_set_many_unix: This one was changed because you didn't know which order the keys would be in the dict, prior to Python 3.6, dicts where unordered. Since Python 2 support have been dropped from the package this can technically be undone as it will always be key2 that fails now.
test_ignore_exec_set_many: This one is the interesting one. Basically the fact that the "old" HashClient was able to return the keys was a side-effect of it sending the keys one-by-one, which let it know which failed. Since the "new" client just calls BaseClient.set_many which does not return failed keys even when ignore_exec=True the HashClient can't know this either. The interesting lines are from BaseClient._store_cmd: https://github.com/pinterest/pymemcache/blob/25e38dfbf488d969b42af7a3299e33044e7a8730/pymemcache/client/base.py#L1097-L1108

_store_cmd doesn't take into account the self.ignore_exec value, neither does the _raise_errors method it calls.

TLDR

So the solution to this issue is to make the BaseClient exhibit the intended behaviour first, when this is implemented, HashClient will automatically do the same, as it just parrots the replies from the collection of inner clients.

Extra

It also looks like HashClient does not hand down the ignore_exec argument it recieves to any BaseClients it constructs itself, as it never adds it to the default kwargs: https://github.com/pinterest/pymemcache/blob/25e38dfbf488d969b42af7a3299e33044e7a8730/pymemcache/client/hash.py#L90-L104

Oct 07 '21 06:10 martinnj

@martinnj Thanks for the deep and enlightening analysis. This explains a lot.

I would prefer that, in cases where a whole operation on a specific client fails, all the keys for that client are marked as failed; I understand that HashClient may not be always be able to tell the difference between "whole client failed" and "specific key failed", and further, I agree that it's more sensible to fix the problem at the source than to take my original suggestion, which is at best a workaround.

Oct 07 '21 08:10 shaib

pymemcache pymemcache copied to clipboard

HashClient._set_many() fails to report failures

pymemcache
pymemcache copied to clipboard