hypothesis icon indicating copy to clipboard operation
hypothesis copied to clipboard

Cache fixes

Open jobh opened this issue 1 year ago • 5 comments

Fix some issues with the cache implementation. No actual bugs, only latent issues and cleanup.

  • The cache docs were were not in sync with the code: it was documented that on_access should return the updated score, but the code expected the score to be modified in-place. Code changed to match docs.
  • Simplify by removing the unnecessary count of pinned entries.
  • "ignore pin() on expired keys" is unsafe: if we later insert the key, the pin-state of the key is unknown. This can be observed "in the wild" in our current usage — by adding artificial cache pressure before pin() we fail when calling unpin() on a not-pinned key.
  • max_size=0 (ignoring new entries) is no longer allowed, as that violates the handy "last entry successfully set is present in cache" invariant.

jobh avatar Jun 24 '24 10:06 jobh

If you dislike the new pin semantics, a fairly straightforward alternative is to allow "sticky" pinning of unknown keys by storing them as not_set. I.e., it's the key that's pinned, even if it's not set; and the entry will start out pinned if it's added later. I see some beauty in this approach, but it's perhaps a bit unconventional.

jobh avatar Jun 24 '24 20:06 jobh

Unfortunately it might be a week or two before I have space to reload context and review this properly, but if e.g. @tybug wants to review this I'm happy to confirm and merge.

(also, fantastic that we have coverage again - it's already catching an untested branch 🥳)

Zac-HD avatar Jun 25 '24 05:06 Zac-HD

will check this in the next few days!

Liam-DeVoe avatar Jun 27 '24 05:06 Liam-DeVoe

Am I correct in understanding that with the new pin semantics, if we pin a key that has been evicted (and don't pass value), we will get a KeyError? This feels like a bit of a footgun.

  • old
    • semantics: ignore KeyError when pinning an evicted key
    • footgun: if a key is evicted before you pin it, and you later re-cache that key, you think it's pinned but the cache doesn't, so if you unpin it you get a surprise error
  • new
    • semantics: pin key + error if no key exists by default, optionally choose to pin a k/v pair atomically which cannot error
    • footgun: if you forget to use the atomic option, or can't because you only have the key, you may get a fatal KeyError under cache pressure

I would say your proposed sticky semantics solves these concerns and would actually advocate for that. It makes me a bit nervous in its fanciness but I can't think of a failure case?

Other changes look good 👍. I added one comment where I had to stop and think about a condition, but feel free to remove if you think it's clear without.

Liam-DeVoe avatar Jun 30 '24 06:06 Liam-DeVoe

Thanks for the review @tybug!

  • old footgun: if a key is evicted before you pin it, and you later re-cache that key, you think it's pinned but the cache doesn't, so if you unpin it you get a surprise error
  • new footgun: if you forget to use the atomic option, or can't because you only have the key, you may get a fatal KeyError under cache pressure

Yes, this is correctly understood, and a nice explanation of the problem.

I pushed a change now to just require the value to be supplied to pin(). It is a bit unsatisfying, being less orthogonal and more repetitive API-wise. But on balance I think it's wiser than the sticky semantics, they do add a little complexity which we don't really need in our internal usage (we always have the value).

(I have implemented the sticky semantics too, so it's no extra work to switch tough)

jobh avatar Jun 30 '24 10:06 jobh

Note, I added a minor change to always compare sort_key instead of score, and to remove a bad comment and associated no-cover pragma.

jobh avatar Jul 01 '24 11:07 jobh

Note, I added a minor change to always compare sort_key instead of score, and to remove a bad comment and associated no-cover pragma.

It turned out this was a real bug (heap-property violation), so I spent some more time understanding and fixing the heap balancing. The stateful test now checks the heap property at each step, so I'm fairly confident about the change. Sorry for doing this post-review.

jobh avatar Jul 02 '24 07:07 jobh