dbal icon indicating copy to clipboard operation
dbal copied to clipboard

Cache key matters

Open Seldaek opened this issue 9 years ago • 11 comments

This is a post-mortem of an issue I hit yesterday on Packagist. I did a lot of reading of the Doctrine cache code in the process of understanding exactly what happened and why, and here are some thoughts/notes that hopefully can lead to improvements. It's a bit lengthy, sorry :)

The failure

A few days ago I moved the Doctrine cache to use the PredisCache driver instead of ApcCache. Yesterday the https://packagist.org/explore/ page was suddenly returning a 500. I saw that it failed with an out of memory error in the Predis connection, so that narrowed it down quite a bit to this query: https://github.com/composer/packagist/blob/f457090/src/Packagist/WebBundle/Entity/VersionRepository.php#L94-106

As we had an out of memory failure while decoding a redis response I checked for large keys and figured out that the [new_releases][1] key was taking 60MB. That seemed odd as it only caches 10 records but I thought fair enough let's see what's in there. I wiped the key and the site got back on track.

Invalid Assumptions

As you can see in the query, we run it with ->setParameter('now', date('Y-m-d H:i:s')); to get results that are older than the current time. Since I want to perform the query only every 15minutes I set the cache like this ->useResultCache(true, 900, 'new_releases') which the way I understood it means run that query and cache it for 900 seconds in that cache key. As I did not include the parameters in the cache key, I assumed that it would work.

That assumption is incorrect due to the fact doctrine creates an internal "real cache key" at https://github.com/doctrine/dbal/blob/f0eaaf/lib/Doctrine/DBAL/Cache/QueryCacheProfile.php#L101 which includes the query parameters. This is used together with the cache key, and means that my cache never worked as it re-creates it new every time the now param changes (which is every second..).

That was a first surprise, but fair enough you can write that down to a misunderstanding on my part, although I would also say that it's quite unintuitive that if I specify a cache key it's not actually taken for what it is. If I rely on automatic cache key generation using the params makes sense I guess. I realize this is hard to fix without BC break though but it'd be great to have a way to really set a cache key + real cache key.

Compound Problems

The above explains that I was caching for nothing, fair enough. However there is an additional issue that really caused this to go to hell. The way the cache is then stored is also very unintuitive, and I would argue quite buggy. As you see in https://github.com/doctrine/dbal/blob/3210835/lib/Doctrine/DBAL/Cache/ResultCacheStatement.php#L108-L114 what happens is, the cache key is fetched, and it contains an array of real cache keys and their values. Then the real cache key and result are added to it, and it's written again in the cache.

This causes two problems:

  • The first is that the TTL is not actually enforced per real-cache-key but rather every time you write a new real-cache-key entry ALL of them for that one cache key end up having their TTL reset as they are stored together. This is solved if you do not pass a cache key and rely on auto-generation instead, as the auto-generation will create a unique cache key for every real cache key.
  • The other one is that in my case since I specify a cache key, every second basically you get a new real-cache-key, that is added to that single cache key in redis, so every request that hits the site adds another 10 rows of result cache inside that one key, and it keeps growing and growing and growing. Due to the first issue about the TTL, as long as someone requests it every 15min, the key keeps growing, until at some point unserialize() just blows up trying to handle such a large amount of data.

Solutions

As a summary/TL;DR, here are a few ideas:

  • Fix the TTL issue by using a compound of cache key + real cache key as the key sent to the cache backend. No more bundling of real cache keys in one big array (I didn't really understand why it was done that way.. might be a valid reason I can't think of).
  • Allow users to specify a real cache key somehow, or just drop that concept entirely and use my cache key when I give you one without adding the parameters info to it.
  • Discourage specifying cache keys, as it's actually harmful right now, and it doesn't do anything useful really.

Thanks for reading.

Seldaek avatar May 12 '16 16:05 Seldaek

Fix the TTL issue by using a compound of cache key + real cache key as the key sent to the cache backend. No more bundling of real cache keys in one big array (I didn't really understand why it was done that way.. might be a valid reason I can't think of).

I really don't know about the reasons behind it, but I suspect changing the code will break some tests, and there we'll discover why.

Allow users to specify a real cache key somehow, or just drop that concept entirely and use my cache key when I give you one without adding the parameters info to it.

Very error-prone. The caching system in ORM seems to work reliably, we'll just need to push it down into DBAL too, I suppose

Discourage specifying cache keys, as it's actually harmful right now, and it doesn't do anything useful really.

Fully support this approach: it is a detail, and the user should really avoid specifying the key. If a user needs to interact with the cache, he can retrieve the QueryCacheProfile and then operate on the generated key instead.

Ocramius avatar May 12 '16 21:05 Ocramius

Btw when you would use a NOW() SQL function in the query. The query would not change anymore with every request.

jenschude avatar May 13 '16 06:05 jenschude

@jayS-de I would say that such a query is not cachable anyway, as one of the parameters relies on global state

Ocramius avatar May 13 '16 06:05 Ocramius

It is. Cause it's by intention that the result should only be updated every 15 minutes. But it's a bad idea in SQL to use parameters where the DBMS supports you with internal functions. Cause a DBMS can also be optimized to usw query cache. But the query cache only works for the same queries. Which would be the case with NOW()

jenschude avatar May 13 '16 07:05 jenschude

@Ocramius when changing the query to the following:

$qb->select('v')
            ->from('Packagist\WebBundle\Entity\Version', 'v')
            ->where('v.development = false')
            ->andWhere('v.releasedAt < NOW()')
            ->orderBy('v.releasedAt', 'DESC')
            ->setMaxResults($count);

@Seldaek would never have run into this issue, cause he would not have needed any parameter which had been used for the cache-key calculation. As well as the DBMS would have the possibility, when setup for result caching, to cache this query.

And as you also can see now you don't have a global state anymore. :)

jenschude avatar May 13 '16 07:05 jenschude

Just as a side note: imho CURRENT_TIMESTAMP is preferred over NOW(), because it's part of the SQL-92 standard. btw: @jayS-de nice pic :)

maglnet avatar May 13 '16 07:05 maglnet

@Seldaek I quickly read your issue, and I promise to re-read and provide a more throughly comment later. We store the parameters together with the query because IN (:param) can be expanded into multiple IN (?, ?, ?), depending of the provided parameters. Sounds simple, but remember that a single parameter binding also may be reused multiple times in the same query too (like WHERE u.email = :email OR u.username = :email).

guilhermeblanco avatar May 13 '16 12:05 guilhermeblanco

@maglnet @jayS-de if you don't wanna be helpful, don't feel like you have to answer :) Both CURRENT_TIMESTAMP and NOW() can not be used within doctrine (at least without defining new query type stuff, which I am not gonna waste my time on when using date() works fine..).

@Ocramius @guilhermeblanco I still think if you specify a key it should just take it without overriding it with this real cache key, because otherwise AFAIK there is no way to use the cache key to "suppress" one of the query's parameters like I was trying to do with the date. I didn't figure out a way to override the key as you say by using the QueryCacheProfile, but maybe I missed smth.

Anyway I fixed it by handling caching myself now as you can see in https://github.com/composer/packagist/blob/29017b6d041619de1435864796c665ac7912efcf/src/Packagist/WebBundle/Entity/VersionRepository.php#L102

Not an issue for me anymore, I definitely learned my lesson, but for others to avoid WTFs I think the cache key param should be deprecated if it can't be fixed or at least the docs should clearly say that specifying it is discouraged and that it does not have any actual effect on caching it's just used for grouping real cache keys together, which lets you blast the cache by name I guess, but also means that cache operations get less efficient for that one cache key as all real cache keys are stored together within it.

Seldaek avatar May 17 '16 11:05 Seldaek

@Seldaek Okay have to admit, I didn't tried the snippet with doctrine. Tested it now and I see what was the problem. As Now is mysql specific, the doctrine query builder doesn't allows it to be used.

$qb->select('v')
            ->from('Packagist\WebBundle\Entity\Version', 'v')
            ->where('v.development = false')
            ->andWhere('v.releasedAt < CURRENT_TIMESTAMP()')
            ->orderBy('v.releasedAt', 'DESC')
            ->setMaxResults($count);

with CURRENT_TIMESTAMP as suggested by magInet at least Doctrine don't complains and creates a valid SQL statement. SELECT p0_.name AS name_0, p0_.version AS version_1, p0_.description AS description_2 FROM package_version p0_ WHERE p0_.development = 0 AND p0_.releasedAt < CURRENT_TIMESTAMP ORDER BY p0_.releasedAt DESC LIMIT 10

See also DQL Functions

jenschude avatar May 17 '16 12:05 jenschude

Ah, I tried CURRENT_TIMESTAMP without ().. but anyway it's fixed.

Seldaek avatar May 17 '16 13:05 Seldaek

Yep. Only improvement would be, that you could use the doctrine result cache again.

jenschude avatar May 17 '16 13:05 jenschude