dogpile.cache
Ability to invalidate in middle of 'get_or_create' value creation
Migrated issue, originally created by lxyu (lxyu)
I'm trying to integrate dogpile.cache with SQLAlchemy to create a real-time cache layer on top of an RDBMS.
To make it real-time, I'm going to need the ability to invalidate a key while the creator is still running. For example:
#!python
@region.cache_on_arguments()
def get(user_id):
    return DBSession().using_bind("master").query(User).get(user_id)
And there's an event listener for the after_commit signal from SQLAlchemy: if a user model is updated in the commit, I'll issue a delete on the corresponding user_id to expire it.
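For illustration, a rough sketch of such a listener (changed_user_ids() here is a hypothetical helper for tracking which rows the transaction touched; invalidate() is the method dogpile.cache adds to the decorated get() above):

#!python
from sqlalchemy import event
from sqlalchemy.orm import Session

@event.listens_for(Session, "after_commit")
def expire_changed_users(session):
    # changed_user_ids() is hypothetical: some bookkeeping that records
    # which User rows were modified in the transaction just committed.
    for user_id in changed_user_ids(session):
        # invalidate() is added to get() by region.cache_on_arguments()
        get.invalidate(user_id)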
So what if a commit happens and issues a delete right in the middle of the creator generating a value? The delete will be ignored, since that key is not in the cache yet (there's only a lock).
This problem would be solved if I could invalidate the stale value during its creation, and another get_or_create could then enter the creator once the previous one has been marked invalid.
Currently I think it can be implemented by creating a customized lock with a unique id inside.
Creator A enters and acquires a lock with a unique id. If an invalidate is issued, it deletes the value and the lock (if they exist). Another creator B can then acquire the lock and begin value generation. When creator A finishes and wants to release the lock, it finds that the unique id doesn't match, so it returns directly without writing its value to the cache. Finally creator B finishes and releases the lock; the unique id matches, so it writes its value to the cache and returns.
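Roughly, a sketch of that protocol, assuming a Redis-like backend where the lock can carry a token and be deleted out of band (none of this is dogpile API; the names are illustrative and the waiting/retry details are left out):

#!python
import uuid
import redis

r = redis.Redis()

def get_or_create(key, creator):
    value = r.get(key)
    if value is not None:
        return value

    # Stamp the creation lock with a unique token for this creator.
    token = uuid.uuid4().hex
    r.set(key + ".lock", token, nx=True, ex=30)

    value = creator()

    # Write to the cache only if the lock still carries our token. If an
    # invalidate deleted the lock in the meantime, the token no longer
    # matches and we return the value without caching it. (The
    # check-and-set should really be atomic, e.g. a Lua script.)
    if r.get(key + ".lock") == token.encode():
        r.set(key, value)
        r.delete(key + ".lock")
    return value

def invalidate(key):
    # Delete both the cached value and any in-progress lock, which
    # "cancels" a creator currently running for this key.
    r.delete(key, key + ".lock")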
This would also provide a better answer to the refresh fn in #36. The current implementation is not usable in a distributed situation: if refresh is called on multiple servers without a lock, it can leave the data inconsistent. If it can be replaced by an 'invalidate -> get_or_create', it works better as a real refresh.
What do you think about the idea, or do you have a better solution for this use case?
Michael Bayer (zzzeek) wrote:
So what if a commit happens and issues a delete right in the middle of the creator generating a value? The delete will be ignored, since that key is not in the cache yet (there's only a lock).
If you're really trying to serialize cache access with concurrent real-time update operations, then you need to use locks, so a delete would never occur while the creator is generating a value.
This problem would be solved if I could invalidate the stale value during its creation
The creator is where the actual database query happens. If you have a record, that means you just got it from the DB; it hasn't been deleted yet. Your next step is to return the object. What exactly is going to tell you it's stale? It seems like you want to say "obj = get_from_db(); check_if_deleted(obj); return obj"? That can't be right, because without locks there is no way to guarantee the object remains fresh between the database fetch and the return, so adding a "check and invalidate" makes no sense. And even then, if the "creator" marks something as stale, then what? What are you returning?
I can't really understand the paragraphs following that without understanding what exactly is happening.
lxyu (lxyu) wrote:
Try this example:
Suppose we have a User<id=1, username='tom', email='[email protected]'> in the database.
And we have this function to get a model object using dogpile.cache:
#!python
@region.cache_on_arguments()
def get(user_id):
    return DBSession().using_bind("master").query(User).get(user_id)
0s: get(1) is called and enters the creator, since there's nothing in the region cache; it acquires the lock and then reads the user with email '[email protected]' from the database.
1s: The user tom updates his email to '[email protected]' and commits it to the database.
2s: Another get(1) is called and waits there, since it can't acquire the lock.
3s: The creator finishes, releases the lock, and writes the serialized data to the region cache.
4s: The second get(1) returns the wrong value from the cache.
What I’m proposing is:
0s: get(1) is called and enters the creator, since there's nothing in the region cache; it acquires the lock with id '12345' and reads the user with email '[email protected]'.
1s: The user tom updates his email to '[email protected]' and commits it to the database. The commit handler also deletes the lock.
2s: Another get(1) is called; it acquires a new lock with id '24680' and enters the creator.
3s: The first get(1) finishes; when it tries to release the lock, it finds the lock id no longer matches '12345', so it returns its value without writing it to the cache.
4s: The second get(1) finishes; it releases the lock and writes the correct value to the cache.
Michael Bayer (zzzeek) wrote:
What I'm not understanding is that the first call in which the caching of the object is "cancelled", still returns the old object. it's returning the "wrong" result. How is that any better? The traditional approach here is to of course serialize all cache writing operations. Which means the delete operation would wait for the first get to be complete, and then when the lock is free it's given the opportunity to affect the cache, where instead of invalidating the data, it would write the value right there. This is called write-through caching. If you want mutation operations to result in immediate cache updates, rather than relying upon expiration which always means there can be some staleness, this is the usual technique.
As I've said before, the whole point of dogpile is that of a cache which actually wants to return stale data. If the cache is never to be stale ever, then you don't need the dogpile feature at all, just read/write from the backends directly and read/write operations must be serialized using a read/write lock (e.g. multiple readers, single writer). The dogpile.core.readwrite_lock.ReadWriteMutex does this with threads and the dogpile.cache.backends.file.FileLock does this with flock().
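For illustration, that fully serialized scheme might look roughly like this (a sketch only; cache_backend stands in for whatever store is read from and written to):

#!python
from dogpile.core.readwrite_lock import ReadWriteMutex

# Many readers may hold the lock at once; a writer excludes everyone.
rw_lock = ReadWriteMutex()

def read_user(user_id):
    rw_lock.acquire_read_lock()
    try:
        return cache_backend.get(user_id)   # hypothetical backend call
    finally:
        rw_lock.release_read_lock()

def write_user(user_id, value):
    rw_lock.acquire_write_lock()
    try:
        cache_backend.set(user_id, value)   # hypothetical backend call
    finally:
        rw_lock.release_write_lock()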
Michael Bayer (zzzeek) wrote:
I can see now that the case where the "delete" proceeds in the middle of the original create produces the case where the wrong data is now in the cache. But it still seems awkward that we are attempting to perform multiple mutations to the cache not by the usual means of mutexing all write operations, but instead trying to "cheat" by intricately overlapping operations. Also, "delete a lock" is not an operation that works in the general case. If we are using a plain Python mutex, the lock cannot be "deleted" or otherwise changed. So again taking advantage of particular kinds of locks like Redis or Memcached and the fact that they can theoretically be "deleted" by an out of band operation seems really scary to me. Concurrency is hard and I don't know that this pattern is very solid, so baking it into the API leaves me pretty hesitant. It could be amazing, I don't know. I do know that just the simple "dogpile" pattern took me years to get right, though.
lxyu (lxyu) wrote:
In response to the first comment: the first call is returning the "right" value for the moment at which it was called, so it is correct, since we can never predict the future.
Yeah, you're right, this only suits a distributed situation and would require a distributed lock such as Redis.
The original purpose is to create a near real-time cache layer on top of an RDBMS using SQLAlchemy and dogpile. I have not come to a final decision myself, so this is just a discussion, since I think you may have worked through similar situations.
I'm glad to know, and fully understand, that you want to keep dogpile.cache simple and correct. This use case is relatively narrow, so I'll try a standalone implementation instead.
And I'm very interested in the 'write-through' caching you mentioned in the first comment. Is that possible with current SQLAlchemy events? If it can be done, which signal do you recommend? As far as I know, write-through has better performance but is much harder to get right due to concurrent writes (session.commit()) and reads.
Michael Bayer (zzzeek) wrote:
If you wanted to do "write-through" caching you might use the session after_flush event - you can detect in that event everything that's changed. However, you don't know for sure whether that data will be committed successfully. If you wanted to be more conservative, you could keep track of the things you care about in after_flush and then write to the cache in after_commit.
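A rough sketch of that more conservative approach, assuming the decorated get() from earlier (the session.info bookkeeping is just one way to carry state from the flush to the commit):

#!python
from sqlalchemy import event
from sqlalchemy.orm import Session

@event.listens_for(Session, "after_flush")
def remember_changed_users(session, flush_context):
    # Note which User objects this flush touched; don't write to the
    # cache yet, because the transaction may still fail.
    pending = session.info.setdefault("pending_user_writes", {})
    for obj in list(session.new) + list(session.dirty):
        if isinstance(obj, User):
            pending[obj.id] = obj

@event.listens_for(Session, "after_commit")
def write_through(session):
    # The commit succeeded, so push the fresh objects into the cache.
    # set() is the method cache_on_arguments() adds to the decorated get().
    for user_id, obj in session.info.pop("pending_user_writes", {}).items():
        get.set(obj, user_id)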