stretto
stretto copied to clipboard
Getting `fail to send msg to channel: cache set buf sender: channel closed` on inserts
Under some pressure, we start seeing this and the cache stops adding new entries. Any guidelines on troubleshooting this?
Hi, could you provide some example code for me to reproduce?
@al8n it's hard to reproduce in a development environment also from me, but under real load it starts happening after some minutes. Tracking down the code it seems like the insert buffer channel gets closed at some point and then we start seeing this. I've been trying to reproduce this also using the benchmarking scripts in stretto, but no luck so far.
This is why I'd be interested on some guidance to troubleshoot this.
@sebasmagri which one are you using, Cache
or AsyncCache
?
AsyncCache. I think it might be related to using wait too often. I had to do that since I started seeing some issues otherwise, but now I've removed it and I still get a very high dropped sets rate.
Il mer 13 apr 2022, 12:59 Al Liu @.***> ha scritto:
@sebasmagri https://github.com/sebasmagri which one are you using, Cache or AsyncCache?
— Reply to this email directly, view it on GitHub https://github.com/al8n/stretto/issues/13#issuecomment-1097913805, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAACXAJLWD6XIZNMAAMQRT3VE2SJBANCNFSM5S2D7HCQ . You are receiving this because you were mentioned.Message ID: @.***>
I've also tried to track if .close is getting called explicitly at some point but didn't find any cases.
For the record, the issue I get when I don't explicitly wait after inserting is that then I find the expiration map mostly already borrowed on update.
Found something interesting using tokio-console, when this condition happens, I can only see the policy task in the tasks list. so could this confirm that AsyncCache::spawn
is going down, potentially by receiving a stop event and exiting the loop?
Yes, it could be AsyncProcessor stops. It's weird, because the only way to stop AsyncProcessor is to call close method or the reference counter of AsyncCache goes to 0. May need to add tracing as a new feature for this crate to help find possible bugs.
As I'm putting the cache instance as managed state in a Rocket application, i wonder if the second case could be possible, however in such case also the policy should be dropped right?
Il mer 13 apr 2022, 20:03 Al Liu @.***> ha scritto:
Yes, it could be AsyncProcessor stops. It's weird, because the only way to stop AsyncProcessor is to call close method or the reference counter of AsyncCache goes to 0. May need to add tracing as a new feature for this crate to help find possible bugs.
— Reply to this email directly, view it on GitHub https://github.com/al8n/stretto/issues/13#issuecomment-1098335229, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAACXAJBFEIJS5KKHL4TGDLVE4D5LANCNFSM5S2D7HCQ . You are receiving this because you were mentioned.Message ID: @.***>
Yes, it should be. However, it may also possible the PolicyProcessor be kept. Because in AsyncCache, the inner fields are wrapped by different Arc. It may be a bug.
@al8n let me know if I can help checking somehow. We're looking forward to deploy a component to production using stretto and would like to get these issues sorted out.
Thanks! The plan is to add the tracing stuff for the crate, especially the Processors. So that we can know when the Processor stops. Would you like to do this? Or I can start on this weekend. Once I complete, I will let you know and you can test it again.
@al8n any updates on this front? I had a few days out and would like to catch up with this.
Sorry, there is no update. I am busy with my uni study and my full-time work these days.
The ExpirationMap is RwLock<RefCell<HashMap<K,V>>>, when trying to borrow mutable reference failed, the CacheProcessor will return, and finally the channel closed. Maybe you can remove the RefCell, just use RwLock's write method instead. Further more, return statement in CacheProcessor is not reasonable when error occurred. @sebasmagri @al8n
Thanks @phantomhker! I am really silly at that moment for adding a RefCell
when there is already a RwLock
.