will
internal_roster can become a stale cache; no way to reset
The redis entry "will_roster" is persisted and holds a mapping of hipchat ids to often-accessed user data. It's initialized first in update_will_roster_and_rooms() and the user data within is accessed very many times.
Each subsequent update_will_roster_and_rooms() is additive and will not overwrite any user data that already exists (i.e. the 'if user is not "": ...' and 'if not hasattr nick: ...' checks).
When a user changes their name/nick/mention_name, there is no mechanism to push that change into the "will_roster" cache, so their user data (from will's POV) will continue to be their old user data.
This certainly bit me writing a plugin that expected message.sender.nicks to accurately represent their current nick, not the first nick that the will plugin ever saw them use.
In order of the strength of changes:
- Why is this cache persisted to disk and not just held in memory?
- Assuming it's better/easier to leave it persisted, can it clear the cache on startup? (self.save("will_roster", {}) during boot path somewhere.)
- Assuming that hack is too ugly, can update_will_roster...() check for discrepancies in the cache on each invocation instead of giving up when it checks for existence?
- Assuming this behaviour is intentional or otherwise too ugly to fix, can a ("SERIOUSLY!...") admin-y access be added that drops this cache and forces it to be regenerated?
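As a rough illustration of the third option, here is a minimal sketch of a refresh that overwrites cached entries when they differ from freshly fetched data, instead of skipping users that already exist. It assumes the roster can be treated as a plain dict of user id to user fields; merge_roster and the field names are hypothetical and are not part of will's code.

```python
def merge_roster(cached, fresh):
    """Return (merged, changed_ids) for two {user_id: {field: value}} dicts.

    Entries in `fresh` replace cached entries that differ; cached users that
    are missing from `fresh` are kept as they are.
    """
    merged = {uid: dict(user) for uid, user in cached.items()}
    changed = []
    for uid, user in fresh.items():
        if merged.get(uid) != user:
            merged[uid] = dict(user)  # overwrite stale or missing entry
            changed.append(uid)
    return merged, changed


# Hypothetical data: user "1" changed their nick, user "2" did not.
cached = {"1": {"nick": "old_nick"}, "2": {"nick": "bob"}}
fresh = {"1": {"nick": "new_nick"}, "2": {"nick": "bob"}}
merged, changed = merge_roster(cached, fresh)
print(changed)               # ['1']
print(merged["1"]["nick"])   # new_nick
```

Returning the list of changed ids keeps the comparison cheap to log and makes it easy to write back to storage only when something actually differs.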
I have a temporary workaround in the form of hand-editing the "will_roster" on any nick changes (carefully, dotting i's, crossing t's), though it's less than desirable.
Hey @jhermes ,
Thanks for this detailed report and question. You're completely right - as is, the roster can get stale.
If we can come up with an elegant solve, I'm all for it.
A couple quick answers to your questions:
- The persisting to redis is for thread-share purposes, and we will need a cross-thread way to persist the data. At present, that's using redis (or one of the new datastore backends).
- Clearing on startup is possible, but has two problems: 1) We're rate-limited by hipchat, and getting user info ends up being one of the more expensive operations. 2) This will still mean that the roster is stale for orgs that don't restart their wills very often.
- Yes, and that'd be a definite improvement.
- Yeah, but if we can figure out a solve, I'd like to!
Given that, and since you're fairly deep in the code right now anyhow, what would you recommend as a good solve?
Thanks, -Steven
Thanks for the quick reply and some more insight into how will works. I've only really been interacting with will from the plugin layer, so this is my first time diving into will proper. I'm just glad that this is what I thought it was and that I wasn't missing something obvious and making a bogus issue.
What I'm not so sure about is the event handler for "roster_update" which calls join_rooms() (which calls update_roster...()). How often is that called/what sends events? If that is a fairly common event, then sure, the 3rd fix to just have it update the cache on every call will certainly work. If it's only a boot-path/first-join thing, then it's equivalent to the 2nd fix with the same downsides.
(As an aside, any variation of the 2nd fix would be complete for me since I don't have any reservations about restarting will (and often do, sometimes just for fun).)
Overall, I know that changing nicks/names isn't super common so I don't want to suggest anything that will cause undue slowdown on the "normal path".
Hey @jhermes ,
Off the top of my head, I can't remember how often it's called either - I'd have to dig in. I can take a look on my next sprint, or if you've got time, you're more than welcome to take a look!
There's also a fourth option - listening for nick changes if they're broadcast over XMPP or via hipchat's apis.
Broadly though, I'd expect that doing some sort of polling where we update the roster every hour or so will handle most use cases. Would that cover yours?
Polling for cache updates would also be complete for me and a little bit of stale cache won't cause too much issue.
Avoiding staleness at all via acting directly on name change notifications would be optimal, but I have no clue if that's broadcast or not. If so, that would absolutely be the best solution, but I've just assumed that it didn't exist.
Cool. Polling should be simple - in fact, you could probably just write a will plugin using periodic that does it.
As for the broadcasts, doesn't look like it at first glance via the api docs.
Want to give a plugin a shot?
Polling in a periodic in a plugin is certainly doable and I have some code which is getting close to complete for it.
I have access to the "will_roster" value in redis (or the reference held in internal_roster) for reading and modifying, and have easy access to the get_hipchat_user(...) method for getting up to date information from the server.
The only remaining questions are the fiddly bits.
- Should the code for this live in the roster mixin called by a periodic 1-liner plugin, or should it live entirely in the plugin? It feels somewhat dirty to touch "will_roster" from a plugin.
- Do you care about will saying anything at all when it updates, or is it completely silent/backgrounded?
- Given that rate limitation is a concern, periodically spiking requests seems less than ideal. Is it worth it to try to spread out the load over time in any way?
Ouch, that rate-limiting is way more severe than I anticipated.
Per poll, I do N individual user requests. My N is small enough that one poll worked fine, but trying to run it twice in quick succession (and there was probably noise on the bot otherwise) hung the bot waiting on a response from the server. Killing and restarting the bot also is unsuccessful as the first thing it does is try to get room info while it's still on a server hatelist.
If N is large, I would imagine it would never finish even a single refresh, so it's probably going to need to be spread out somehow.
Using bits from the jid or hipchat_id as a bucket_id would be decent, ex.
    @minutely
    def foo():
        for jid in jids:
            if current_minute == int(jid.split("_")[1]) % 60:
                process...
I'll see if I can get this into a PR with example code to start tracking it there and stop mucking up the issue comments.
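To make the bucketing idea above a little more concrete, here is a small runnable sketch. It assumes jids look like "1234_5678@chat.hipchat.com" (so the domain has to be stripped before converting the second number to an int); bucket_for and jids_due are hypothetical helper names, not part of will.

```python
def bucket_for(jid, buckets=60):
    """Map a jid to a bucket in [0, buckets) using the number after the '_'."""
    local_part = jid.split("@")[0]          # drop the "@domain" part, if any
    user_number = int(local_part.split("_")[1])
    return user_number % buckets


def jids_due(jids, current_minute, buckets=60):
    """Return only the jids whose bucket matches the current minute."""
    return [jid for jid in jids
            if bucket_for(jid, buckets) == current_minute % buckets]


# Hypothetical jids: only the second one falls into minute 1.
jids = ["1234_5678@chat.hipchat.com", "1234_61@chat.hipchat.com"]
print(jids_due(jids, current_minute=1))  # ['1234_61@chat.hipchat.com']
```

Called once a minute, this would touch each user roughly once an hour, so the per-minute request count is about N/60 instead of N, provided the user numbers are spread fairly evenly.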
Yeah, the rate limits are crazy harsh.
Answers to questions - going a bit faster than I like right now, so please speak up if I've missed something:
> Should the code for this live in the roster mixin called by a periodic 1-liner plugin, or should it live entirely in the plugin? It feels somewhat dirty to touch "will_roster" from a plugin.
I'd be +1 to living in the mixin, and then have the plugin call it.
> Do you care about will saying anything at all when it updates, or is it completely silent/backgrounded?
Silent is ideal, to me. The largest output I'd expect is a logger.info().
> Given that rate limitation is a concern, periodically spiking requests seems less than ideal. Is it worth it to try to spread out the load over time in any way?
+1 as you found out. However, if you're just checking handle (not jid), there should be a bulk endpoint, if I'm not mistaken.
Ping if you don't find it, and when I've got a few cycles, I'll dig it out!
Thanks for your patience with the short message!
I restarted will many times, but the will_roster still has the stale data. Can I restart the redis daemon on the server hosting will, or can I just delete the will_roster? If I delete the will_roster key, will it be populated the next time I restart will?
Hey @satish-chef - any more information you've got would be helpful.
Including:
- What version of Will are you running?
- What backends (if you're using 2.x) are you using?
- When you say that the roster still has stale data, what leads you to that conclusion? Any error messages, source code, and examples you can paste would be really helpful.
Thanks much for helping get this figured out!
Hello @skoczen , Thanks for your reply. Here are my answers:
- From will/__init__.py, I see that the version is 0.9.3.
- We are using Python version 2.7.8.
- The will_roster didn't have the updated @mention_name in it for many Hipchat users. I searched the github issues and found this link. At the beginning, I manually changed the @mention_name for myself in the will_roster key, but that caused the will bot to stop responding to anyone. Later on I truncated the will_roster key entirely, but that didn't work as well. Since the issue was blocking our developers, I deleted the will_roster key from the redis database and restarted will, and it came up fine. It seems that the solution for this issue is to delete the will_roster key from the redis database weekly or monthly (and restart the will bot) so that the users' @mention_name gets updated in the will_roster key.
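For reference, the delete-the-key workaround described above could look roughly like the sketch below. It assumes the roster lives under the literal key "will_roster" and that the client exposes a redis-py style delete(*names) method returning the number of keys removed; purge_roster and InMemoryClient are hypothetical names used only so the example runs without a redis server.

```python
def purge_roster(client, key="will_roster"):
    """Delete the cached roster key; return True if a key was removed."""
    return client.delete(key) > 0


class InMemoryClient(object):
    """Stand-in for a redis client so this sketch runs without a server."""

    def __init__(self, data):
        self.data = dict(data)

    def delete(self, *names):
        # Mimic redis DEL: report how many of the named keys existed.
        return sum(1 for name in names
                   if self.data.pop(name, None) is not None)


client = InMemoryClient({"will_roster": b"stale"})
print(purge_roster(client))  # True: the key existed and was removed
print(purge_roster(client))  # False: nothing left to delete
```

Against a real server, a redis-py client instance would presumably be passed in place of the stand-in, with will stopped first and restarted afterwards so the roster is rebuilt from scratch.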
Hey @satish-chef , great, thanks for all the information.
Keeping the hipchat roster updated while not running into hipchat's rate limits has been an ongoing work in progress. The ideal place we'd like to land is having Will keep an updated list (refreshed something like every minute or so) - or as close to that as possible. Having to purge a key and restart the bot isn't a real working solution. :)
From my understanding, though, this is largely working for people in 1.x - there are more than enough Will installs in the wild that if roster was staying massively stale, I'd expect to hear about it.
That said, you're also on a significantly older version, and I do wonder if this would still happen for you in 2.x (or even 1.x) (and if it did, that's where I'd work on putting in a fix anyhow.) 0.9.3 was released over two years ago, and quite a bit has happened since then, including updates that address the area you're having trouble with.
What are the chances you'd be able to update your version of Will to something newer? Do note that 2.x is a big change, and might take a few minutes of renaming environment variables/etc (there are docs.) It's also a relatively recent release with big changes, so there might be some bugs that snuck through.
But - in the big picture where Atlassian is killing HipChat and moving people to Stride (and with the concerns and questions that are coming with their process), 2.x is probably a good call, since it would let you keep using your Will, even if you have to switch to another chat system.
Let me know if an upgrade is possible for you!
Cheers, -Steven
@skoczen Thanks for the detailed information.
> The ideal place we'd like to land is having Will keep an updated list (refreshed something like every minute or so) - or as close to that as possible
I want to know how this is possible, and what code hack would be required to do this.
We are thinking of moving to Slack pretty soon.
I guess the big question for me is - if we upgrade to 2.x, will all of the will bot's functionality that used to work with Hipchat work fine with Slack?
@satish-chef - for sure!
> I guess the big question for me is - if we upgrade to 2.x, will all of the will bot's functionality that used to work with Hipchat work fine with Slack?
Yep - that's exactly why I wrote 2.x. :)
If there are specific things you're doing that aren't supported, open an issue, and they'll get fixed.
Also, Will 2 supports multiple backends - so you can actually just run the same will in both HipChat and Slack at the same time, test it out, and find any issues.
@skoczen Can you please share how to refresh the will_roster key with the will bot code that I currently have?
> The ideal place we'd like to land is having Will keep an updated list (refreshed something like every minute or so) - or as close to that as possible
@satish-chef what version are you using? On 2.x, roster is updated based on the platform.
@skoczen The version of the will bot that I am using is 0.9.3.
@satish-chef to get it working on 0.9.3, you're going to have to write it yourself - but a much, much, much easier and better way is to upgrade to 1.x, or ideally 2.x.
If there are no further comments, I am going to close out this issue, since it sounds resolved in 2.x.