sopel
Bot does not rejoin channels reliably after a netsplit
It works most of the time, but often enough it doesn't. Maybe this is the responsibility of the network? In any case, someone more in tune with the IRC protocol might have an elegant way of making sure the bot stays in its channel(s).
Even "someone more in tune with the IRC protocol" can't do anything about this without raw logs of what's happening between Sopel and the IRC server.
The bot recognises that something is wrong (its PINGs go unanswered) and initiates a reconnect:
>>1528894738.2752254 PING irc.efnet.nl
<<1528894738.2765565 :irc.efnet.nl PONG irc.efnet.nl :irc.efnet.nl
>>1528894858.3953335 PING irc.efnet.nl
>>1528894918.4552155 PING irc.efnet.nl
>>1528895021.7282913 CAP LS 302
>>1528895021.7284808 NICK botnickname
[server connection stuff]
<<1528895025.5272648 :irc.efnet.nl 001 botnickname :Welcome to the EFNet Internet Relay Chat Network botnickname
>>1528895025.527987 MODE botnickname +B
>>1528895025.528177 JOIN #channelname
[server connection stuff, MOTD]
<<1528895025.5832474 :irc.efnet.nl 437 botnickname #channelname :Nick/channel is temporarily unavailable
[...]
>>1528896821.9092314 PRIVMSG #channelname :example message
<<1528896821.9106731 :irc.efnet.nl 404 botnickname #channelname :Cannot send to channel
The expected message after the JOIN would be:
<<1528925459.723694 :[email protected] JOIN :#channelname
Due to the state of the network, joining a channel is not possible at the time of the connection. Should the bot retry periodically to join the channels?
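As a rough illustration of what "retry periodically" could mean, here is a minimal sketch of a capped backoff schedule for JOIN retries. It is not taken from Sopel; the function name `retry_delays` and the default values (30 s base, doubling, 10 min cap) are assumptions chosen only for the example.

```python
import itertools


def retry_delays(base=30.0, factor=2.0, cap=600.0):
    """Yield the number of seconds to wait before each successive JOIN retry.

    Hypothetical helper: the defaults are illustrative, not Sopel settings.
    """
    delay = min(base, cap)
    while True:
        yield delay
        # Grow the wait after every failed attempt, but never beyond the cap.
        delay = min(delay * factor, cap)


# First few waits with the default settings.
first_waits = list(itertools.islice(retry_delays(), 7))
print(first_waits)
```

Backing off keeps the bot from hammering a server that is still split, while the cap ensures it keeps checking at a bounded interval until the channel becomes available again.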
I think EFNet is one of very few networks that lock channels during a netsplit. Most IRCds (AFAIK) just let users on the lost segment join channels anyway and resolve ops collisions with timestamps and/or services.
Adding this sort of logic to core doesn't seem especially worthwhile. For the majority of users, it would just waste CPU time. Once restarting (#1333) is done, a plugin could probably do it though. Or, run Sopel behind a bouncer (ZNC?) and let the bouncer handle channel joining and retries for free.
Handling numeric 437 (ERR_UNAVAILRESOURCE) shouldn't be too difficult, as it does include the nick/channel that was unavailable (so there's no need for Sopel to do a lot of complicated state tracking).
I don't think there are any situations where Sopel would receive a 437 for something that isn't a channel, but this feature would definitely need someone to commit to testing it on a network that handles netsplits this way for some time before release. I'm not in a position to do so, realistically.
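To show why little state tracking is needed, here is a minimal sketch that pulls the unavailable channel out of a raw 437 line like the one in the log above. It is plain Python, not Sopel's plugin API; the function name `unavailable_channel` is hypothetical, and it assumes the log's `<<timestamp ` prefix has already been stripped. The channel-prefix check is a precaution in case a 437 ever names a nick rather than a channel.

```python
def unavailable_channel(line):
    """Return the channel named in a 437 reply, or None for anything else.

    Hypothetical helper for illustration; expects a raw server line such as
    ":irc.efnet.nl 437 botnickname #channelname :Nick/channel is ...".
    """
    # Drop the optional ":server" prefix.
    if line.startswith(":"):
        _, _, line = line.partition(" ")
    # Split off the trailing free-text parameter, which starts with " :".
    head, _, _trailing = line.partition(" :")
    parts = head.split()
    # Expected shape: 437 <our nick> <target>
    if len(parts) < 3 or parts[0] != "437":
        return None
    target = parts[2]
    # Channel names begin with one of these characters; anything else is a nick.
    return target if target[0] in "#&+!" else None


raw = ":irc.efnet.nl 437 botnickname #channelname :Nick/channel is temporarily unavailable"
print(unavailable_channel(raw))
```

A handler built on this could queue the returned channel for a later JOIN attempt and drop it from the queue once the server echoes the JOIN back.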
Punting relatively minor enhancement with an existing workaround.
I suggest punting even further, to Sopel 8.x.
Belatedly, I agree.
Let's consider this part of the asyncio rewrite's shakedown, to be revisited when work starts on 8.1.