sopel
sopel copied to clipboard
reload: Reloading packaged modules doesn't work correctly
Remaining known issues with reloading module packages:
- [ ] Removed methods are not cleaned from the bot
- [x] Callables using
@interval
might still be duplicated on reload
These are based on further discussion of this in #1399 and elsewhere (IRC).
Original comment (mostly fixed) from @maxpowa appears below. — @dgw
~~Doesn't reload the actual module and only reloads the __init__.py
. Probably needs some tweaks to loader.py
and reload.py
. Because it only reloads the __init__.py
, when the module loads after the reload, there will be two instances of the package running. Problem exists with both sopel_modules
namespace packages and packages in ~/.sopel/modules
.~~
I have reproduced this on python 2.7, 3.4, and 3.5.
Is there a workaround for this issue? It's making it impossible to use Sopel in production environments. I'm considering merging my out of tree modules into the Sopel source code to prevent it.
It isn't limited to reloads, reconnecting to the server after a network problem causes this issue as well so you have absolutely no idea of the moment when your bot will go haywire.
I guess my bot is blessedly connected to a stable ZNC instance on the same host, so I've never observed this issue after a reconnection (because Sopel never has to reconnect).
Frankly, I'm surprised that a bug like this even made it into a stable release. But we also have other things that should have been caught before release, such as #1136 or (arguably) #1080. Sopel might do with more stringent testing, and more code review, but that would require more maintainers to step up. I'd be willing to help out with that, and with testing releases in advance if the project started having pre-release builds. (@maxpowa and @embolalia, I can write up some thoughts on development/release process in separate issue, or find a time when we can chat on IRC, if you're interested.)
I agree on the points about more stringent testing & code review, something in that direction would be running our tests on commits and PR branches. Unfortunately I don't have permission to set up travis on this repo, but if @embolalia or @elad661 were to, there's already a basic test suite that could be improved upon (it has around 50% coverage, not including modules, at the time of this comment).
This is a blocker for starting #1291.
Is there some kind of a flowchart or a textual explanation of how the reloading mechanism works? I'd like to understand the ideology of how this is supposed to work, and perhaps try to contribute.
I'd have to make one, I think. AFAIK there's no real documentation except the code & comments therein. But the idea is pretty simple, just register callables and module shutdown functions on load & unregister them on unload. Reload = unload followed immediately by load.
The issue is that existing triggers don't get removed for packaged modules, probably because the loader is only looking at a single file, the one reload
calls out when invoked: __init__.py
. When unregistering callables, the bot needs to make sure all callables from all module files (not just __init__.py
) are handled. Or at least, that's my impression from what I've poked at tonight.
Hey @ralienpp, can you test out 142cb4d for me? It all appears to work locally, but I'm developing on py3.6 and getting your input from the py2 side would be really helpful. :)
I'd love to, but how could I?
-
git clone https://github.com/sopel-irc/sopel.git
-
git checkout 142cb4d3559b33c692d021a74abcba75d22f96d3
This results in: fatal: reference is not a tree: 142cb4d3559b33c692d021a74abcba75d22f96d3
Oh, sorry about that. Github masks the source fork when commits are referenced like that, and I didn't push the branch to this repo yet. Add dgw/sopel
as a remote and fetch from it, then you'll find that commit in the 1056-fix-with-recursion
branch.
The test on Python 2.7.12 was successful, I reloaded the module a couple of times via .reload <module name>
and the behaviour was as expected. This was very satisfying, I must admit :-)
On 2.7.3 the installation itself fails with
Running setup.py egg_info for package from file:///home/rama/projects/sopel
error in sopel setup command: 'install_requires' must be a string or list of strings containing valid project/version requirement specifiers
It was quite satisfying to finally fix, too. :-) There's still testing to be done regarding how the new reloading code handles updated methods (seems to work), removed methods (not tested yet), and added methods (ditto), whether in the __init__.py
file or a submodule, but I'm optimistic.
I'm not sure how far back Sopel's 2.7 support is meant to go, but I don't think that error should happen. setup.py
does read requirements.txt
into a list. It could be choking on the environment markers added for ipython (in #1310). What version of setuptools
is installed with your Python 2.7.3 distribution?
distribute 0.6.24
Something like this in setup.py
would do the trick:
if sys.version_info[0] < 3:
requires.append('backports.ssl_match_hostname')
# remove entries with adnotations that are not supported by older versions
# of distutils
requires = [item for item in requires if not item.startswith('ipython')]
requires.append('ipython<6')
This way requirements.txt
can stay unchanged, and there would be no need to create a separate requirements file for older versions of Python.
distribute
is old and deprecated (merged back into setuptools
0.7, which is now at version 20 or 30 something), and you shouldn't be using it… Remove it and install the latest version of setuptools
you can use.
What prevents this update from being part of the current release? Is there something I could to do make it happen?
The tests above were successful and I would love to be able to upgrade my Sopel instances to this version.
Depending on how you're installing Sopel, you can run tag v6.5.3 with PR #1314 applied to it until the next official release. There are a bunch of fixes I want to get done before 6.6.0, and since most of them have already been broken for ages there's no particular reason to rush and push out a new release every time something's fixed.
(I'm also hoping to get a chance to work with #1053's author, because it touches reloading too.)
wolfram alpha module module is affected as well. A .reload will make it respond +1 times
@Duckle29 Of course. The wolfram module is based on the cookiecutter template. The replacement weather and imdb modules @RustyBower put up will be affected, too, I expect.
Did you try #1314 at all? I could hold the release of 6.6.0 for a day or two if you want to test it out.
I just tested against #1314 and it is definitely causing all my modules to respond n+1 times
@RustyBower Merging #1314 causes all your modules to repeat? That can't be right… though I haven't tested it in some time, it did help when I was working on it myself.
~~@dgw, @RustyBower, could this be related to #1399 ? I see that @RustyBower's weather module does a from .weather import *
in the __init__.py
, that if I recall correctly, prompted me to submit #1399 (it should have built off of #1314).~~
nvm, @dgw seemed to have pinned it down
I hijacked @maxpowa's OP to list the remaining known issues, rather than closing this and opening a new ticket, so all the discussion of this very complex problem will remain in (mostly) one place.
Cannot get @interval
methods to duplicate. I made a simple example to test:
def setup(bot):
global interval_count
interval_count = 0
@sopel.module.interval(1)
def interval_callable(bot, extra='a'):
global interval_count
interval_count += 1
bot.say('extra: {}; count: {}'.format(extra, interval_count), '#sopel-dev')
This @interval
was duplicated neither when reloading this specific module nor a full .reload
. The only duplication that was observed was due to the 1 second interval causing my bouncer to buffer messages to the server (1 msg / 2 seconds limit led to quite a large buffer of unsent @interval
-ed messages), resulting in what appeared to be the old version of the method still being called for a while after .reload
Clearing all @interval
-ed methods should certainly not be the preferred solution for unregister
-ing a single @interval
-ed callable; this is the real issue, not duplication.
I can confirm that del_job
from #1053 will fix the issue of all @interval
'd callables being cleared on a single module reload.
@HumorBaby It's looking more and more like #1053 won't ever get merged, because its author has been silent for quite a while now. If you want to adopt some elements from that (cherry-pick or whatever), feel free. I'm sure @calebj wouldn't mind if you pulled in some stuff with his name on the commit.
Just a note here: the [at]interval thing and the JobScheduler have been fixed in Sopel 7.x.
Punting this to 8.0: We need tools that aren't available in older Python versions to fix this, and 8.0 will drop support for all the versions that block implementing saner reload.
De-milestoning for the same reason as #871: No further changes to the plugin system are planned during 8.x development.