Autoreload crashes when using __main__.py without a module
Hi all, love tornado, but I have this issue I've been dealing with for a while and figured it deserved a write up. I looked through all the previous autoreload issues and I don't think it's been reported before. Here's the situation:
I'm running Python 3.7.5 and Tornado 6.0.3. I have a tornado app I run with this command:
python mypackage/myapp
There is a mypackage/myapp/__main__.py file which is where the app is started from. Everything works fine, except when I trigger an autoreload. Then the server crashes with this message:
Error while finding module specification for '__main__' (ValueError: __main__.__spec__ is None)
It looks like autoreload is trying to run my code as a module, even though it's not a module. I believe this is because the autoreload script assumes that just because a spec exists that means the code must be a module. I think this is a bad assumption. Python autogenerates a spec that looks like this when I print it:
ModuleSpec(name='__main__', loader=<_frozen_importlib_external.SourceFileLoader object at 0x123abc>, origin='mypackage/myapp/__main__.py')
So this check for if spec: passes: https://github.com/tornadoweb/tornado/blob/master/tornado/autoreload.py#L230 but then when the the autoreload attempt happens it crashes, because there is not actually a module or spec anywhere. I changed that one line to this:
if spec and _original_argv and "-m" in _original_argv:
And everything works as expected, including autoreload, and calling my app every way I could think of. I'm not sure if that change is the best one to make, but it does do a better job of actually confirming that the original command was for a module.
For what it's worth, there is a similar check farther down in the file: https://github.com/tornadoweb/tornado/blob/master/tornado/autoreload.py#L298 that creates a mode = "module" variable, and I'm wondering if a better fix is to make that a global and then check for if mode == "module" rather than just if spec.
I think this bug might've been caused by the fix for a similar but different issue: https://github.com/tornadoweb/tornado/issues/2044
For anyone finding this that wants a workaround: I did get it working with a small change, which is hacking in your own spec object. I didn't want to chance anything being off in production (not needed, but I'm paranoid), so I check for a debug flag passed in and only do this in that situation:
from tornado.options import define, options
if __name__ == "__main__":
define('debug', default=False, help='enable debug mode')
options.parse_command_line()
if options.debug:
# this is the part you need to add for autoreload to still work even with this bug present
class Spec(object):
name = 'mypackage.myapp'
__spec__ = Spec()
Hope that helps someone out there. This bug is really hard to search for.
Oh, I had no idea it was even possible to execute a __main__.py by passing in a directory; i thought this only worked with -m and a package name.
Note that there are two potential uses of -m. In a command line like python -m tornado.autoreload -m myapp, the first -m loads tornado.autoreload as a module, and the second one loads myapp as a module. It's possible to load autoreload as a module but myapp.py as a script. The check at line 298 is for this latter -m, so it wouldn't be correct to use it outside of autoreload's __name__=='__main__' block.
And frustratingly, the first -m is removed from sys.argv by python before we can see it (comment).
I got the idea for using __main__.__spec__ from this issue. It looks like multiprocessing has some more elaborate logic for handling cases like this, although I can't tell from a quick look at this commit how exactly you're supposed to detect __main__ run via -m from __main__ run without -m.
After more research and investigation I believe you're right that inspecting __main__.__spec__ is probably the best way to go at the moment. (Why haven't they implemented an immutable sys.argv like was suggested 8 years ago?!?!)
I discovered that the name is always set to exactly "__main__" when invoked without the -m. With -m I could never replicate that, it's always of the form "module.__main__". I even tried running python -m . and Python doesn't allow that, it generates this error: Relative module names not supported. I think modules must have names, and if that's true, then we can use this property to form a solution.
Changing this line https://github.com/tornadoweb/tornado/blob/master/tornado/autoreload.py#L230 to:
if spec and spec.name != "__main__":
Solved the issue for me and also still worked with all the various invocations I tried, both using modules and not.
Bump