tornado icon indicating copy to clipboard operation
tornado copied to clipboard

Autoreload crashes when using __main__.py without a module

Open bdoms opened this issue 4 years ago • 3 comments

Hi all, love tornado, but I have this issue I've been dealing with for a while and figured it deserved a write up. I looked through all the previous autoreload issues and I don't think it's been reported before. Here's the situation:

I'm running Python 3.7.5 and Tornado 6.0.3. I have a tornado app I run with this command:

python mypackage/myapp

There is a mypackage/myapp/__main__.py file which is where the app is started from. Everything works fine, except when I trigger an autoreload. Then the server crashes with this message:

Error while finding module specification for '__main__' (ValueError: __main__.__spec__ is None)

It looks like autoreload is trying to run my code as a module, even though it's not a module. I believe this is because the autoreload script assumes that just because a spec exists that means the code must be a module. I think this is a bad assumption. Python autogenerates a spec that looks like this when I print it:

ModuleSpec(name='__main__', loader=<_frozen_importlib_external.SourceFileLoader object at 0x123abc>, origin='mypackage/myapp/__main__.py')

So this check for if spec: passes: https://github.com/tornadoweb/tornado/blob/master/tornado/autoreload.py#L230 but then when the the autoreload attempt happens it crashes, because there is not actually a module or spec anywhere. I changed that one line to this:

if spec and _original_argv and "-m" in _original_argv:

And everything works as expected, including autoreload, and calling my app every way I could think of. I'm not sure if that change is the best one to make, but it does do a better job of actually confirming that the original command was for a module.

For what it's worth, there is a similar check farther down in the file: https://github.com/tornadoweb/tornado/blob/master/tornado/autoreload.py#L298 that creates a mode = "module" variable, and I'm wondering if a better fix is to make that a global and then check for if mode == "module" rather than just if spec.

I think this bug might've been caused by the fix for a similar but different issue: https://github.com/tornadoweb/tornado/issues/2044

For anyone finding this that wants a workaround: I did get it working with a small change, which is hacking in your own spec object. I didn't want to chance anything being off in production (not needed, but I'm paranoid), so I check for a debug flag passed in and only do this in that situation:

from tornado.options import define, options

if __name__ == "__main__":
    define('debug', default=False, help='enable debug mode')

    options.parse_command_line()

    if options.debug:

        # this is the part you need to add for autoreload to still work even with this bug present
        class Spec(object):
            name = 'mypackage.myapp'
        __spec__ = Spec()

Hope that helps someone out there. This bug is really hard to search for.

bdoms avatar May 06 '20 18:05 bdoms

Oh, I had no idea it was even possible to execute a __main__.py by passing in a directory; i thought this only worked with -m and a package name.

Note that there are two potential uses of -m. In a command line like python -m tornado.autoreload -m myapp, the first -m loads tornado.autoreload as a module, and the second one loads myapp as a module. It's possible to load autoreload as a module but myapp.py as a script. The check at line 298 is for this latter -m, so it wouldn't be correct to use it outside of autoreload's __name__=='__main__' block.

And frustratingly, the first -m is removed from sys.argv by python before we can see it (comment).

I got the idea for using __main__.__spec__ from this issue. It looks like multiprocessing has some more elaborate logic for handling cases like this, although I can't tell from a quick look at this commit how exactly you're supposed to detect __main__ run via -m from __main__ run without -m.

bdarnell avatar May 11 '20 19:05 bdarnell

After more research and investigation I believe you're right that inspecting __main__.__spec__ is probably the best way to go at the moment. (Why haven't they implemented an immutable sys.argv like was suggested 8 years ago?!?!)

I discovered that the name is always set to exactly "__main__" when invoked without the -m. With -m I could never replicate that, it's always of the form "module.__main__". I even tried running python -m . and Python doesn't allow that, it generates this error: Relative module names not supported. I think modules must have names, and if that's true, then we can use this property to form a solution.

Changing this line https://github.com/tornadoweb/tornado/blob/master/tornado/autoreload.py#L230 to:

if spec and spec.name != "__main__":

Solved the issue for me and also still worked with all the various invocations I tried, both using modules and not.

bdoms avatar May 29 '20 20:05 bdoms

Bump

brittbinler avatar Aug 18 '22 10:08 brittbinler