pyflakes icon indicating copy to clipboard operation
pyflakes copied to clipboard

Subtle output flushing issue in Python 3.4

Open timabbott opened this issue 8 years ago • 11 comments

I just tracked down and annoying behavior change that we got in using Pyflakes when we upgraded from Python 2 to Python 3 in the Zulip project. It's related to interleaving stdout and stderr from pyflakes.

Basically we have a little pyflakes wrapper, check_pyflakes, that excludes certain warnings we're not interested in, by sending stdout and stderr to a pipe. It suddenly stopped filtering lines correctly when we switched to Python 3, because we were interleaving the stdout/stderr output from pyflakes, and apparently in Python 3, the newline characters in the pyflakes output got interleaved wrong, and so the filtering filtered the wrong thing. i.e. with code like p = subprocess.Pipe(["pyflakes", ...], stdout=subprocess.PIPE, stderr=subprocess.STDOUT); for ln in p.readlines(): print(p), you'll end up with lines of pyflakes output like this:

b"zproject/settings.py:95: 'from .prod_settings import *' used; unable to detect undefined nameszerver/views/users.py:49:39: invalid syntax\n"

(there should have been a \n in nameszerver).

Here's the commit I just merged into Zulip working around this issue, with details on the specifics: https://github.com/zulip/zulip/commit/6de77db93fe20d747541309d952035f070a157f6

Since I have found that needing to add a sys.stdout.flush() is a common type of new bug when switching to Python 3, I figured I'd report this, since I think the reason this stopped working is that Pyflakes needs such a statement added somewhere, and fixing this might save other folks moving to Python 3 some pain.

I'm happy to provide more details on how to reproduce, but I think it should be easy to do with the subprocess invocation that we have in the pre-6de77db9 world.

Thanks for a great tool, and I hope this report helps!

timabbott avatar Aug 29 '17 17:08 timabbott

I didn't even realize that pyflakes was outputting warnings to standard out and errors to standard error. Weird.

myint avatar Aug 31 '17 01:08 myint

I don't think pyflakes has a distinction between "warnings" and "errors". It would be helpful to know what version of pyflakes you are using, and more specific info on reproduction.

bitglue avatar Aug 31 '17 11:08 bitglue

I think currently Reporter.flake() goes to standard out and Reporter.syntaxError() goes to standard error. Though, I guess I'm not sure how to trigger both. So a code example from @timabbott would be useful.

myint avatar Sep 01 '17 01:09 myint

This was with the lastest pyflakes release (1.6.0.) You can check out the zulip/zulip commit just before the workaround fix commit that I linked above, and do the following:

  • Follow our development environment setup guide at zulip.readthedocs.io/en/latest/dev-overview.html. This might take 20 minutes and will make a Vagrant VM guest with a virtualenv containing Pyflakes and the other dependencies for our linter. It's mostly just network time downloading things. Or you can probably do it faster by just using pip install until tools/lint works.
  • Modify any file to add a pyflakes syntax error (I think I added a newline in the middle of a random if statement in zerver/views/users.py)
  • Run tools/lint. I was able to see it running just tools/lint zerver/views/ (aka just with 1 directory).

It's completely deterministic, so easy to reproduce from that perspective.

If someone is willing to guarantee they'll actually work on debugging this, I can probably extract the tools/linter_lib/pyflakes.py code into a 10-line Python script you can run with a fresh zulip/zulip checkout with a somewhat minimal list of files demonstrating the problem, but I don't want to do that unless someone is actually going to debug it afterwards, since it'll take some time and I'm super busy right now :).

timabbott avatar Sep 01 '17 02:09 timabbott

The willingness of other people to work on the issue is proportional to the effort you put into make a clear, minimal, and easy reproduction scenario. I'm super busy also.

bitglue avatar Sep 01 '17 14:09 bitglue

Oh, I see. You are dealing with multiple files.

  • File A -> Syntax error to stderr.
  • File B -> Flake warnings to stdout.

myint avatar Sep 01 '17 14:09 myint

foo.py:

#!/usr/bin/env python3

import subprocess

process = subprocess.Popen(['pyflakes', '.'],
                           stdout=subprocess.PIPE,
                           stderr=subprocess.STDOUT)
stdout, stderr = process.communicate()

print(stdout.decode())
$ git clone https://github.com/zulip/zulip
$ cd zulip
$ echo / >> zerver/views/users.py
$ ./foo.py | grep syntax
$ ./zilencer/views.py:16: 'zerver.models.UserProfile' imported but unused./zerver/views/users.py:420:1: invalid syntax

myint avatar Sep 01 '17 14:09 myint

Nice, thanks for doing that extraction @myint!

timabbott avatar Sep 01 '17 15:09 timabbott

This behavior seems to originate at 263387a7. I haven't dug yet into the reasons for it, but the notion of an "errorStream" and a "warningStream" seems wrong to me. As a design principle pyflakes doesn't complain about things unless they are almost certainly errors, so there aren't any warnings. (Except possibly the one about import *, but that doesn't seem sufficient to justify bifurcating the output.)

bitglue avatar Sep 01 '17 15:09 bitglue

The distinction between warnings (stdout) and errors (stderr) seems to go as far back as version 0.2 (https://github.com/PyCQA/pyflakes/commit/4fa714a29686be4f31845d5f31a79efa0e692662#diff-3f42f62f5489dcb22e5da7c982fa02c5R16):

def check(codeString, filename):
    try:
        tree = compiler.parse(codeString)
    except (SyntaxError, IndentationError):
        value = sys.exc_info()[1]
        (lineno, offset, line) = value[1][1:]
        if line.endswith("\n"):
            line = line[:-1]
        print >> sys.stderr, 'could not compile %r:%d:' % (filename, lineno)
        print >> sys.stderr, line
        print >> sys.stderr, " " * (offset-2), "^"
    else:
        w = pyflakes.Checker(tree, filename)
        w.messages.sort(lambda a, b: cmp(a.lineno, b.lineno))
        for warning in w.messages:
            print warning

myint avatar Sep 03 '17 23:09 myint

I'd be fine putting everything to stdout rather than having some things on stderr -- should be a pretty easy patch to reporter.py

asottile avatar Jul 11 '22 17:07 asottile