build: GitHub Action to lint python files with pyflakes
pyflakes is probably the easiest Python linter to get started with:
- it's focused on Python content, not style.
- it runs quickly with low resource requirements. This is due to the fact that it does some light static analysis without actually executing any of the code under review. That limits its scope, but also means you don't have to set up on a machine with the right hardware and install all the heaps of binary dependencies.
I haven't used this specific reviewdog action before, but it seems like it's how presenting pyflakes results in a PR request should work?
not sure how to PR request changes to GitHub workflows either. I see the check hasn't run on this commit itself, presumably because randos can't submit PR requests to run random GitHub actions with your GitHub token.
Running Locally
To run pyflakes locally:
pip install pyflakes
pyflakes .
I haven't added this to any environment or requirements files yet, as I'm not sure where to put dev-dependencies. (Packages that support development workflow but aren't required for users who just want to check out the code and run things from scripts/.)
Ah! I found the place where it told me about a workflow failure. For some reason it was in email instead of anything attached to this PR.
At least now this PR says "workflow awaiting approval," which is progress.
The motivation for this was seeing errors like hfFolder or zipfolder not defined: those are the types of things that pyflakes is very effective at catching quickly.
...
or merge conflict >>>>>>>> markers being committed.
@mauwii I think you need to review this.
@lstein first question would be if you already have been thinking about a Linter, f.E. Flake8?
The motivation for this was seeing errors like
hfFolderorzipfolder not defined: those are the types of things that pyflakes is very effective at catching quickly.
If I am not wrong, the zipfolder error appears when you try to run invoke.py without stable-diffusion models available.
What should also be taken into consideration: When implemention a action like this, there should also be a pre-commit-hook which prevents pushing changes that would validate configured rules.
@lstein first question would be if you already have been thinking about a Linter, f.E. Flake8?
I had not thought of using a linter until this PR came in. It seems like it would be useful for catching code issues, but I have no experience with how such tools fit in with the development workflow. Can you advise?
The motivation for this was seeing errors like
hfFolderorzipfolder not defined: those are the types of things that pyflakes is very effective at catching quickly....
or merge conflict
>>>>>>>>markers being committed.
My bad on both counts- I was not working within the code review policies. I’ve learned my lesson.
To be honest: I never integrated a linter in a already grown project, always used them from the beginning, so it is hard to give you good advices. And after running a linter locally (pyflakes as well as flake8) there is a huuuuuuge amount of violations which would need to be fixed 🙈
But since the project is growing more and more it is maybe a good time to start implementing a Linter now.
My bad on both counts- I was not working within the code review policies. I’ve learned my lesson.
I didn't bring those up with the intent to blame and shame anyone. Those both happened to be you, and maybe you learned how to avoid repeating those missteps in the future. But there's surely someone else who hasn't learned that yet, and none of us are immune from making mistakes when we're feeling rushed and fatigued.
The purpose of pointing out those specific incidents was to emphasize that there is an active safety hazard we can address here; the situations where pyflakes can help are not merely hypothetical. It's a tool we can add to the workflow to give developers early feedback about problems before someone else runs in to them.
It's automated and takes some of the burden off of code review -- there's no need for human reviewers to spend time marking notes of something in the automated report.
Would it be more effective if it ran even earlier, i.e. pre-commit? Sure, but it's more difficult to impose a constraint on everyone's development environment, and it's easy to do in a central place here.
Can you still make a mess if you push directly to branches and ignore the result of the GitHub Checks? You can, but at least you have an extra chance with this to maybe notice the green check is missing.
I don't feel blamed in any way and I also don't think that @lstein does 😅 But imho this topic really needs some more discussion instead of just set it live and see what happens (kind of). Are you also available in our Discord Channel?
I am. Feel free to make a thread and ping me there.
I'm not sure where to put dev-dependencies
For the pip-compile driven "unified" installer, they'd go in a 'requirements-dev.in' which would then be run through pip-compile
I'm not sure where to put dev-dependencies
For the pip-compile driven "unified" installer, they'd go in a 'requirements-dev.in' which would then be run through pip-compile
I wonder if it's worth exploring a pyproject.toml-based setup? It can define dev and non-dev dependencies separately. Plus it can include arbitrary dependency groups, platform-specific pins, style/linting settings and much more.
But since the project is growing more and more it is maybe a good time to start implementing a Linter now.
Yeah, the best time to add a linter is always yesterday, the next best time is right now ;)
Even though the codebase is mature, it should not stop this very useful effort. I'd recommend to configure pyflakes and GH Actions to only warn, not fail the build. This way the benefits of linting (such as catching merge conflict markers) are realized early on without causing a massive blocker.
I wonder if it's worth exploring a
pyproject.toml-based setup?
That's the NEXT next-gen installer :D
The effort involved in getting pip-compile to the point where we are now was sufficient to dissuade me from looking any further into pyproject stuff, but IMO it is something which we should at least evaluate
https://github.com/invoke-ai/InvokeAI/pull/1974 reminding us that we need to do this public service announcement again:
UnboundLocalError: referenced before assignment errors are 100% preventable at low cost! Use pyflakes and/or an IDE with built-in inspections.
I recommend PyCharm, but if you'd rather roll your own development environment instead of getting one that comes already assembled, consider Flycheck + flycheck-pyflakes.
When this is enabled and at a point where no errors are detected, we should consider to enable pre-commit hooks, to prevent pushing if checks already fail localy.