charliecloud
charliecloud copied to clipboard
add `pylint` to test suite
Since I started using VSCode, I've noticed that it's easy to add unused imports to our Python code without noticing. Adding a Python linter (e.g. pylint) to our test suite could help catch those instances and other potentially bad coding practices.
One thing worth noting is that pylint really doesn't like our code, e.g.
$ pylint --indent-string=" " lib/*py | wc -l
1656
Here our code is given a rating of 5.38/10 ☹️
pylint messages have a naming convention based on the following categories (seemingly in descending order of severity): Fatal, Error, Warning, Convention, Refactor, Information. The breakdown of the messages we get is as follows:
- Fatal: 0
- Error: 66
- Warning: 131
- Convention: 1333
- Refactor: 111
- Information: 0
So the vast majority of complaints pylint gives us fall under the "Convention" category. Some common messages that show up for us are:
- C0325 (style convention, probably can be ignored)
- C0209 (f-strings, introduced in 3.6 PEP 498, see also #992)
- C0116 (missing docstring)
- W0201 (attribute defined outside of
__init__, mostly inbuild.py) - W0511 (complains about every "FIXME" comment)
This is just a sample, there are obviously many more. I think step 1 for whoever takes this on is to go through the list of messages and figure out what we want to ignore and what we don't.
I propose we do the following.
General comments / notes
-
We should specify a version of Pylint we’re targeting. Debian Bullseye e.g. has 2.14 but upstream latest is 3.2.
-
Pylint may be able to replace
10_sanity.bats:'trailing whitespace'for Python code, but is the exclusion too messy? -
The Pylint message docs are overall light on rationale and occasionally contain flat-out errors, e.g. C0209 consider-using-f-string claims that f-strings are faster, citing a tweet whose immediate replies present more detailed contrary analyses. (Contrast e.g. ShellCheck docs such as SC2053.)
-
We should do Python code everywhere, not just
lib/, and including*.py.in. -
We probably want a configuration, including
py-versionspecifying 3.6 (the default is the version of Python doing the analysis).
Dispositions (what to do)
I went through the 1,641 messages in the text file above and ended up boiling it down to 75 groups with the same disposition. (Note: I would call these “warnings”, but Pylint calls them “messages”, with “warning” being a specific category of message.) These dispositions fell into the six classes below.
I’d like the initial PR for this to yield no messages. We’ll do that by fixing the easy stuff in that PR and deferring the rest.
Now
Each of these is described in an individual comment below. These should all be fixed in this issue’s PR.
-
Ignore globally forever. I just disagree with Pylint, so let’s turn off the warning for the entire project. This configuration happens in the same PR that implements Pylint.
-
Fix now. Real problems that should be fixed and are important/easy enough to do in the same PR that implements Pylint.
-
Case-by-case ignore or fix now. Each of the individual messages needs to be examined and either ignored (with a Pylint exclusion comment) or fixed in the original Pylint PR.
Later
Each of these has its own issue, linked below.
-
Enforce the opposite. I actively want the opposite of Pylint, and because Pylint is extensible, it can enforce that. I suspect that for many of these, we can use the existing inverted checker for reference.
-
Fix later. Real problems where we should do something, but it’s hard enough that a new PR is needed.
-
Unclear. I don’t know what to do.
I wonder if a good approach is to disable all the messages we get (I have a spreadsheet that can give a list), e.g. with --disable X001,X002,... or configuration, then one-by-one re-enable the messages below in this issue. After that the disabled list should be exactly what we decided to defer.
Class names (C0103 invalid-name)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L202
We just use a different naming convention. Pylint doesn’t have a pre-configured style name but it should be easy to configure via regex.
Log function names (C0103 invalid-name)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L467-L469
These functions just have special names. Probably just add ignore comments to each def.
Bad None comparison (C0121 singleton-comparison)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L785
Should be is None.
Bad type check? (C0123 unidiomatic-typecheck)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/registry.py#L88-L89
I think this is OK because this class has a weird notion of equality, but needs a little more though.
Bad range() (C0200 consider-using-enumerate)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L799-L801
These can be easily refactored to not use indexes.
Non-idiomatic class argument (C0202 bad-classmethod-argument)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L287-L288
We use class_ but it’s easy to just change to the idiomatic cls.
Bad __slots__ (C0205 single-string-used-for-slots)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L611
These should all be fixed.
Unnecessary splitting (C0207 use-maxsplit-arg)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L411
This appears to be a performance thing and we don’t use this in places where it matters. I think what we have is more readable, so let’s global ignore.
Modules too long (C0302 too-many-lines)
This probably right, but I don’t see this changing in the foreseeable future and I don’t think Pylint is the right place to enforce it. If the warnings are all solved, then the most likely way it would re-appear is that a small change pushes a module over the threshold, which is the wrong time to demand a refactor.
Global ignore.
Multiple statements per line (C0321 multiple-statements)
- [ ] done
I don’t mind this and we use it a lot:
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L50
but:
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L633-L635
or:
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L839-L842
seem confusion-prone. Let’s just stop using multiple statements per line.
Bad import position (C0413 wrong-import-position)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L26-L33
We have a good reason for this, but probably better to disable on a case-by-case basis.
Inappropriate dunder call (C2801 unnecessary-dunder-call)
- [ ] done
https://github.com/hpc/charliecloud/blob/5ceefa7040c0accb91acf8b2c99e0f7ec6a97429/lib/filesystem.py#L393
False positive; we’re implementing os.PathLike. Suppress case-by-case.
Bad except order (E701 bad-except-order)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L751-L759
This is a real bug that should be fixed.
Nonexistent instance member access in external classes (E1101 no-member)
- [ ] done
This happens twice:
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L384-L385
Genuinely wrong and will crash the error message.
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L737
This one seems like if it’s not present it shouldn't work at all? Investigate.
Non-iterable iterated (E1133 not-an-iterable)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/registry.py#L118
The specific error is a false positive, but it’s a property that need @abc.abstractmethod I think.
Bad % (E1305 too-many-format-args)
- [ ] done
https://github.com/hpc/charliecloud/blob/5ceefa7040c0accb91acf8b2c99e0f7ec6a97429/lib/filesystem.py#L763
Yep, that's wrong.
Merge isinstance() (R1701 consider-merging-isintance)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L77-L79
Should probably fix.
FP complaints about return on ch.FATAL (R1710 inconsistent-return-statements)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L648-L653
This is a FP because ch.FATAL doesn’t return (it goes into an exception procedure that kills the program). Can we teach Pylint that it doesn’t return?
Useless return complaints (R1711 useless-return)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L1387-L1389
The return is indeed useless but in this particular case I like it there because related classes’ commit() does return something meaningful, which also might be None. Perhaps looks at these and ignore on a case-by-case basis?
Built-in exit() re-defined (R1722 consider-using-sys-exit, W0622 redefined-builtin)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L638-L639
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L661-L664
This is the same problem. Rename our exit(), and see if we are calling sys.exit() anywhere we should be using our own exit().
Use error-checking ch.open_() (R1732 consider-using-with)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L1108
with isn’t a good match for Charliecloud because the key operations are wrapped so they can’t fail. That is, I don’t think we should take the advice, but the flagged code should probably be using fs.Path.open() instead and the warning should be left enabled.
Bad keyword argument defaults (W0102 dangerous-default-value)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L528-L529
I agree. Fix.
Needless pass (W0107 unnecessary pass)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build.py#L449-L452
IMO clearer this way. Ignore globally.
FP on self.__class__ (W0212 protected-access)
- [ ] done
https://github.com/hpc/charliecloud/blob/5ceefa7040c0accb91acf8b2c99e0f7ec6a97429/lib/filesystem.py#L814
Technically this a false positive — we’re accessing a private class (not instance) member — but going though self.__class__ seems weird. Why not just self?
Superclass __init__() not called (W0231 super-init-not-called)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L1374-L1377
This is on purpose b/c the inheritance is a bit strange. I’d like to leave it enabled and justify exceptions on a case-by-base basis.
Superclass argument renamed (W0237 arguments-renamed)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/charliecloud.py#L286
Personally I think value is a bad name but this is probably right, less confusing if we use the same name.
Method that only delegates back to parent (W0246 useless-parent-delegation)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/image.py#L928-L929
Indeed this method seems to be doing nothing. Solution is probably delete it regardless, but I really wonder why.
Bad indentation (W0311 bad-indentation)
- [ ] done
https://github.com/hpc/charliecloud/blob/d67937332094a7d8de8e9bd2e112d30d117c50c0/lib/build_cache.py#L827-L829
This whole method is indented too much, probably due to being moved at some point.