pylint
pylint copied to clipboard
HELP WANTED: Help copy content from ``pylint-errors`` over to our own documentation
A longstanding issue of pylint
has been that there is little documentation about some of the messages and why they are emitted. For some messages the short message that is displayed is not enough to make explicit what needs to be changed or what is considered wrong.
With the closure of #5527 and the merge of #5934 we have now set up a system that allows us to do so!
We have also received the okay from @vald-phoenix to base this additional documentation on their wonderful pylint-errors project.
However, now we need some community help to write this documentation. If you'd like to help out and help future users of pylint
understand what pylint
is telling them, please continue reading.
How to create such pages? The process is quite straightforward.
Within the doc/data/messages
directory you can find various subdirectories. These are the letters of the alphabet. Within these directories you will find directories for each message that pylint
can emit. These directories contain the data from which these pages are build.
To help build a new page:
- Create a new directory for the message you are working on. For example
doc/data/messages/a/abstract-class-instantiated
. - Look at the page for the message in the
pylint-errors
repo. For example here. If the message is not documented yet you will need to come up with examples yourself. - Create a
good.py
document in which you include a short code example that shows of good code that won't trigger the message. You can base this onpylint-error
. - Create a
bad.py
document in which you include a short bad code example. On the line that is supposed to emit the message please add a# [abstract-class-instantiated]
comment. See these examples. - If necessary create a
pylintrc
file. Here you can include configuration options forpylint
like loading of optional checkers. Seedoc/data/messages/c/confusing-consecutive-elif/pylintrc
for example. - If necessary create a
details.rst
file. Here you can place additional details in .rst format that will be displayed on the page. Note that some of the details included on thepylint-error
pages are already included in ourMessage
orDescription
sections. - If necessary create a
related.rst
file. Here you can place .rst style links to external sources that are relevant for the message. Generally we do not include the links toTest Cases
and theIssue Tracker
thatpylint-error
provides as they can break easily. - If you're basing yourself on the work in the
pylint-errors
repo it is appreciated if you add the following line to your commits:Co-authored-by: Vladyslav Krylasov <[email protected]>
(or any of the other contributors to that project you're basing yourself on! - To make sure the code examples are correct, you can run
pytest doc/test_messages_documentation.py
. - Create a PR. A changelog entry is not necessary :)
Two examples of how to do this can be found in #5934.
To make this work somewhat manageable please do not add more than 5 messages in one PR, as otherwise it will be impossible to review the changes correctly.
The big list of messages without any documentation: As of the 2023-09-09.
- [x] i/invalid-character-carriage-return #9042
- [x] i/invalid-character-nul #9042
- [ ] t/typevar-name-incorrect-variance (need a more detailed explanation by a core maintainer)
- [ ] c/c-extension-no-member (This one is going to be hard to trigger, not for beginner)
Related pull requests: #5992, #5994, #5995, #6016, #5993, #5996, #6150, #5997, #6026, #6086, #6000, #6088, #6021, #6002, #6090, #6164, #6166, #6020, #6104, #6103, #6167, #6105, #6106, #6107, #6156, #6157, #6131, #6009, #6139, #6083, #6092, #6200, #6203, #6204, #6237, #6159, #6243, #6250, #6115, #6248, #6249, #6339, #6341, #6342, #6343, #6340, #6344, #6345, #6262, #6263, #6264, #6265, #6080, #6079, #6206, #6245, #6205, #6158, #6114, #6114, #6201, #6160, #6149, #6238, #6152, #6153, #6133, #6244, #6230, #6232, #6151, #6247, #6132, #6239, #6240, #6472, #6699, #6679, #6460, #6697, #6630, #6637, #6698, #6472, #6586, #6541, #6472, #6669, #6618, #6690, #6619, #6620, #6621, #6659, #6576, #6574, #6540, #6700, #6590, #6701, #6591, #6583, #6562, #6585 #9042
https://github.com/pylint-dev/pylint/issues/7897
(List of related pull request is not exhaustive)
I agree that having community support for this would be awesome, but I think we're not ready for that yet. In my opinion we should do a script to do the grunt work of migrating from pylint-error. Then proper reviews of the current examples when we integrate them and creation of missing examples would be super helpful.
Here's what I think we need to do following #5527, we really need community help only at a few steps:
- [ ] Merging git history of pylnt-error (or simply copy pasting) and moving all the relevant content in
doc/data/messages/original
(maintainer job) - [ ] Create a script to migrate pylint-errors format to our structure (maintainer job)
- [ ] Add and check messages by manageable batch until there is nothing left of the "original" pylint-errors files
- [ ] Add functional tests for
good.py
/bad.py
if they exists (maintainer job) - [ ] Add content when pylint-error did not have any
Here's what I think we need to do following #5527, we really need community help only at a few steps:
- [ ] Merging git history of pylnt-error (or simply copy pasting) and moving all the relevant content in
doc/data/messages/original
(maintainer job)- [ ] Create a script to migrate pylint-errors format to our structure (maintainer job)
- [ ] Add and check messages by manageable batch until there is nothing left of the "original" pylint-errors files
- [ ] Add functional tests for
good.py
/bad.py
if they exists (maintainer job)- [ ] Add content when pylint-error did not have any
Sorry, perhaps I should have discussed with you prior.
However, I don't think this is feasible. During the transfer of the two test issues I already noticed a mistake in the examples for pylint-errors
(a bad code example that was actually good code), so just migrating all data doesn't really work. Besides, the work necessary to migrate their data structure to ours isn't really worth the effort (imo). For example, the data that is in their Rationale
sections is a combination of the message that gets emitted by pylint
, the message description that is in the pylint
code and some of their own comments. Often the data that originally came from pylint
will be outdated so we also can't automate the separation of these types of data into details.rst
, related.rst
and not useful.
Basically, if we want to migrate the data we would need to put in so much effort that it is probably much better to do it by hand and just do one review for each message instead of doing multiple across many different steps. Having no code example is better than copying incorrect code examples I think, so we would also need to check the data that gets migrated in any automated step...
That's why I added the line about adding co-authorship: I do want to credit them for any inspiration we use in our documentation.
I already noticed a mistake in the examples for pylint-errors
Yes I noticed that, this is why I proposed to review chunk by chunk. If the migration script is complicated / not worth it, it make sense to do it manually. Did you check the work that need to be done to automate this, or is this a wet finger kind of thing ? I could try a proof of concept to see if it works otherwise (once 2.13 is released of course).
I already noticed a mistake in the examples for pylint-errors
Yes I noticed that, this is why I proposed to review chunk by chunk. If the migration script is complicated / not worth it, it make sense to do it manually. Did you check the work that need to be done to automate this, or is this a wet finger kind of thing ? I could try a proof of concept to see if it works otherwise (once 2.13 is released of course).
No this is wet finger. But if we're going to review by chunks anyway, why then not do this manually? I mean, anybody who is looking to contribute is free to use a script to make their PR, but I think it makes more sense to do one review for each message for a "final" page than do two or three (initial migration, update of test case, add missing content).
I think it makes more sense to do one review for each message for a "final" page than do two or three
Sure, multiple messages in the same PR would be messy, let's do "chunk of one" 😄
anybody who is looking to contribute is free to use a script to make their PR
Maybe it could even be our script and save them a lot of time. I'm going to create a proof of concept to migrate one message from pylint-error to our new doc system (so contributor just have to review/fix and create a PR) and see how it goes. You did the algorithm yourself in the description, I'm going to follow that. The "co-authored by" part in particular look like it would need to be automated to be accurate because it's not that easy to check the git history.
Maybe it could even be our script and save them a lot of time. I'm going to create a proof of concept to migrate one message from pylint-error to our new doc system (so contributor just have to review/fix and create a PR) and see how it goes. You did the algorithm yourself in the description, I'm going to follow that. The "co-authored by" part in particular look like it would need to be automated to be accurate because it's not that easy to check the git history.
Co-authors show in the GitHub interface, luckily 😄 One thing we should check and which is more difficult is that the directories have the correct names, you easily miss that on Github. Let me know if I can update the OP to add a script people can use.
Thing like astroid-error
are internal and there's nothing to do for the user. Maybe we should keep a list of such messages and treat them differently ?
Thing like
astroid-error
are internal and there's nothing to do for the user. Maybe we should keep a list of such messages and treat them differently ?
I think we could still add something in details.rst
for these messages. After this process has been done we can then make a check that each message should have at least one files in its data folder.
Question for assign-to-new-keyword
:
This message only targets async
and await
and is only emitted for Python versions < 3.7.
Should we support message specific configuration files as for the functional tests?
Ah, yeah I hoped to avoid that... I think we will drop that message in a couple of weeks so perhaps not worth to add documentation for it.
We only run the check on Python 3.8 though. Of the top of my head I don't know any messages that do/don't emit for 3.8 but do for others. But we should probably exclude those? Perhaps not with configuration files but with a simple list in the extension which can be updated when we switch the base python for our CI to another version?
Running the documentation check on all versions seems overkill.
If we'd be willing to add support for configuration files, we could maybe make use of the py-version
option.
That way we can still only run the tests with one Python version.
And we could also catch some bugs where the py-version
is not taken into account (assign-to-new-keyword
would be one, although I agree that this particular message is probably not worth working if it will be dropped soon).
If we'd be willing to add support for configuration files, we could maybe make use of the
py-version
option.That way we can still only run the tests with one Python version.
And we could also catch some bugs where the
py-version
is not taken into account (assign-to-new-keyword
would be one, although I agree that this particular message is probably not worth working if it will be dropped soon).
That's good idea. I like that! Do you want to work on this?
Yes, I will take a look.
Hello, I make a script that converts error documentation from pylint-errors
to bad.py
, good.py
, related.rst
, and details.rst
in the appropriate folder, But it's very bad because I am a beginner, for most case, it will work but if the documentation missing some element it will just print error on some file but the majority of the time it will work (not guaranteed 😵💫 )
import pathlib
import re
import os
location = ".\\errors\\classes"
pattern_identifier = re.compile(r"((\#{2}\s)(\w\d{4})\s\((.*?)\))") # detect ## E0001 (error-name-spam)
code_block = re.compile(r"((```python\n)([\w\W]*?)(```))") # detect bad and good code between ``` (triple backtick)
rationale = re.compile(
r"(\#{3}\sRationale:\n\n)([\w\W]*?)(\n\#{3}\sRelated\sresources:\n)"
) # This just detect description so not necessary anymore
related = re.compile(
r"((\[((\w)*?)\])(\(((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z]){2,6}([a-zA-Z0-9\.\&\/\?\:@\-_=#])*\)))"
) # This detect Testcase and Issue Tracker link
for path in pathlib.Path(location).iterdir():
if path.is_file():
current_file = open(path, "r")
test_string = current_file.read()
current_file.close()
result_identifier = pattern_identifier.search(test_string)
result_code = code_block.findall(test_string)
result_rationale = rationale.search(test_string)
result_related = related.findall(test_string)
folder_name = result_identifier.group(4)
folder_initial = folder_name[0]
if not os.path.exists(f".\\result\\{folder_initial}"):
os.makedirs(f".\\result\\{folder_initial}")
if not os.path.exists(f".\\result\\{folder_initial}\\{folder_name}"):
os.makedirs(f".\\result\\{folder_initial}\\{folder_name}")
# Write good.py code
write_good = open(
f".\\result\\{folder_initial}\\{folder_name}\\good.py",
"w",
encoding="utf-8",
)
if result_code == []:
write_good.write("Not Tested")
print(f"Error on {folder_name} - good.py")
write_good.close()
else:
write_good.write(result_code[1][2])
write_good.close()
# Write bad.py code
write_bad = open(
f".\\result\\{folder_initial}\\{folder_name}\\bad.py",
"w",
encoding="utf-8",
)
if result_code == []:
write_bad.write("Not Tested")
print(f"Error on {folder_name} - bad.py")
write_bad.close()
else:
write_bad.write(result_code[0][2])
write_bad.close()
# Write details.rst
write_detail = open(
f".\\result\\{folder_initial}\\{folder_name}\\details.rst",
"w",
encoding="utf-8",
)
if result_rationale == []:
write_detail.write("Not Tested")
print(f"Error on {folder_name} - details.rst")
write_detail.close()
else:
write_detail.write(result_rationale.group(2))
write_detail.close()
# Write related.rst
write_related = open(
f".\\result\\{folder_initial}\\{folder_name}\\related.rst",
"w",
encoding="utf-8",
)
if result_related == []:
write_related.write("Not Tested")
print(f"Error on {folder_name} - related.rst")
write_related.close()
else:
edit = result_related[0][4]
new_link = edit.replace("(", "<")
new_link2 = new_link.replace(")", ">")
write_related.write(
f"`{result_related[0][2]} {new_link2}`_\n"
f"`Issue Tracker <https://github.com/PyCQA/pylint/issues?q=is%3Aissue+%22{result_identifier.group(4)}%22+OR+%22{result_identifier.group(3)}%22>`_",
)
write_related.close()
Hello, I have problems with pre-commit
auto-fix formatting style example for bad.py
. How to fix this problem? I have tried a few error messages that have similar problems with this also, especially for error messages about formatting style that auto fix by black
or pre-commit
- C0301 (line-too-long)
- C0302 (too-many-lines)
- C0303 (trailing-whitespace)
- C0304 (missing-final-newline)
- C0305 (trailing-newlines)
- C0321 (multiple-statements)
- C0325 (superfluous-parens)
- C0326 (bad-whitespace)
- C0327 (mixed-line-endings)
- C0328 (unexpected-line-ending-format)
- C0330 (bad-continuation)
- W0301 (unnecessary-semicolon)
- W0311 (bad-indentation)
- W0312 (mixed-indentation)
efected PR #6202 #6206 Many of this example (not yet try all) will autofix. Any advice for this problem , Thank you :) Hope you all have a nice day
We probably need to turn of some formatters for these messages.
@Pierre-Sassoulas Do we want to disable all formatting on these files or make specific exclusion patterns?
I think we can start with specific exclusion and see of it's still reasonable if we document everything.
@DanielNoord I think bug description misses to mention possible usage of pylintrc
file in message directory.
This option was introduced at a later point, I updated the initial comment.
To give a little status update. We now have data for 102 out of 392 messages. That's a little over one quarter! Thanks to all who have already participated in this effort!
I have also updated the table in the opening post.
I took a look at the analytics from Read the doc. Here are the message sorted by most searched to least searched:
invalid-name 103 missing-module-docstring 43 unspecified-encoding 41 line-too-long 40 logging-fstring-interpolation 40 too-few-public-methods 35 missing-function-docstring 30 import-error 28 fixme 27 too-many-arguments 19 consider-using-f-string 17 inconsistent-return-statements 16 redefined-outer-name 16 trailing-whitespace 14 wrong-import-order 13 duplicate-code 13 too-many-locals 13 consider-using-with 13 useless-suppression 13 unused-argument 12 broad-except 12 raise-missing-from 12 missing-class-docstring 10 too-many-statements 10 global-statement 10 no-member 9
(It probably is more an indication of what messages are annoying vs what messages need to be documented imo) I think it also underline the need to document deleted message as c0326
c0330
and py3k
are popular search terms and do not exists in the doc.
Result gotten by adding the following script in MessageDefinitionStore.messages
and launching pylint on a file:
values = """\
c0103 (3 results) 68 searches
c0114 (4 results) 43 searches
c0115 (2 results) 10 searches
c0116 (2 results) 30 searches
c0209 (2 results) 17 searches
c0301 (2 results) 20 searches
c0303 (3 results) 14 searches
c0326 (0 results) 14 searches
c0330 (0 results) 14 searches
c0411 (2 results) 13 searches
e0401 (2 results) 28 searches
e1101 (5 results) 9 searches
fixme (12 results) 9 searches
invalid-name (107 results) 19 searches
invalid-name (109 results) 16 searches
line-too-long (88 results) 9 searches
line-too-long (90 results) 11 searches
logging-format-style (81 results) 9 searches
logging-fstring-interpolation (17 results) 10 searches
logging-fstring-interpolation (19 results) 10 searches
py3k (5 results) 16 searches
r0201 (2 results) 13 searches
r0201 (3 results) 13 searches
r0801 (2 results) 13 searches
r0903 (3 results) 13 searches
r0903 (4 results) 10 searches
r0913 (2 results) 19 searches
r0914 (2 results) 13 searches
r0915 (2 results) 10 searches
r1710 (2 results) 16 searches
r1732 (4 results) 13 searches
snake_case (3 results) 14 searches
too-few-public-methods (71 results) 12 searches
unspecified-encoding (15 results) 12 searches
useless-suppression (26 results) 13 searches
w0511 (3 results) 18 searches
w0603 (3 results) 10 searches
w0613 (3 results) 12 searches
w0621 (2 results) 16 searches
w0703 (2 results) 12 searches
w0707 (2 results) 12 searches
w1203 (2 results) 20 searches
w1514 (2 results) 29 searches\
"""
from pylint.exceptions import UnknownMessageError
results = {}
for value in values.split("\n"):
splitted_line = [v for v in value.split(" ") if v]
try:
msgid, result_number, result_, searches, *_ = splitted_line
msg_definition = self.get_message_definitions(msgid)[0]
if msg_definition.symbol in results:
results[msg_definition.symbol] += int(searches)
else:
results[msg_definition.symbol] = int(searches)
except ValueError as e:
print(f"Can't treat {value} from {splitted_line} {e}")
except UnknownMessageError as e:
print(f"Can't find {msgid} in message store for {splitted_line} {e}.")
for result in sorted(results, key=results.get, reverse=True):
print(result, results[result])
Edit upgraded result 2023-02-14:
invalid-name 460 missing-module-docstring 196 too-few-public-methods 146 missing-function-docstring 127 broad-exception-caught 126 logging-fstring-interpolation 125 unspecified-encoding 121 too-many-locals 112 consider-using-f-string 109 no-member 104 import-error 101 line-too-long 92 redefined-outer-name 88 too-many-branches 86 missing-class-docstring 80 too-many-arguments 80 inconsistent-return-statements 77 duplicate-code 76 no-else-return 74 too-many-instance-attributes 68 consider-using-with 67 redefined-builtin 66 trailing-whitespace 63 too-many-statements 62 protected-access 58 raise-missing-from 58 wrong-import-order 57 unused-argument 55 dangerous-default-value 50 no-name-in-module 48 pointless-string-statement 48 global-statement 48 fixme 45 unused-variable 44 attribute-defined-outside-init 43 missing-final-newline 42 bare-except 41 consider-iterating-dictionary 37 consider-using-dict-items 36 consider-using-enumerate 35 no-value-for-parameter 35 too-many-function-args 35 unused-import 35 ungrouped-imports 34 too-many-return-statements 33 superfluous-parens 32 anomalous-backslash-in-string 31 wildcard-import 30 cell-var-from-loop 30 logging-not-lazy 29 f-string-without-interpolation 29 super-with-arguments 27 arguments-differ 27 missing-timeout 27 too-many-lines 26 import-outside-toplevel 26 too-many-nested-blocks 26 use-dict-literal 26 unsubscriptable-object 25 consider-using-in 25 chained-comparison 25 unnecessary-pass 25 unnecessary-lambda 24 syntax-error 23 logging-format-interpolation 23 too-many-public-methods 22 consider-using-generator 22 singleton-comparison 21 simplifiable-if-expression 21
I'm working on unused-argument
.
Filed #6834
Working on subprocess-run-check
.
Working on f-string-without-interpolation
.
Working on format-needs-mapping
.
Working on missing-format-attribute
.
Working on overlapping-except
.
Working on redefined-variable-type
.
Working on repeated-keyword
.
Working on keyword-arg-before-vararg
.