black
black copied to clipboard
Inconsistent adding of space after # in inline comments depending on single / double quotes used
Describe the bug
Behaviour of black
in adding a single space after the #
in in-line comments is inconsistent when single or double quotes is used within the comments. Please see the MWE.
To Reproduce
For example, take this code (before black was applied):
# NOTOK:
# single quote -> double quotes, as expected. But black wouldn't introduce
# a space after the `#` character on the commented-out line.
string_single = [
'a',
'b',
#'c',
'd',
]
# OK:
# Space added after `#` as expected.
string_double = [
"a",
"b",
#"c",
"d",
]
# OK:
# Space added after `#` as expected.
ints = [
1,
2,
#3,
4,
]
I ran this on the Black Playground. (22.3.0)
The output is like this:
Expected behavior
Expected output for the first NOTOK case is
string_double = [
"a",
"b",
# 'c',
"d",
]
I don't expect the 3rd line to become # "c"
(double quotes instead of single quotes); I appreciate that black does not wish to modify commented code too much.
But I hope the behaviour of adding a single space after #
can be more consistent.
We have a special case to not add spaces in #'
: https://github.com/psf/black/blob/8ed3e3d07ea3e6d62e3e533e69f96a0ff148cd5d/src/black/comments.py#L26.
I don't remember why; maybe such comments are special to some IDE.
Ah, unfortunate. Yea traced it down to this commit: c6c8ef76, referenced issue #532.
Thanks for finding that! It would be useful to add links to the code to explain why we have all these exceptions. We could also verify whether pweave still needs this.
Agreed on both accounts! @JelleZijlstra
Let me check on the second point. If they're still using #'
then I suppose the decision here would still be to leave this "inconsistency" alone, yea? After all, if we make the change back to including the space # '
then it would break pweave code. But the benefit gained is just "consistency" 😅
Actually pweave seems dead: https://github.com/mpastell/Pweave/issues/161. So maybe we don't need to keep support for it around?
Actually pweave seems dead: mpastell/Pweave#161. So maybe we don't need to keep support for it around?
Yea, apart from the inactivity of Pweave:
- I'll also note that Pweave supports
#+
as a valid marker, butblack
currently doesn't take that into account inCOMMENT_EXCEPTIONS
. So it doesn't seem like supporting Pweave was an important consideration (?) - The alternative sw mentioned in the issue you linked to (zen-knit) as far as I can tell, doesn't use
#'
.
I'll just reopen the issue for now, hope to receive further opinions about this / if anyone has other considerations that we're missing.
Hi,
The presence of COMMENT_EXCEPTIONS in black's code is somewhat weird. I find it a very useful that black left some comments intact. Such comments can be processed by external tools. For them, e.g. the ones dealing with literate programming, it is important that comments were passed to them verbatim. I am working on such a tool and it surprised me that black mangled my comments.
It would satisfy me should black exposed COMMENT_EXCEPTIONS as an option in TOML's configuration. This would then let me start a verbatim comment with #: only but also to have other starters, for example: #>, #<, #!, #{, #}.
By introducing COMMENT_EXCEPTIONS black already departures from PEP-8. That's why I suggest going a step further and accept the fact that comments accompany the actual code and it's not always good to process them as if they were the code.
What do you think?
Ryszard
I wonder if it would be fine to just avoid adding the space if the first character is not an alphanumeric character or a space or a period (this could get tricky with unicode characters). I much prefer if we don't introduce another formatting option just for this and playing duck duck goose with the various integrations and pre-processors out there seems untenable.
The main downside would be commented code could be broken by inconsistent spacing being applied. Although commented out code that uses single quotes would already be broken. I guess this also depends on how the code is uncommented, a simple "remove the first two characters of each line" would break, but you could probably do it in a smarter way.
Closing in favor of #3668.