wemake-python-styleguide
wemake-python-styleguide copied to clipboard
WPS432 should preserve the literal representation
Bug report
This code
import pathlib
path = pathlib.Path('.')
path.chmod(0o700)
causes
WPS432: Found magic number: 448
It's unobvious that it's the same number and it's impossible to use 448 in grep because it won't find anything.
What's wrong
Non-decimal numbers are reported as decimal. Also, maybe for things like chmod() it's okay to allow such "magic numbers" because this form directly relates to a normal syscall or a CLI command one would make.
How is that should be
The formatting should be preserved. The error should've been WPS432: Found magic number: 0o700 instead.
System information
N/A
Yes, I agree. Thanks a lot for the bug report!
Hey! Can I take this one? Thanks.
@hhsdev thanks!
@sobolevn Hi! I am having some trouble implementing this. Could you give me some advice?
tl:dr; I think the implementation needs both ast and tokenizer.
Currently, MagicNumberViolation is checked by WrongNumberVisitor, derived from BaseNodeVisitor. WrongNumberVisitor._check_is_magic uses ast related features to filter out valid magic number usages (like in constructors and whatnot). I can't think of easy ways to implement them using tokenizer.
However, in order to preserve literal representation, we must use tokenizer since by the time the number reaches ast, its representation is lost. And as I said above, we can't really abandon the ast either.
The current solution I'm thinking of is to subclass both BaseNodeVisitor and BaseTokenVisitor. We build a dictionary of all magic number usages and their representation using BaseTokenVisitor's pass. On the BaseNodeVisitor's pass, we check whether a usage really is MagicNumberViolation. If so, we look back on the dictionary we build for its representation.
Problem is, I don't see any visitor that inherits from multiple visitors (apart from BaseNodeVisitor, but that's a different story). Also, my solution above relies that token visitors runs before ast visitors, which I don't think is the case. Furthermore, I feel like it's an internal detail that I shouldn't rely on.
I think I'm missing something. But I don't know what I'm missing. Can you give some advice on how to go about this? Thanks in advance!
tl:dr; I think the implementation needs both ast and tokenizer.
Then you should probably use libcst, see #1068
I am going to merge this PR soon.
Nice! Will wait for the PR.