setup-cfg-fmt
setup-cfg-fmt copied to clipboard
Setup-cfg-fmt doesn't respect newline convention of source file nor read/write it with the correct encoding
Setup-cfg-fmt does not respect the EOL of the setup.cfg
, and instead converts it to a platform-dependent default. I'm not aware of a case where that would be the desirable behavior, whereas many/most projects standardize on one newline convention (typically \n
) for all source files, which this breaks. Furthermore, if using an EoL fixer with pre-commit, it will catch this but aside from extra noise in the output, if it isn't sequenced after this hook in the config file, it will cause the next run to fail too.
Fixing this is very simple—just check the current EoL character when reading in the file, and pass that explicitly on writing it. So at the top of format_file()
, on the line after read()
ing the file, you'd just add newline = f.newlines
, and then pass newline=newline
in the call to open()
when writing it,
In addition, neither open()
call specifies the correct encoding to use, which means the platform and environment-dependent locale encoding will be assumed. As setup.cfg is required to be UTF-8, this means loading will fail on Windows if any non-ASCII characters are present (common in e.g. authors' names), or on any platform that doesn't have UTF-8 mode enabled, and even if loading were to succeed, the file would be written in the incorrect encoding. Furthermore, not specifying an encoding when opening in text mode will be an optional warning in Python 3.10, with a DeprecationWarning planned to follow in later versions. To fix this, simply explicitly pass encoding="utf-8"
to both calls.
Happy to submit a PR to fix both of these if you'd like. Thanks!
adding encoding='utf-8'
seems fine, as for picking a line ending I'd lean towards always using \n
-- I don't think your suggestion of newline = f.newlines
works, that value is always None
and determining mixed newlines or the proper ending seems out of scope and/or difficult (and/or better handled by a tool that actually is intended to solve that problem)
I don't think your suggestion of
newline = f.newlines
works, that value is alwaysNone
That attribute is lazy and you have to do a .read()
first, or it will be None
as you observed, which is why my suggestion added it it after the existing read()
.
determining mixed newlines or the proper ending seems out of scope and/or difficult (and/or better handled by a tool that actually is intended to solve that problem)
Good point on the mixed newlines; since that's a relatively uncommon edge case I hadn't considered it. However, in that case (where per the docs, f.newlines
will return a tuple of line endings, so if f.newlines
is not a str
instance, it can simply fall back to \n
as you suggest.
Other than cases of mixed EOLs where the file must be conformed to something, I'd think arbitrarily changing line ending, generally a per-project default handled by other hooks and orthogonal to the standards and conventions used in setup.cfg
, would be out of scope here.
as for picking a line ending I'd lean towards always using \n
While hardcoding it would work for my particular application, it won't solve this problem for any projects that use another line ending, and potentially create a new one for those that use the platform default. So I'd suggest respecting the existing line ending for the common/simple case, and only conforming if its non-trivial.