pre-commit-hooks
pre-commit-hooks copied to clipboard
`file-contents-sorter` Not working same way everytime
I get strange issue that file-contents-sorter hook does not work same way everytime. I made little test case which trickers the issue. If i run this 10 times it will sometimes pass and sometimes not. I did not have time to look what problem might be.
Add this to file_contents_sorter_test.py
(
b'pre\nPre\n',
['--unique', '--ignore-case'],
PASS,
b'pre\nPre\n',
),
And run it with
for run in {1..10}; do pytest tests/file_contents_sorter_test.py; done
This gives me that about 30% pass and 70% fails
yeah this set call breaks unique sorting -- it should probably just forbid the combination of unique and ignore case since the result is undefined https://github.com/pre-commit/pre-commit-hooks/blob/v4.3.0/pre_commit_hooks/file_contents_sorter.py#L36
@asottile please review if you have time. I'm hoping this PR is simple enough and goes with the spirit of forbidding the combinations of options as you mentioned. Also, I'd appreciate if you could label w/ hacktoberfest-accepted
so I can get a tree planted, thanks.
I'm not going to review something which doesn't pass tests
@asottile Apologies for not seeing the failing tests before asking. I think its ready now.
yeah this set call breaks unique sorting -- it should probably just forbid the combination of unique and ignore case since the result is undefined
v4.3.0
/pre_commit_hooks/file_contents_sorter.py#L36
Could we respect the original word case if the uncased words are equal? For example, let Pre
always proceed pre
because it's upper case. This can be easily implemented with tuple
-based sort keys.
def key(s):
return (s.lower(), s)
ret = sorted(set(word_list), key=key)
Another option is to respect the original order (stable sort). This can use an "OrderedSet
" rather than set
.
def unique(iterable):
return list(OrderedDict.fromkeys(iterable)) # can be replaced with `dict` for Python 3.7+
Both approaches I list above are deterministic.
no, it should forbid the combination as I said already
On Mon, Mar 27, 2023, 4:24 AM Xuehai Pan @.***> wrote:
yeah this set call breaks unique sorting -- it should probably just forbid the combination of unique and ignore case since the result is undefined v4.3.0/pre_commit_hooks/file_contents_sorter.py#L36 https://github.com/pre-commit/pre-commit-hooks/blob/v4.3.0/pre_commit_hooks/file_contents_sorter.py?rgh-link-date=2022-07-29T11%3A55%3A28Z#L36
Could we respect the original word case if the uncased words are equal? For example, let Pre always proceed pre because it's upper case. This can be easily implemented with tuple-based sort keys.
def key(s): return (s.lower(), s) ret = sorted(set(word_list), key=key)
Another option is to respect the original order (stable sort). This can use an "OrderedSet" rather than set.
def unique(iterable): return list(OrderedDict.fromkeys(iterable)) # can be replaced with
dict
for Python 3.7+— Reply to this email directly, view it on GitHub https://github.com/pre-commit/pre-commit-hooks/issues/794#issuecomment-1484715837, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAN2BH2U7K4LE7U7DJ5BN5TW6FFD3ANCNFSM55AOQ72Q . You are receiving this because you were mentioned.Message ID: @.***>
To make this linter work properly with all type of line endings, we should change the behavior to this:
Because currently the file-contents-sorter
doesn't work consistently on different platforms.
The file-contents-sorter
currently works correctly only on platforms with \n
line ending :(
@Childcity that's unrelated, documented, and intentional
@Childcity that's unrelated, documented, and intentional
So if I working on Windows and have to run file-contents-sorter
in the Windows environment I can't do that!
I think, that file-contents-sorter
should only sort lines (not changing its content).
For fixing line-endings we have separate linter mixed-line-ending
windows has supported posix newlines for decades
please stop replying here as it is off topic