pre-commit-hooks
insert-license detection does not ignore spaces after comment symbols
Take Python as an example:
# Copyright...
and
#  Copyright...
The hook doesn't consider these two lines to be the same, so it doesn't detect the second form, just because it has two spaces after the #.
The result is that it will automatically add the license again.
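A minimal sketch of the behaviour described above, assuming the check compares each license line, prefixed with the comment marker and exactly one space, against the file's lines. The helper name has_license is hypothetical and is not part of the hook's API:

```python
def has_license(file_lines, license_lines, comment_prefix="#"):
    """Hypothetical exact-match check: prefix + one space + license line."""
    expected = [f"{comment_prefix} {line}" for line in license_lines]
    n = len(expected)
    # Look for the expected block as a contiguous run of lines in the file.
    return any(
        file_lines[i:i + n] == expected
        for i in range(len(file_lines) - n + 1)
    )


license_lines = ["Copyright..."]
print(has_license(["# Copyright...", "import os"], license_lines))   # True
print(has_license(["#  Copyright...", "import os"], license_lines))  # False
```

With an exact comparison like this, the second file looks unlicensed, so a second header would be inserted above the existing one.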
I agree that this is annoying.
Have you tried to fuzzy-match your license? cf. https://github.com/Lucas-C/pre-commit-hooks#fuzzy-license-matching
Of course, but that would force me to delete the TODO from each file after running. I'd prefer the hook to skip automatically when a match is reached, rather than forcing the TODO to be inserted, because I don't think a spacing issue is worth fixing in each file individually.
So, I ended up choosing --skip-license-insertion-comment to prevent the license from being inserted automatically, but this causes --use-current-year to fail, which is obviously less reasonable than using fuzzy matching to skip.
Does that mean that your problem is solved?
Otherwise, it's not clear to me what solution you suggest?
I think it would be more convenient to decide whether to skip the insertion based on the fuzzy-match ratio. But this feature has not been implemented yet, so I can only use --skip-license-insertion-comment as a substitute for it.
I typically use a Python formatting tool like black or similar, meaning I rarely have to worry about # Copyright... versus #  Copyright..., which is nice. That said, it seems practical to ignore any space(s) (or other whitespace like tabs?) after the # when matching the license text.
I was expecting the comparison to be done on the comment block from the file with the comment syntax removed, which might be harder than it seems given the assorted commenting syntax configurations. However, looking at the code, the comparison is in fact done on the actual file contents versus the expected license block with the configured comment marker and one space.
See https://github.com/Lucas-C/pre-commit-hooks/blob/v1.5.5/pre_commit_hooks/insert_license.py#L167 which inserts one space when preparing the expected license block, not just for inserting into the file if missing, but also used for finding the license: https://github.com/Lucas-C/pre-commit-hooks/blob/v1.5.5/pre_commit_hooks/insert_license.py#L549
That is, the simplest fix I can see is to build a regular expression of "{opening comment}{at least one space}{line of license}" and use that in the search.
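A rough sketch of that idea, assuming a line-comment style with a single prefix such as #. The function name build_license_pattern is hypothetical and this is not how insert_license.py is structured; block comment styles with opening and closing markers would need extra handling:

```python
import re


def build_license_pattern(license_lines, comment_prefix="#"):
    """Hypothetical helper: match the license block while tolerating
    any run of spaces or tabs after the comment prefix."""
    prefix = re.escape(comment_prefix)
    parts = []
    for line in license_lines:
        text = line.strip()
        if text:
            # {opening comment}{at least one space or tab}{line of license}
            parts.append(prefix + r"[ \t]+" + re.escape(text) + r"[ \t]*")
        else:
            # A blank license line is just the bare comment prefix.
            parts.append(prefix + r"[ \t]*")
    # MULTILINE lets ^ and $ anchor at line boundaries inside the file.
    return re.compile("^" + r"\n".join(parts) + "$", re.MULTILINE)


pattern = build_license_pattern(["Copyright...", "", "Licensed under MIT"])
one_space = "# Copyright...\n#\n# Licensed under MIT\nimport os\n"
two_spaces = "#  Copyright...\n#\n#\tLicensed under MIT\nimport os\n"
no_license = "import os\n"

print(bool(pattern.search(one_space)))   # True
print(bool(pattern.search(two_spaces)))  # True
print(bool(pattern.search(no_license)))  # False
```

Requiring at least one whitespace character keeps #Copyright... (no space) from matching; relaxing [ \t]+ to [ \t]* would accept that form too, if desired.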