typos icon indicating copy to clipboard operation
typos copied to clipboard

Excluding files unexpected behavior.

Open simon-liebehenschel opened this issue 4 years ago • 15 comments

Current settings:

_typos.toml

[files]
extend-exclude = ["*.json", "**/*.json"]

.pre-commit-config.yaml

-   repo: https://github.com/crate-ci/typos
    rev: typos-v0.7.4
    hooks:
      - id: typos

Expected behavior: JSON files must be excluded

Expected behavior: All files are checked.

Steps to reproduce

pre-commit run typos --all-files

simon-liebehenschel avatar Sep 15 '21 18:09 simon-liebehenschel

typos assumes the user knows best when passing in file paths, overriding the exclusions. Unfortunately, this doesn't layer well with tools like pre-commit.

We need to provide a way to control this behavior.

epage avatar Sep 15 '21 20:09 epage

I have the same issue, but only when running typos-src with pre-commit, running typos from the terminal works fine. this is my config (_typos.toml at project root directory):

[files]
extend-exclude = ["src/assets/locales"]

locale files are ignored when running from the terminal, but not from pre-commit run --all-files -v typos-src

  - repo: https://github.com/crate-ci/typos
    rev: v1.3.3
    hooks:
      - id: typos-src
        # args:
        #   - --exclude
        #   - src/assets/locales #uncommenting args works, but its not reading my config file

amitlevy21 avatar Jan 05 '22 06:01 amitlevy21

typos assumes the user knows best when passing in file paths, overriding the exclusions. Unfortunately, this doesn't layer well with tools like pre-commit.

We need to provide a way to control this behavior.

What about the hook attribute exclude ? It can exclude files with specified pattern.

kwanhur avatar Mar 06 '22 09:03 kwanhur

Hi, we have a little bit different situation

typos.toml

$ cat typos.toml 
[files]
extend-exclude = ["Cargo.lock"]

.pre-commit-config.yaml

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.1.0
    hooks:
      - id: check-added-large-files
      - id: check-merge-conflict
      - id: check-symlinks
      - id: detect-private-key
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/crate-ci/typos
    rev: v1.4.1
    hooks:
      - id: typos

Running couple of times pre-commit run typos --all-files gives different results. 2 passes and 1 fail

$ pre-commit run --all-files                  
check for added large files..............................................Passed
check for merge conflicts................................................Passed
check for broken symlinks............................(no files to check)Skipped
detect private key.......................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
typos....................................................................Passed
$ pre-commit run --all-files                  
check for added large files..............................................Passed
check for merge conflicts................................................Passed
check for broken symlinks............................(no files to check)Skipped
detect private key.......................................................Passed     
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
typos....................................................................Passed
$ pre-commit run --all-files
check for added large files..............................................Passed
check for merge conflicts................................................Passed                        
check for broken symlinks............................(no files to check)Skipped           
detect private key.......................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
typos....................................................................Failed

After that Cargo.lock was changed

$ git diff Cargo.lock
diff --git a/Cargo.lock b/Cargo.lock
index 938bfc3b47f7..68cca07e67e5 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -296,7 +296,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "d1873270f8f7942c191139cb8a40fd228da6c3fd2fc376d7e92d47aa14aeb59e"
 dependencies = [
  "crypto-common",
- "inout",
+ "input",
 ]
 
 [[package]]
@@ -769,7 +769,7 @@ dependencies = [
 ]
 
 [[package]]
-name = "inout"
+name = "input"
 version = "0.1.2"

As if exclusions from typos.toml weren't applied every time. Has anyone encountered a similar problem? It looks like a similar case that was here https://github.com/crate-ci/typos/issues/278

tomzy-0 avatar Mar 09 '22 13:03 tomzy-0

typos.toml

$ cat typos.toml 
[files]
extend-exclude = ["Cargo.lock"]

This is making me realize we should probably centralize all locks into a single file type and have it excluded.

epage avatar Mar 09 '22 13:03 epage

@TomaszAIR v1.5.0 was just released where we do not check lock files. This is done in a different way than extend-exclude and should work with pre-commit.

epage avatar Mar 09 '22 15:03 epage

@epage thanks for quick fix. Will test that and let you know how it goes.

tomzy-0 avatar Mar 11 '22 09:03 tomzy-0

Yeap, it works. Can confirm that v1.5.0 do not check lock files. None of the extend-exclude configuration is needed. Thanks

tomzy-0 avatar Mar 11 '22 10:03 tomzy-0

Hello, when we can expect the fix for the issue? Just a question, not pushing for any actions 😉

sshishov avatar May 23 '22 19:05 sshishov

Hello, when we can expect the fix for the issue?

The core of this is we need a natural mode where we trust the user and a scripted mode where we don't trust the user. The blocker is defining how to express this in the UI. Bonus points for including any precedence from other tools

Also want to note that #461 has the workaround of duplicating the exclusions

epage avatar May 23 '22 19:05 epage

Thanks @epage , currently as a workaround we started using pre-commit exclude, but added #TODO: with appropriate comment. And also keep the configuration to exclude in the config file

sshishov avatar May 23 '22 19:05 sshishov

Additional workaround is:

  - repo: https://github.com/crate-ci/typos
    rev: v1.8.1
    hooks:
    - id: typos
      pass_filenames: false

Got it from here as the same issue exists in interrogate package

sshishov avatar May 28 '22 16:05 sshishov

The same issue as in isort fixed by PR 939 and black fixed by PR 1032

Same problem was happening with flake8 and happening now with [interrogate](https://github.com/crate-ci/typos/issues/347)

sshishov avatar May 28 '22 18:05 sshishov

@sshishov thanks for those examples. To summarize:

  • isort added --filter-files
  • black added --force-exclude

I wish there was a flag that was common between tools, even if the sample size is just two, as being consistent can be more important than being ideal on its own.

Since these aren't consistent, my next thought is on which more clearly communicates intent. I feel like isort's only makes sense when reading the docs and thinking about what it is trying to say. black's makes a little more sense. "force" make it clear this is a binary flag and that it is overriding some default behavior. I lean towards that somewhat even though something still feels off to me. Really, we are implicitly forcing the the args to be respected independent of the exclusions. I just can't think of a succinct way of saying that, so maybe just mirroring black would be best.

If people know of other examples, I'm still open open on this. At least for me adding the flag, it might be a week or so as I'm trying to wrap up some major work on the clap crate.

epage avatar May 29 '22 01:05 epage

The core of this is we need a natural mode where we trust the user and a scripted mode where we don't trust the user. The blocker is defining how to express this in the UI. Bonus points for including any precedence from other tools

Here's another example - prettier ignores all files that are in it's ignore list, even if you explicitly tell it to format that particular file.

AFAIK, this behaviour exists because you're expected to run prettier on a folder or multiple files and rarely run it on a single file.

That is a pattern I'd like to see in typos as well, but it's up to you. I think it's easier to have 2 configs - one with a blacklist and one without one, than adding multiple cli flags to ignore/force unignore configuration settings.


My original issue was with running typos as a github action on changed files, which then spellchecks blacklisted files as well.

It would have been possible to filter out the blacklist from the changelist, if the exclude config was in a format bash can parse natively. Of course, I can install toml parsers, filter out the blacklist from the changeset and pass that along to typos...


Btw, thanks for closing the issue I opened and redirecting me here. I didn't find this issue initially.

arminrosu avatar Jul 12 '22 07:07 arminrosu

What about the hook attribute exclude ? It can exclude files with specified pattern.

This is what I'm using. For example:

repos:
- repo: https://github.com/crate-ci/typos
  rev: v1.13.10
  hooks:
    - id: typos
      exclude: |
        (?x)^(
          .*\.csv
          |.*\.xml
        )$

You can also explicitly limit to certain file types by overriding the text type selected in the repo's hook definition:

repos:
- repo: https://github.com/crate-ci/typos
  rev: v1.13.10
  hooks:
    - id: typos
      types_or:
      - javascript
      - python

IMO one of the powers of pre-commit is that it already passes the correct filenames, based on hook selection and exclusion, so tools don't have to reimplement undifferentiated inclusion/exclusion features.

adamchainz avatar Feb 07 '23 13:02 adamchainz

IMO one of the powers of pre-commit is that it already passes the correct filenames, based on hook selection and exclusion, so tools don't have to reimplement undifferentiated inclusion/exclusion features.

For tools written exclusively for pre-commit, that works well. However, this tool is written more generally and needs to have its own logic.

epage avatar Feb 07 '23 14:02 epage

Hi @epage.

Would you accept a PR adding a new --force-exclude similar to the one used by black?

--force-exclude Like --exclude, but files and directories matching this regex will be excluded even when they are passed explicitly as arguments. This is useful when invoking Black programmatically on changed files, such as in a pre-commit hook or editor plugin.

I really wish I could use typos with pre-commit without the pass_filenames: false workaround.


On a second thought, the --force-exclude of black seems overkill since it expects a regex. A boolean flag looks more appropriate.

I think by default, we want typos to ignore excluded files even if they're explicitly passed to the command line. This is because using typos *.py is a very common usage, so it makes sense that exclude takes precedence.

Since typos already has some command line options such as --no-ignore, maybe we could just add --no-exclude to ignore the configured extend-exclude list?

Delgan avatar Sep 24 '23 12:09 Delgan

Yes, when looking at black before, I thought it was a bool and not yet another exclude type which seems like it'd be awkward to deal with.

I think by default, we want typos to ignore excluded files even if they're explicitly passed to the command line. This is because using typos *.py is a very common usage, so it makes sense that exclude takes precedence.

Is typos *.py very common? For myself, I run it on directories or on individual files and neither case runs into this problem.

If we went this route, it'd be a breaking change and we'd need to bump to 2.0. Not the end of the world but something to keep in mind.

I also feel like we'd need a way to help the user discover that we ignored what they said and they need to pass in an extra flag. Not thrilled with doing that.

Most of this problem seems to be centered on pre-commit-like workflows where we shouldn't be trusting the external source of information.

epage avatar Sep 25 '23 15:09 epage

I've done more research and compared other tools, and I admit that what I've suggested isn't that common. It depends on people's preferences, but there doesn't seem to be any consensus. Personally, I'm more likely to use typos *.py than to call typos on a file I've explicitly configured to be excluded.

Just for example, there is ESLint which displays a warning when excluded files are passed to the command line and requires an explicit --no-ignore: Ignore Files - ESLint - Pluggable JavaScript Linter.

However if we take into account the issue regarding backwards compatibility, I agree it's definitely preferable not to change the existing behavior. Apologies for not thinking of that.

As a further example, there is also Ruff which implemented --force-exclude as a boolean flag and directly integrated it to their pre-commit hook: ruff-pre-commit/.pre-commit-hooks.yaml#L4. This matches with solutions discussed in this thread and we could copy it for a touch of consistency.

I can understand your hesitations to add such a flag, but its primary purpose is to enable typos to be used programmatically. Not only for pre-commit, but also to any Bash script or VS Code extension that wants to use typos without the burden of maintaining its own exclude list. From my perspective, it's more of a valuable feature than a mere workaround.

I am always willing to open a PR if you wish.

Delgan avatar Sep 26 '23 18:09 Delgan

My concern with --force-exclude was more centered on black having it be a separate list of excludes. Having it be a bool would make sense and I'd be willing to be consistent with ruff in this.

epage avatar Sep 26 '23 18:09 epage

Closed via #837

epage avatar Oct 09 '23 20:10 epage