ruff icon indicating copy to clipboard operation
ruff copied to clipboard

INP001 check should respect `ruff.src` setting or have it configured separately

Open notypecheck opened this issue 3 years ago • 19 comments

A lot of projects nest code under src directory by adding src to PYTHONPATH:

src/
  some_package/
    __init__.py
    ...
  main.py

pyproject.toml
ruff .
src\main.py:1:1: INP001 File `\src\main.py` is part of an implicit namespace package. Add an `__init__.py`.
Found 1 error(s)

Ruff version: 0.0.225 Ruff configuration:

[tool.ruff]
select = ["INP"]
src = ["src"]

A solution to this might be using ruff.src or different settings to exclude files contained directly in that directory
But this might be an issue for packages that would want to use ruff.src for example for isort checks but would still want their package directory to be checked by INP001

notypecheck avatar Jan 18 '23 14:01 notypecheck

Adding __init__.py actually breaks some tools like mypy:

# src/some_package/test.py
def a() -> int:
    return 42
# main.py
from some_package.test import a

reveal_type(a())

Without __init__.py:

$ mypy .
src\main.py:3: note: Revealed type is "builtins.int"
Success: no issues found in 3 source files

With __init__.py:

$ mypy .
src\main.py:1: error: Skipping analyzing "some_package.test": module is installed, but missing library stubs or py.typed marker  [import]
src\main.py:1: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
src\main.py:3: note: Revealed type is "Any"
Found 1 error in 1 file (checked 4 source files)

notypecheck avatar Jan 18 '23 14:01 notypecheck

Hmm. I think omitting directories specified by src makes sense? Trying to think of where that would cause problems.

charliermarsh avatar Jan 18 '23 16:01 charliermarsh

Libraries might use the following structure:

root/ # Which is the project root
  library/
    __init__.py
    ...
  pyproject.toml

In this case developers would want to treat import from library which would be in ruff.src as first party, but it also should not be omitted from INP check

notypecheck avatar Jan 18 '23 16:01 notypecheck

Note that flake8-no-pep420 has the same behavior:

$ tree check_inp001
check_inp001
└── src
    ├── main.py
    └── some_package
        └── __init__.py

$ flake8 --select INP check_inp001
check_inp001/src/main.py:1:1: INP001 File is part of an implicit namespace package. Add an __init__.py?

I'm also not sure how common it is to have modules directly nested under src/, but the single file in the case above could be ignored:

[tool.ruff.per-file-ignores]
"src/main.py" = ["INP001"]

edgarrmondragon avatar Jan 18 '23 17:01 edgarrmondragon

I'd say having src directory is quite common, and specifying multiple files in per-file-ignores is a bit tedious Maybe there is a way to ignore files that are placed directly in a directory? 🤔

notypecheck avatar Jan 18 '23 17:01 notypecheck

I'd say having src directory is quite common

Oh, I'm aware of that, I use it all the time 😄. It's how common it is to have Python files directly nested, e.g. src/main.py that I wonder.

edgarrmondragon avatar Jan 18 '23 17:01 edgarrmondragon

I'd say having src directory is quite common

Oh, I'm aware of that, I use it all the time 😄. It's how common it is to have Python files directly nested, e.g. src/main.py that I wonder.

I'd say it's common, if there can be a package, there can be a module, they're interchangeable in some sense

notypecheck avatar Jan 18 '23 17:01 notypecheck

But in that case, should there not be an __init__.py in that directory?

charliermarsh avatar Jan 18 '23 17:01 charliermarsh

(Genuinely asking. I'm not certain.)

charliermarsh avatar Jan 18 '23 17:01 charliermarsh

In src? src itself is not a package, so, no 🤔
@charliermarsh Maybe there's a way to select direct children in per-file-ignores? This would kind of solve this

notypecheck avatar Jan 18 '23 18:01 notypecheck

Hmm, yeah, right now, this matches any Python file under src, even in nested directories:

[tool.ruff.per-file-ignores]
"src/*.py" = ["INP"]

It's possible to change this internally, such that * doesn't match /, in which case, these would mean two different things:

[tool.ruff.per-file-ignores]
"src/*.py" = ["INP"]
"src/**/*.py" = ["INP"]

(The globbing library supports this as the literal_separator option.)

Let me look at how other tools do this. It would be a breaking change so I'd like to be confident in it.

charliermarsh avatar Jan 18 '23 18:01 charliermarsh

(Flake8 has the same behavior ("src/*.py" matches recursively).)

charliermarsh avatar Jan 18 '23 18:01 charliermarsh

@not-my-profile - Do you have an opinion on enabling literal_separator here? It seems strictly more expressive for users, at the cost of being a breaking change in some cases. (I know you mentioned that setting in https://github.com/charliermarsh/ruff/issues/1545.)

charliermarsh avatar Jan 18 '23 18:01 charliermarsh

If we enable literal_separator, then *.py will no longer match "any Python file, anywhere" -- you'll have to do **/*.py.

charliermarsh avatar Jan 18 '23 19:01 charliermarsh

Regarding the problem described by this issue: I think ideally ruff would be smart enough to understand the packaging config in pyproject.toml to correctly determine the purpose of the src directory, e.g by understanding:

[tool.setuptools.packages.find]
where = ["src"]

(see setuptools-specific configuration)

However since there are a bunch of such packaging tools (setuptools, poetry, flit, etc.) accurately supporting all of them would be quite the challenge (I think it's unfortunate that such configuration isn't better standardized via a PEP).

So yes per-file-ignores does seem like a good way of solving this for now and increasing the expressivity of our glob patterns also seems like a good idea. I want to add that the glob.glob function from the Python standard library also does not have * match /, so following that behavior might actually be what Python developers are more accustomed to.

I do think that such breaking config changes are unfortunate, especially when they can break stuff without you noticing ... perhaps we could introduce something like Rust's package.edition .... e.g. tool.ruff.config-version so changes like these would result in a config version bump with ruff still supporting the older version and supporting an automated config upgrade (which would translate all existing globs to the new syntax). Implementing that would however of course take some effort and such "sneaky" config changes (that just change how values are interpreted) are probably quite rare anyway ... although as ruff continues to gain adoption the collective pain caused by such breaking changes would also grow, so perhaps we should consider it?

not-my-profile avatar Jan 19 '23 03:01 not-my-profile

I'd say since glob uses ** for recursive matching it's a good idea to either make it standard or add an option to choose between the two formats.

notypecheck avatar Jan 19 '23 05:01 notypecheck

Yeah using ** does seem better to me. But it does feel like a disruptive change if it's not coupled with some kind of opt-in. Let me think about how to handle.

It still seems to me that omitting the src directories from this check would solve this problem, since anything is src is assumed to be part of the Python path, effectively.

charliermarsh avatar Jan 19 '23 12:01 charliermarsh

I'd rather not have a glob_syntax setting for the sake of simplicity ... I think it makes more sense to have a versioned config format and deprecate the old syntax in favor of the new one ... which could also be done in a backwards-compatible manner (and let us easily drop support for the old syntax at some point in the future).

not-my-profile avatar Jan 19 '23 15:01 not-my-profile

Yes agreed.

charliermarsh avatar Jan 19 '23 15:01 charliermarsh