xarray
xarray copied to clipboard
Use ruff for formatting
- [ ] Closes #8760
~~- [ ] Tests added~~
~~- [ ] User visible changes (including notable bug fixes) are documented in
whats-new.rst
~~ ~~- [ ] New functions/methods are listed inapi.rst
~~
Note: many inline ...
obtain their own line. Running black .
would have produced the same result
I can't reproduce black .
moving the ...
in type stubs / override to a separate line, so this is an example where ruff
diverges from the black
style (to be fair, I would have done the same if the declaration excluding the override decorator is multiline already). Could it be that black
doesn't have an opinion in this case?
The trailing empty line for blocks (empty ...
) were something we intentionally chose to introduce in #4510 (this was a styling option that the ruff
people intentionally chose to do differently).
Another thing which I'm not too happy about is their assumption of ::
in rst meaning python, then falling back to doing nothing if the code does not parse correctly (which they do for all rst code blocks). This means that invalid code is not caught by the formatter but only by running the doctests, which is usually done at a much lower frequency than running the formatter.
Additionally, xarray
is a small-ish repository, and running black
on the entire code base (which is a rare thing to do outside of CI) completes in less than a second (using the compiled black
on PyPI). In other words, I'm not sure if speed should be the deciding factor.
So as a summary, I'm not a fan of (most of) these changes, and I would vote for waiting until the dust settled a bit more (I feel a bit conflicted as the author and maintainer of one of the replaced hooks, though).
Additionally,
xarray
is a small-ish repository, and runningblack
on the entire code base (which is a rare thing to do outside of CI) completes in less than a second (using the compiledblack
on PyPI). In other words, I'm not sure if speed should be the deciding factor.
I find that the big files such as dataset.py
& test_dataset.py
take a long time to save, with autoformatting in vscode. Xarray is the only python repo I have this problem with. It makes working with a quick iteration loop of tests running when saving a file implausible.
Running echo "" >> xarray/tests/test_dataset.py && time black xarray/tests/test_dataset.py
takes 5000ms!
Running echo "" >> xarray/tests/test_dataset.py && time ruff format xarray/tests/test_dataset.py
takes 50ms
So I would prioritize performance quite highly here
I would vote for waiting until the dust settled a bit more
Not strongly opposed to waiting for a bit if others want to, I do expect some churn in the ruff formatter
Note: many inline
...
obtain their own line. Runningblack .
would have produced the same result
I think this is an issue with black versons. Worth aligning with the pre-commit config
Hello,
Thanks for your answers!
I made this Draft PR as a small example of migrating from black to ruff-format, to have some base material to talk about the parent issue #8760.
I can't reproduce
black .
moving the...
in type stubs / override to a separate lin
and
I think this is an issue with black versons. Worth aligning with the pre-commit config
Regarding black .
: indeed I did not pay attention at the version I used. Here is the version on my project env:
black --version
black, 21.12b0 (compiled: no)
I also see I don't have a compiled version of black installed. I am not sure how to get this version though.
Also I agree that I am not a fan of this forced de-inlining, I had a look inside of the ruff-format documentation but could not find any configuration option allowing to opt-out of this de-inlining. It also brings unnecessary formatting changes to the codebase. Maybe an issue can be raised on the ruff repo itself about this topic, it being a necessary condition before even considering moving to using ruff-format. Edit: https://github.com/astral-sh/ruff/issues/10026
The trailing empty line for blocks (empty
...
) were something we intentionally chose to introduce in #4510 (this was a styling option that theruff
people intentionally chose to do differently).
Regarding the trailing ...
in doctests: I know they can cause issues when using pytest-accept
. See https://github.com/max-sixty/pytest-accept/issues/148 for the discussion on that matter with @max-sixty ; it can even be traced back to a bug in the standard doctest
lib itself. So I would argue in favor of removing them, something that ruff does. Also I am not sure where in #4510 these trailing ...
are mentioned, maybe it is the wrong link?
Another thing which I'm not too happy about is their assumption of
::
in rst meaning python
Is this something that blackdoc
solves? Would it be possible to extract this logic with another formatter, or is it tightly coupled to black
? Or in other words: do you think blackdoc
could become a ruffdoc
at some point? As the author of blackdoc
you would have valuable insight to help improve other formatters for sure.
If I understand well, it formats code contained in documentation. This issue mentions formatting code in documentation + blackdoc: https://github.com/astral-sh/ruff/issues/7146
Additionally,
xarray
is a small-ish repository, and runningblack
I agree with @max-sixty here, I also use VSCode with auto-formatting enabled and the project is laggy from time to time. Two levers of action I can think of: reducing formatting time with a more performant formatter, or reducing the size of the file sizes (this would be a refactoring effort, and I don't know what the xarray team considers to be a "too big" file that would deserve to be split). Quality of life during development is valuable
Maybe an issue can be raised on the ruff repo itself about this topic
Based on the discussion at the end of astral-sh/ruff#5822 I am led to believe that we just need to wait a bit more?
maybe it is the wrong link?
that was https://github.com/pydata/xarray/pull/4510#discussion_r512138255, sorry for not directly linking that
Would it be possible to extract this logic with another formatter, or is it tightly coupled to
black
?
ruff
is entirely written in rust (which is what makes it fast), while blackdoc
is written in python (and built on top of black
). The two are thus pretty much incompatible as far as code goes, but I remember the ruff
people looking at both blackdoc
and blacken-docs
for inspiration when creating the docstring formatter.
In any case, note that formatters should not change the behavior of other tools, if it does then there's something wrong with either the formatter or the other tool. The trouble here, though, is that doctest
is a pretty ambiguous format: what would
>>> print("...")
...
be interpreted as? ruff
and blackdoc
will probably both remove the trailing ...
, and would be wrong to do so.
However, this:
>>> if condition():
... do_something()
...
is not as ambiguous: since the code example starts with a block statement, the ...
can be expected. Obviously that would require an analysis of the first line(s), but this is much better than having to analyze the semantics (and it usually is not a good idea to artificially restrict a style because of the tooling).
I'm not sure how resolve this, besides switching to a different format where you're much more unlikely to encounter the ambiguity (like the IPython format).
I also use VSCode with auto-formatting enabled
I can empathize with that. I personally do the auto-formatting on commit (using pre-commit
), not on save, so I'm not as affected by that (waiting a bit longer on commit is not that much of a deal).
we just need to wait a bit more
Yes indeed, if I understand well black has this concept of "Preview" style: https://black.readthedocs.io/en/stable/the_black_code_style/future_style.html
This is possible to test this "preview" style (see https://github.com/astral-sh/ruff/issues/10026#issuecomment-1951367648 ; I updated the PR accordingly)
When this preview style is used, Ellipses are back onto their single line, but other style changes also come in.
The print("...")
example is perfect!
I'm not sure how resolve this, besides switching to a different format where you're much more unlikely to encounter the ambiguity (like the IPython format).
What is the IPython format, an alternative syntax for doctests? As usual, delimitation is the issue: how to delimitate input from expected output to be unambiguous without becoming too verbose...
but other style changes also come in.
You'd also get these other changes with black
, but I think they pushed those back a year because it wasn't stable enough in time for the new stable style. So I think ruff
just needs to catch up to that change.
What is the IPython format, an alternative syntax for doctests?
just like doctest
it is the output of a REPL, in this case ipython
instead of python
, so we'd (basically) be copying the output of a interactive shell. However, the delimiters chosen for continuation lines ( ...:
, with the number of spaces chosen such that the prompts align) are much less likely to be encountered outside of REPL tooling.
For example:
In [9]: """
...: """
Out[9]:
In [10]: (
...: 1 + 2
...: )
Out[10]: 3
So ipython
is to python
what an idoctest
would be to doctest
! I can see that something already exists: https://github.com/ipython/ipython/blob/main/IPython/testing/plugin/ipdoctest.py
What I can see potentially annoying with IPython format is the numbering of the input-output pairs, if you want to add an example at the top of the docstring, all the following existing doctests would require to change. Also, when going from 9 to 10, digit count goes grom 1 to 2 and adds an offset.
Here I just empty the execution count, replacing it by a space.
Then the docstrings would look like for example:
In [ ]: """
...: """
Out[ ]: '\n'
In [ ]: (
...: 1 + 2
...: )
Out[ ]: 3
In [ ]: print("...")
...
In [ ]: print("Out[ ]:")
Out[ ]:
In [ ]: "Out[ ]:"
Out[ ]: 'Out[ ]:'
In [ ]: None
In [ ]:
I notice that the diff is now quite large? I guess that's the preview
flag?
And that mypy is failing?
If we can reduce the size of the diff as much as possible and get the docs passing, I'd be +1 here. But obv one voice among many. I'm also fine to wait until things settle more if others prefer.
Yes, the preview flag brings too things:
- The single line stubs
- Javascript-style braces
The preview flag cannot be reliable upon, so for now, the only option is to wait for the official integration of single line stubs in the formatting rules
Even with preview = false
the diff is still large (36 files). However most of it is constituted of ellipses. So when it's fixed, the diff should be significantly smaller
There is still also the issue of ellipsis new lines in the dosctrings, but it is has a less significant share in the diff.
I'll try to keep this draft PR updated as a "showcase" of what applying ruff alongside a pre-commit run --all
would look like
I'll try to keep this draft PR updated as a "showcase" of what applying ruff alongside a
pre-commit run --all
would look like
OK perfect!