Implement autoformat capabilities

Open charliermarsh opened this issue 2 years ago • 9 comments

This issue is the public kickoff for Ruff's autoformatting effort.

A few comments on how I'm thinking about the desired end-state:

  • The goal is to enable users to replace Black with Ruff. So, ideally, new projects could come in and replace pyflakes, pycodestyle, isort, and black with a single tool (Ruff).
  • As with the rest of Ruff, autoformatting will be entirely optional. You can continue to use Ruff alongside Black. You can also use Ruff just as an autoformatter.
  • I'd like to adhere to Black's formatting choices as much as possible, but exact 1:1 parity with Black is not a goal. I'm comfortable deviating when we can make a compelling case, but I'd like that case to be based on arguments that are unrelated to code style itself (e.g., simpler implementation).
  • I'd like to enable slightly more configuration than is possible with Black (see: #813), using Prettier as a model. Prettier allows you to specify the line-length, tab width, quote style, trailing comma behavior, and one or two other stylistic features. I think it's reasonable to support at least line length, indentation, and quote style, so I plan to start there. (It's not a goal to be as flexible as, say, Rustfmt.)

With big projects like this, I work best by taking some time to hack on possible implementations, so I'd like to do some free exploration before opening up the issue to other contributors. But I'd love to hear feedback on the above (and any tips you may have) as I start working on this.

charliermarsh avatar Jan 16 '23 03:01 charliermarsh

While I'm not sure how to feel about seeing Black being replaced as one of its maintainers (I'm kinda attached TBH!) I'd be happy to answer any questions you might have about Black. Feel free to ask about its code style and its implementation.

I barely know anything about Ruff, but from what I'm hearing, it's probably going to be the future of code quality tooling. There's no way we (Black) can ever compete on speed or being an all-in-one solution. I'm not going to contribute code or whatever, but I do feel like I have a responsibility to our users to build a better formatter even if that means helping arguably a competitor.

Obviously I do have my own Black-influenced opinions on "good code style" so don't expect me to suggest significant deviations from Black's style. I couldn't do that to my project, that's too much of a betrayal :)

ichard26 avatar Jan 16 '23 03:01 ichard26

Thank you for the thoughtful message, and for the kind offer -- I really appreciate it.

To be honest, I was a little hesitant to post this Issue, since I too am a fan and long-time user of Black, and I don't want to send the wrong message by offering an alternative. But, it's just a very natural extension of what Ruff is already doing, and it's something that comes up constantly when talking to users.

(Also: while I do want Ruff to be a viable all-in-one solution, I also want it to be incrementally useable and incrementally adoptable -- so, e.g., even if we were to complete this, my intention is that it'd always be possible to use Black alongside Ruff. Ruff has the same relationship with isort right now.)

I'll keep your offer in mind as I get the ball rolling here :)

charliermarsh avatar Jan 16 '23 03:01 charliermarsh

Hey @charliermarsh!

Can I chime in and ask for maybe something different? Black is really good at what it's doing and it's also fast (enough). What I see as a really good way for ruff to go for auto-formatting would be the way of yapf .... but better. So faster and more customizable. Black exists and we don't really need another black.

What Python lacks is a customizable auto-formatter that each project can configure to come as close to its preferred style as possible. The only thing close to that is yapf.

yapf is an auto-formatter tool for python based on the ideas behind clang format. So the idea is to make it as customizable as possible for projects to apply their rules via autoformat.

The current problems with yapf are:

  1. It's not as customizable as it can be.
  2. It's super slow for big projects. Takes 4-5 minutes for rotki.
  3. I think it's not in a very active maintenance mode

I really believe ruff can shine here and would love to see it become this kind of fast and customizable auto-format tool.

LefterisJP avatar Jan 16 '23 12:01 LefterisJP

Great to see your thoughts here, and I'll watch closely with a hope of contributing if I have the bandwidth, and at least testing things out when you have stuff to try.

Some thoughts:

  • Quote style is probably the number 1 reason why blue even exists. It's not good enough to just not change quote style; blue deliberately chooses single quotes over double quotes wherever possible, for reasons we think are good, but in keeping with the spirit of this ticket, I won't argue that right now.
  • We think blue does a better job of maintaining spacing between code and any right hanging comments, although not a perfect job.
  • Blue is really difficult to maintain because it monkeypatches black (to gain from all the really great work that black and its maintainers have done!). Monkeypatching of course is extremely fragile. Without proper APIs in black (and I don't fault black for not supporting them), blue will always have a difficult time keeping up.
  • Blue has (had?) better configuration options than black, but even there it's problematic. Having to keep up with the internal implementation details of two external projects now is pretty daunting.
  • I personally don't care about the wide features of YAPF. For me and my projects (both open source and for $work), black gets us almost all the way there. I remember sending Łukasz a list of like 10-11 bullet points about black's choices back in the early days, and he refuted every one of them. Which of course is perfectly okay! 😄 - in fact, I've accepted and/or come around on all of them but the few that I still feel blue makes better choices on.
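To make the "single quotes wherever possible" rule from the first bullet concrete, here is a minimal sketch using Python's built-in `repr()`, which happens to follow a similar preference. This is only an analogy chosen for illustration; it is not blue's implementation, and the sample strings are made up.

```python
# repr() prefers single quotes and switches to double quotes only when the
# text contains a single quote and no double quote. This is an analogy for
# the "single quotes wherever possible" rule, not blue's actual code.
samples = ["hello", "it's", 'say "hi"']
rendered = [repr(s) for s in samples]

for original, literal in zip(samples, rendered):
    print(f"{original!s:10} -> {literal}")
```

The point is that a quote preference still needs a fallback for strings that contain the preferred quote character, so "not changing quote style" and "choosing a quote style" are different behaviours.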

All that to say that IMO, it's worth having some choice on formatting styles, blue tries to make that possible, but maintaining it is difficult, the black developers are doing a fantastic job, and I still would like to see ruff provide a good, fast alternative!

warsaw avatar Jan 16 '23 15:01 warsaw

@warsaw

> Quote style is probably the number 1 reason why blue even exists. It's not good enough to just not change quote style; blue deliberately chooses single quotes over double quotes wherever possible, for reasons we think are good, but in keeping with the spirit of this ticket, I won't argue that right now.

Just wanna point out that you can accomplish this right now with black -S + ruff --fix and the appropriate [tool.ruff.flake8-quotes] config - it's what I use :)

(Will collapse this since it's OT.)

layday avatar Jan 16 '23 17:01 layday

There's one thing that black won't do and I think ruff would be able to address: E501. When a line is too long, black will often ignore that. For example:

hello = "This is a really long string that's really over 80 characters wide, and black won't do anything about it"

It would be kinda weird if ruff doesn't autoformat this and complains that it's over 88 characters long. The solution to the above should be:

hello = (
    "This is a really long string that's really over 80 characters wide, and "
    "black won't do anything about it"
)
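The two spellings above are the same value at runtime, because Python joins adjacent string literals at compile time. A small sketch to check that, with variable names chosen here for illustration:

```python
# Adjacent string literals inside parentheses are concatenated by the
# compiler, so splitting a long literal this way does not change its value.
single_line = "This is a really long string that's really over 80 characters wide, and black won't do anything about it"

wrapped = (
    "This is a really long string that's really over 80 characters wide, and "
    "black won't do anything about it"
)

same_value = single_line == wrapped
print("same value:", same_value)
print("length of the unsplit literal:", len(single_line))
```

This is why a formatter can apply the split safely for plain literals; f-strings, byte strings and mixed prefixes need more care.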

WhyNotHugo avatar Jan 17 '23 13:01 WhyNotHugo

@WhyNotHugo - Black can fix that if you use black --preview :) Hopefully Ruff can support this too.

charliermarsh avatar Jan 18 '23 13:01 charliermarsh

This project is very exciting for Zulip! To explain some of the practical benefit: one of the main practical limitations on the size of a Python file is that Black takes about 200ms per 1K lines of code, which becomes a very noticeable lag when running it as an on-save hook in an editor. If that part were at Ruff speeds, I'd expect there to be essentially no autoformatter-related lag when saving files in an editor, which would be a visible improvement to our Python development experience.
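As a rough back-of-the-envelope sketch of that figure, the snippet below scales the roughly 200ms per 1K lines quoted above to a few file sizes. The linear scaling and the example file sizes are assumptions made here for illustration, not measurements.

```python
# Figure quoted in the comment above; treat it as approximate.
MS_PER_1K_LINES = 200


def estimated_format_ms(line_count, ms_per_1k=MS_PER_1K_LINES):
    """Linear estimate of on-save formatting time in milliseconds.

    Assumes cost grows linearly with line count, which is a simplification.
    """
    return line_count * ms_per_1k / 1000


for lines in (500, 2_000, 5_000):
    print(f"{lines:>5} lines -> ~{estimated_format_ms(lines):.0f} ms")
```

Under that assumption, a 5,000-line file would spend about a second in the formatter on every save, which is the lag being described.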

So being able to replace Black with Ruff, as we have with most of our other Python linters, would be amazing. I support the design decision to support slightly more configurability than Black has -- while I don't recall the details, I remember our migration to Black being delayed for a long time because they didn't have the options to avoid some things that we considered style regressions.

timabbott avatar Jan 20 '23 21:01 timabbott

@timabbott - That's great to hear and I appreciate you chiming in :)

charliermarsh avatar Jan 20 '23 21:01 charliermarsh

Brief update: I'm continuing to work on this :)

It's going well! Several tests from the Black suite are passing, more are close-to-passing (I'd like to quantify this soon, but I haven't done string normalization yet, and that causes most tests to fail even if the rest of the formatting is correct).

Still a lot of work and cases left to handle. But my current thinking is that I'll OSS it as soon as I'm confident that it can handle all syntax including comment preservation, even if the style drifts a bit over a few releases as we improve compatibility.

charliermarsh avatar Feb 09 '23 23:02 charliermarsh

I have to say I really love the idea of using a single tool for everything related to code quality.
I actually never understood the border between linting and formatting.

On the other hand, I think Black is amazing, and I'm not sure the Python community needs another new formatter right now, especially if it is very similar to Black.


So I was wondering, how to resolve this contradiction?


This may be a naive idea, but could we include Black inside Ruff as a dependency? Like, every time a new Black version is out, a Charlie Marsh magic script could convert this Black to "Rust Black" and embed it in the next Ruff.

This way, users only install a single tool, but behind the scenes, they are using Black, with the exact same behaviour and the exact same config (but with Rust speed!)

I would love this option, but would totally understand if other users prefer something else.
Thanks for the amazing work anyway!

ddahan avatar Feb 17 '23 16:02 ddahan

> but with Rust speed

That won't happen magically without a new implementation in Rust

> On the other side, I think Black is amazing, and I'm not sure Python community needs another new formatter right now, especially if it is very similar to Black.
> So I was wondering, how to resolve this contradiction?

Whenever this happens I'm assuming it would gradually replace Black in the long term

ofek avatar Feb 17 '23 16:02 ofek

Current work-in-progress implementation has been merged into main -- now working off main as we iterate and increase test completion rate.

If you're interested in following along, https://github.com/charliermarsh/ruff/pull/2883 has some details on how it works and what it supports (and doesn't) thus far.

charliermarsh avatar Feb 17 '23 23:02 charliermarsh

Excited for this, and chiming in here on two major benefits I see coming from this as someone working on a large monorepo (Dagster):

  • Speed: ruff is way faster than black on our ~2500 python file repo. Very noticeable when running from the command line.
  • Config resolution flexibility: there are places where we are forced to duplicate black config because not every package in our codebase uses the same line length settings (we use shorter line length for code snippets that will be rendered to docs). Black doesn't support config inheritance. This also sometimes causes problems for us when running black in pre-commit due to the way black resolves config files.

smackesey avatar Mar 11 '23 17:03 smackesey

I don't think you need my opinion here but since I already started typing, here it goes!

First of all, it doesn't ruffle any feathers in Blackland that this effort here is happening. Ruff is a qualitative improvement over previous developer experience tooling for Python, one that we can't really touch with Python in terms of performance, so I'm really curious where this will go.

In fact, I like the requirements spelled out here. In particular, trying to follow the Black style while acknowledging that a 1:1 implementation is probably not feasible. Charlie, if you identify some differences that you think are clear improvements, we can also consider moving toward those in Black. That way migration from Black to Rufffmt (sorry, I had to 😅) would be more seamless.

I'm saying this because my initial urge to create the auto-formatter didn't come from a need to be the king of the Python formatting hill. Rather, I wanted a tool that formats things consistently with a set of rules that I can explain in prose, and recognize in the output. I contributed to YAPF before and tried to adopt it at Facebook where I worked at the time. This failed because the rich configurability and clever "dissatisfaction optimization" approach of YAPF caused its results to be sometimes hard to understand. No configuration file we tried saved us from very suboptimal edge cases, and when changing the weights solved a particular issue, other issues popped up somewhere else. Therefore, I just wanted a tool that does one thing and does it well.

I guess I'm softly advising against creating a fully configurable tool here. I'm not saying you shouldn't let people configure more things than Black. Our tool was created when YAPF was already a well-known and mature piece of software. The goal was to end bikeshedding and to a large extent we did. Clearly, normalizing string prefixes is still dividing people after 5 years of the tool existing so I have to conclude that going there was a mistake. I still stand behind the decision for Black not to indent in any other way than 4 spaces. It's literally the first concrete rule in PEP 8 and I think sticking to it made it easier for everybody to read and copy-paste Python code wherever it comes from.

Anyway, to re-iterate: if the end result of this effort is that the community will migrate to Rufffmt as a tool, I will be a happy man... as long as it doesn't reignite bikeshedding over formatting minutiae. The tools won't (and can't) be 1:1 interchangeable but as long as the broad strokes are same-ish, all is good. You see, not even Black is 1:1 the same between code style editions (please run with --preview if you don't mind some additional churn in exchange for getting better style faster). So I have no qualms about Ruffmt being an alternative "edition" of Black.

To be clear, Charlie, Ruff is your tool and it's ultimately your call to do whatever you want with it. I won't lose sleep over any decision. My comment here is mostly meant as encouragement and to extend the possibility of collaboration on the resulting style. I'm happy to see @ichard26 essentially said as much early on here. He's right. Don't be a stranger, we're happy to talk if you ever need anything!

ambv avatar Mar 15 '23 22:03 ambv

I always saw black as inspired by gofmt: the appeal is in not having configurable options, or as few as possible. (Although it sounds like it might be worth adding some configurable options internally, but not exposed in the commandline arguments or config files, for the blue folks to use, so they can stop monkey-patching!)

zellyn avatar Apr 18 '23 19:04 zellyn

A 2-space indent option would be a requirement for me to use it. prettier also has it.

silverwind avatar Apr 18 '23 19:04 silverwind

I'll throw in my 2c.

Line length, spaces, and quotes are ok to configure so long as the overall style remains the same as Black's (especially considering Black's style attempts to minimize diffs).

The important thing is that the Python community is slowly converging on one style, meaning code across organizations, repos, and geographical distance all looks the same. That's a huge win for developers, since it means less brain power parsing code stylistically and more power parsing it semantically.

thejcannon avatar Apr 19 '23 11:04 thejcannon

Hello! Thank you for the great project. I would like to mention a little bugbear with the Black code style that I have. Black currently formats functions such that arguments are at the same indentation as the function body.

def func(
    arg1,
    arg2,
):
    ...

However, this can be hard to read in many cases and is also different from what the PyCharm auto-formatter does. Though it may use more space, the following is easier to read, especially when there is no IDE to color the function name.

def func(
        arg1,
        arg2,
):
    ...

I understand that this issue https://github.com/psf/black/issues/1178 in Black caused some very heated exchanges but I think that having an option (or setting the default) to allow argument indentation would be a good option and be in accordance with PEP8.
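For what it's worth, the two indentation schemes differ only in layout: both parse to the same syntax tree. A minimal sketch using the standard `ast` module, with an illustrative function that is made up for this example:

```python
import ast

# Arguments indented by 4 spaces, matching the function body (Black's layout).
black_style = '''
def func(
    arg1,
    arg2,
):
    return arg1
'''

# Arguments indented by 8 spaces, the extra-indent layout described above.
hanging_style = '''
def func(
        arg1,
        arg2,
):
    return arg1
'''

# ast.dump() omits line and column attributes by default, so two sources that
# differ only in whitespace produce identical dumps.
same_ast = ast.dump(ast.parse(black_style)) == ast.dump(ast.parse(hanging_style))
print("same AST:", same_ast)
```

So the choice between them is purely about readability and diffs, which is why it could in principle be a formatter option without affecting behaviour.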

veritas9872 avatar Apr 20 '23 06:04 veritas9872

Also, the PyInk project https://github.com/google/pyink has some good ideas for which configurations should be allowed.

veritas9872 avatar Apr 20 '23 06:04 veritas9872

> Also, the PyInk project https://github.com/google/pyink has some good ideas for which configurations should be allowed.

PyInk was created as a way to ease transition for big projects that haven't used black since the beginning, so changing midway would be too disruptive.

My preference is for ruff to follow black's stance on minimal configuration and encouraging conventions. For transition situations, we then have these other tools to make it easier.

> Pyink is intended to be an adoption helper, and we wish to remove as many patches as possible in the future.
>
> - Why PyInk?

It goes without saying that there's room for improvements, but I prefer reaching for a convention whenever possible rather than make it configurable and fragmenting styles.

I think it's also valuable to balance readability with reducing noise in diffs, for example in https://github.com/psf/black/issues/1811, which is one of the configurations in PyInk.

helderco avatar Apr 20 '23 09:04 helderco

@helderco I agree that using PyInk as a benchmark would not be a good idea and that having too many options would be a bad idea. Still, I believe that the extra indentation for function arguments would not cause any extra diffs. It is also compliant with PEP8 and the PyCharm formatter. Perhaps the style should be fixed to use the different indentation scheme? However, I also understand that many people would appreciate the slightly larger extra space left for type hinting in the current Black function argument indentation scheme.

veritas9872 avatar Apr 20 '23 10:04 veritas9872

Configurable indentation is I think something we should not support due to how uncommon it is to see anything other than the style Black supports. Things like single quotes I agree with since there is a sizable population that uses them.

ofek avatar Apr 20 '23 14:04 ofek

I want to voice support for indentation config in function argument lists. Different formatters do it differently, and the black style is controversial.

I don't think @ofek is right that it's "uncommon" to see other styles on this point. Formatters before black didn't do it that way. And I've worked with several large professional python codebases, none of which used that style. My experience is by no means exhaustive but I think it points to there being significant variability.

In terms of merits, I find it hard to distinguish between arguments and function body when the indentation is the same. I understand others feel differently and that's ok!

nilsbunger avatar Apr 20 '23 16:04 nilsbunger

Indentation style is often unfairly maligned as bikeshedding, but it’s an accessibility issue. Some people need the indent to be wide so they can see it better. Some people need the indent to be thin because their large font size makes lines go way over to the right otherwise. Eyesight isn’t the same for everybody. This isn’t people being precious about having settings to play with, it’s about avoiding putting barriers to entry in front of people because they don’t have great eyesight.

The solution that solves this problem is tabs for indentation. If you want to avoid configuration, the right solution is to enforce tabs for indentation. But there’s enough established precedent in the Python community that makes this unrealistic, so making it configurable is a decent compromise.

Please don’t build a formatter that people can’t use for accessibility reasons. It’s bad enough that Black shuts down the possibility of using tabs in all the projects that use it, but its maintainer coming here and telling you should do the same just makes the problem worse. There’s no reason everybody can’t get what they want, except for the people who want to stamp out different indentation styles altogether regardless of the accessibility problems. Stamping out different indentation styles is not worth causing problems for people who don’t have great eyesight.

What PEP8 actually says is:

> Spaces are the preferred indentation method.
>
> Tabs should be used solely to remain consistent with code that is already indented with tabs.

That is not a concrete rule, that allows for reasonable flexibility. There are plenty of codebases out there that already use tabs. Enforcing spaces for indentation with your formatter would mean that they can’t do as PEP8 says and “use tabs to remain consistent with code that is already indented with tabs”. The attitude of “if you use tabs you should switch to spaces” is going against what PEP8 says. Also, PEP8 is not a religion – if I have to choose between causing accessibility problems and ignoring PEP8, I’m going to ignore PEP8 every time. Ruff is great, so I’d rather not have that prevent me from using it for formatting.

Yes, I 100% agree that configuration options should be kept to a minimum, and yes, I know everybody has different ideas about what that minimum should be, so if you don’t keep a tight grip on things it can spiral out of control. I’m completely sympathetic to the argument that configuration should be minimised and the default answer should be “no”. But that doesn’t mean being completely inflexible altogether in the face of real problems it causes. Please don’t listen to the people making this out to be bikeshedding. Other formatting preferences don’t cause accessibility problems, this one does.

This has been discussed in greater detail in Black #2798, where an accessibility-based organisation makes a good argument for it. If you’re going to participate in the discussion about this, I would recommend reading their comments first.

JimDabell avatar Apr 20 '23 17:04 JimDabell

From what I could gather from the conversation in https://github.com/psf/black/issues/1178, Black continues to use its current indentation for arguments as changing it would be too disruptive, even though it diverges from PEP8. Perhaps setting the indentation differently is feasible in a new project as perfect parity with Black is not a goal. However, there may be many good reasons to keep the Black behavior.

veritas9872 avatar Apr 21 '23 01:04 veritas9872

I'd love to see auto-wrapping of text in docstrings.

I don't really care if it's 80 characters or 88. Using the already-in-use maximum width should be fine.

WhyNotHugo avatar Apr 25 '23 18:04 WhyNotHugo

> I'd love to see auto-wrapping of text in docstrings.

That won't be quite that easy. No current tool I know of does it correctly and they all tend to break things from time to time (think references, tables, custom markdown syntax elements). For markdown in particular this is an issue since it does not have a formal specification, which means there's no one clear way to determine where it's okay to break a line.

If you're trying to do this "the right way", you'll inadvertently end up implementing a basic parser for these languages. I vaguely remember talks about extending ruff to support "Python adjacent" formats (e.g. markdown and ReST) as well, so maybe the docstring things would fit in there?
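A small sketch of the failure mode, using only the standard library's `textwrap`. The docstring content is made up for illustration, and treating the whole docstring as one paragraph is a deliberately naive strategy, not what any particular tool does:

```python
import textwrap

# A docstring body containing a small reST "simple table" (illustrative).
doc = (
    "Summary line.\n"
    "\n"
    "=====  =====\n"
    "A      B\n"
    "=====  =====\n"
    "1      2\n"
    "=====  =====\n"
)

# Naive approach: re-wrap everything as if it were one paragraph of prose.
# textwrap.fill() turns newlines into spaces before re-breaking the text.
wrapped = textwrap.fill(doc, width=40)

table_rows = ["=====  =====", "A      B", "1      2"]
rows_intact = all(row in wrapped.splitlines() for row in table_rows)

print(wrapped)
print("table rows intact:", rows_intact)
```

The table rows get merged into running text, so the markup no longer renders. Avoiding that means recognising which blocks are prose and which are structure, which is the "basic parser" problem described above.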

provinzkraut avatar Apr 25 '23 18:04 provinzkraut

My 2c: My main gripe with Black is how it forcibly unwraps method parameters (both for method calls and definitions), method chaining, comprehensions (list, sets, generators), ternary conditions, etc. when I put them on multiple lines for readability reasons to begin with. It's even worse when a couple of back-to-back methods are right under/over the line limit, as it makes the code look inconsistent.

It also ironically goes against the idea of reducing git changes and conflicts when a list/param count is suddenly just short enough. (It's fine for stubs where the definition is mainly interacted with through IDE/intellisense/type-checking anyway.)

Hence why I'm still using autopep8, even if I dislike the 2-blank-lines spacing. Similar reasons why I'm using dprint instead of prettier for JavaScript/TypeScript.

So the one configuration option I'd really be looking for is: Don't unwrap short lines

Also referencing this here since it's more formatting than linting: https://github.com/astral-sh/ruff/issues/3713 Edit: Some examples: https://github.com/jsh9/cercis/discussions/18#discussioncomment-6224080

Avasam avatar Jun 10 '23 21:06 Avasam

Hi @charliermarsh : here are my unsolicited two cents:

There are 2 main pain points to Black, which are completely orthogonal:

  1. Speed (Black is native Python which is naturally slower)
  2. Lack of configurability (which resulted in many heated and controversial conversations)

When it comes to the Ruff auto-formatter, it may be worth making a decision early on about which Black pain point you'd like to address. The 1st one, the 2nd one, or both?

I personally do not believe in Black's philosophy that "lack of configurability ends bike-shedding", which is why I created my own fork of Black and implemented many configurable options for myself and other people.

I'm not the only one who forked Black. There are quite a few of them out there (such as pyink, black-ex, tan, cblack).

Therefore, I would strongly recommend the Ruff formatter to provide configurability as well -- not just double/single quotes and indentation styles, but potentially more. Otherwise, people will either raise strong-worded issues again, or fork again, or simply not use it.

jsh9 avatar Jun 19 '23 19:06 jsh9