perl5 icon indicating copy to clipboard operation
perl5 copied to clipboard

Add a tool which can clean up whitespace in a commit intelligently

Open demerphq opened this issue 3 years ago • 4 comments

Add a tool which can clean up whitespace in a commit intelligently based on git-blame.

Will ONLY modify lines that were changed unless asked to do the whole file.

Will not change files marked with "DO NOT EDIT THIS FILE".

Will convert tabs to spaces in files ending in .pl, .pm, .xs, .c, .h

Will remove trailing whitespace from lines.

Will remove blank lines at EOF.

demerphq avatar Feb 19 '22 13:02 demerphq

Add a tool which can clean up whitespace in a commit intelligently based on git-blame.

@demerphq, can you point to a commit where you used this program to clean up messy whitespace in the course of the commit?

jkeenan avatar Aug 07 '22 22:08 jkeenan

On Mon, 8 Aug 2022 at 00:45, James E Keenan @.***> wrote:

Add a tool which can clean up whitespace in a commit intelligently based on git-blame.

@demerphq, can you point to a commit where you used this program to clean up messy whitespace in the course of the commit?

I use it on something like 95% of my patches, it would be 100% but sometimes I forget. It would be easier to answer the opposite question, and find commits where I forgot to use it. If a comment has no whitespace issues it is almost certain that I used it.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq avatar Aug 08 '22 05:08 demerphq

On Mon, 8 Aug 2022 at 07:40, demerphq @.***> wrote:

On Mon, 8 Aug 2022 at 00:45, James E Keenan @.***> wrote:

Add a tool which can clean up whitespace in a commit intelligently based on git-blame.

@demerphq, can you point to a commit where you used this program to clean up messy whitespace in the course of the commit?

I use it on something like 95% of my patches, it would be 100% but sometimes I forget. It would be easier to answer the opposite question, and find commits where I forgot to use it. If a comment has

Obviously I meant "commit" there.

no whitespace issues it is almost certain that I used it.

I have used it on all my recent patches.

cheers. Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq avatar Aug 08 '22 05:08 demerphq

On Mon, 8 Aug 2022 at 00:43, James E Keenan @.***> wrote:

In most of the Perl code I've seen in the core distribution, 4 spaces is the equivalent of a tab.

No it not. The indent was 4 spaces, but a tab was always 8.

I'm not going to get dogmatic about 8-space tab, but can you point us to where that's documented?

I am not aware it is specifically documented in the project.

The history of the tab character is long and messy. "Tab Stops" on typewriters were user settable but usually started off at 5 characters. When the concept of the tab moved into computing it was handled in different ways in different context. Your terminal is almost certainly using a 8 character "Tab stop", meaning a tab advanced to the next column that is a multiple of 8. You can test this with a program like this:

perl -e'print "\t#\n", " " x 8, "#\n", "Foo\t#\n";'

the three '#' character should line up. You can override this in your terminal using the "tabs" command, note the help:

$ tabs --help Usage: tabs [options] [tabstop-list]

Options: -0 reset tabs -8 set tabs to standard interval

So the tabs program in my terminal considers 8 to be the standard tab stop. I bet yours does too.

In the sense of converting tabs to spaces, every tool or script I have seen has converted them to 8 spaces without respecting tab stops. So that "Foo\t" ends up being 11 characters long, not 8 that would be required for true "tab stop" behavior.

I found this on wikipedia:

"Despite five characters were the typical paragraph indentation on typewriters at that time, the horizontal tab size of eight evolved because as a power of two it was easier to calculate with the limited digital electronics available. Using this size tab to indent code results in much white space on the left, so most text editors for code, such as IDEs https://en.wikipedia.org/wiki/Integrated_development_environment, allow the size of the tab to be changed, and some (in particular on Windows) default to four instead of eight. Disagreements between programmers about what size tabs are correct https://en.wikipedia.org/wiki/Indentation_style, and whether to use tabs at all, are common.[7] https://en.wikipedia.org/wiki/Tab_key#cite_note-7 Modern text editors usually have the Tab key insert the user-defined indentation and may use heuristics to adapt this behavior to existing files."

Anyway, I am pretty confident that you will not find any code in our repo that expects a tab to line up on anything other than 8 character tab stops. People would have yelled at them a long time ago. We might not document it explicitly as tab = 8 spaces, but I believe it is implicit in our code all over the place.

For what it is worth I am extremely reluctant to patch this to use a different tab size, I would never use it myself, and I think it would just cause trouble. If someone really wants it and is going to use it then they can write the patch. I would rather not do so just because you speculate someone might want it. There are other tools that are intended to convert tabs to spaces which people can use, this one is tailored specifically to our needs:

  1. Do not touch lines the developer did not touch.
  2. Do not touch autogenerated files
  3. Do not touch files that are whitespace sensitive (eg Makefile or some of out config files).
  4. Find the files that need to be fixed using git.

Point 1 was the main reason I wrote the tool. I tend to be sloppy about whitepace changes, and I often leave trailing whitespace when I work. When I used other tools to fix my trailing whitespace, they fixed the whole file and people would get angry and upset (and rightly so!). I wanted something that would only "fix" the lines I personally changed. Hence the tool. It is not meant to be a generic "tabs2spaces" of which there are many tools. It is meant to be a "clean up your changes before you commit", similar but different.

cheers, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq avatar Aug 08 '22 06:08 demerphq