purebasic icon indicating copy to clipboard operation
purebasic copied to clipboard

IDE: Add Remove Trailing White-Space Functionality

Open tajmone opened this issue 4 years ago • 7 comments

During the various PRs in this project, we've all witnessed the pains of trailing whitespace causing hard-to-read diffs and spurious changes in commits.

We've finally managed (PR #57) to add to the Travis CI build a script that will detected any trailing whitespace (on non-empty lines) in PB sources and fail the checks in case there are any (other files are handled via EditorConfig/EClint).

But some users have (quite rightly so) asked how they're going to get rid of trailing whitespace in PB sources? — as it turns out, the PB IDE re-formatter doesn't remove trailing whitespace, nor it provides a similar function anywhere.

It would be important to add new trailing whitespace functionality to the PureBasic and SpiderBasic IDEs:

  • [ ] Add functionality to remove trailing whitespace (tabs or spaces) on lines that contain at least one non-whitespace character (but preserve indented empty lines added by the IDE/reformatter).
    • [ ] Add new "Edit > Remove trailing spaces" menu to activate.
    • [ ] Don't apply automatically when "Edit > Format indentation" is used.
    • [ ] Add New "Edit > Reformat source file" that re.formats indentation and remove all trailing WS in whole source file.

The main difference between "Edit > Format indentation" and "Edit > Reformat source file" is that the former currently works only on selections (forcing users to select the whole file to apply to the whole source), whereas the latter works on the whole file without requiring any selection.

(In both cases, selections and bookmarks are lost on reformatting, and the caret is placed at the beginning of the file, since this how "Edit > Format indentation" currently operates, for some reason.)

tajmone avatar Feb 08 '20 08:02 tajmone

I would very much welcome this functionality.

I have created a feature request for it in the English PB forum since several years. There the case of trailing spaces within macro blocks is noted and these should perhaps be kept untouched there. Meanwhile, I think this is a bad programming style, because surely some editors used for programming are pre-configured to remove trailing spaces automatically.

If someone is working on this feature, please note this here so that several contributors do not do the same work more than once.

SicroAtGit avatar Feb 08 '20 09:02 SicroAtGit

I think this is a bad programming style, because surely some editors used for programming are pre-configured to remove trailing spaces automatically.

I'd say that most editors support this feature, which is usually available as a menu command and as a per-file type option. In those editors that don't support it natively, it's usually available via third-party add-ons, or can be simply done via RegEx Search & Replace (s/(?<\S)[\t ]$//) — unfortunately, RegEx S&R is not available on the PB IDE (which is rather odd, not only because basically all editors except maybe Windows "Note Pad" support it, but mostly because PureBasic has an excellent native RegEx library via PCRE).

the case of trailing spaces within macro blocks is noted and these should perhaps be kept untouched there.

Any real-case scenario examples of macros that need trailing spaces preservation?

As a norm, code-style conventions are usually enforced strictly, a feature is either used or ignored. It's difficult to handle exceptions, both in IDEs and editors as well as in sources validation.

EditorConfig settings (the .editorconfig file) have a dual purpose:

  1. Inform editors and IDEs about the code conventions adopted in the project.
  2. Allow validation of submitted PRs via EditorConfig tools like EClint.

Most modern editors either support EditorConfig natively or via a plugin add-on, in which case end users don't have to do anything for the editor is made aware of the required settings via the pattern-matching rules in the .editorconfig file (like Git settings, they can target files extension, folders or even single files).

Having said that, the number of settings actually available via EditorConfig are limited (although its specs are still under development), and its not granted that editor supporting EditorConfig are able to handle all features as expected (due to internal limitations).

Since there's a delicate balance at play, between asking editors and IDEs to support those features, as well as validation tools, these settings are strict — indent_style is either tab OR space, no mixing allowed (not even extra spaces after tabs); and trim_trailing_whitespace is either true OR false (no exceptions allowed, except by ignoring specific files).

My guess is that exceptions to the rules wouldn't be supported in practice by most editors.

So, if there are real cases in which macros require trailing spaces at the line end, we'd have no choice except skipping the whole file in the repo Travis CI validation. Then, on the editor side, how are these exceptions going to be handled? We could make a special rule for the PB IDE linter, but we have no guarantees that end users are not working with other editors of their choice.

IMO, keeping good and open standards, which work on a wide variety of editors, is preferable — at the cost of some compromises.

tajmone avatar Feb 08 '20 11:02 tajmone

Any real-case scenario examples of macros that need trailing spaces preservation?

I don't know. As you can see in the forum thread, I am also not a proponent of this feature, I just agreed to a compromise. But in the meantime I definitely oppose the support of this feature, as I have written above. Perhaps you misunderstood me, because I possibly formulated my post above wrong.

SicroAtGit avatar Feb 08 '20 12:02 SicroAtGit

I don't know.

It's not clear if the examples provided in the thread are proof of concept or real cases — this seems more a PoC to me:

Macro a()
  "x
EndMacro
Macro b()
  x"
EndMacro
s.s = a() b()
Debug s

... with no practical context provided. Besides, as @#NULL has pointed out in the same thread, there are workarounds:

Is such code often in use?

I don't think so, I've never seen or used it. But that the behavior of the program might change by such a modification is something that should be considered. It can also be avoided by using

Macro a()
  "x         "+"
EndMacro

For all practical purposes, the frequency of such examples does matter, for if they are rare then the best solution is just skipping the whole file from validation (instead of revolutionising a well established standard that serves millions of projects created in every language and syntax we could think of).

Also, discussions on the PB Forum are usually contextualized to a PB-oriented niche, were people use the PB IDE and don't work collaboratively via version control. In this repository different rules might have to apply for the sake of better integration in the echo system — over there (in the PB Forum) it's "family and home", here (on GitHub) it's more "strangers in a foreign land", so we need to adapt and compromise.

What is clear from that thread, is that these edge cases do exist and that, if this feature is added to the IDE, end users should be able to toggle it on/off, and it might require to make an exception for macros (again, toggle on/off).

Personally, I think that if there's a workaround (like the "x "+" hack above) then it's good grounds to implement the feature and expect users to adopt the workaround for the edge cases, because integration into the wider echo system is always preferable to ending up walled up in a niche of exceptions to the rules — and I think that all the struggles that we're facing in trying to adapt PB IDE projects to make them Git compliant are a proof that making language-specific decisions, without taking into account the wider scenario can ultimately turn into a drawback.

Frankly, these edge-cases arguments appear to me exactly the kind of reasoning that tends toward cornering PB into a self-referential niche, instead of moving toward the main stream of the open source development echo system — widening the gap, instead of reducing it.

In any case, I've now edited the initial proposal from "Aplly …" to "Don't apply automatically when "Edit > Format indentation" is used.", because I agree that these edge cases should be respected in the PB IDE — maybe an option to enable/disable this automatism might be a good idea.

As you can see in the forum thread, I am also not a proponent of this feature, I just agreed to a compromise. But in the meantime I definitely oppose the support of this feature, as I have written above. Perhaps you misunderstood me, because I possibly formulated my post above wrong.

Yes, it's a bit confusing, for it starts with

I would very much welcome this functionality.

Also, I interpreted your sentence

Meanwhile, I think this is a bad programming style, because surely some editors used for programming are pre-configured to remove trailing spaces automatically.

as meaning that leaving behind trailing space is a bad programming habit and style, further more so since most editors offer functionality to clean it up automatically. But maybe I understood this correctly?

Indeed, leaving behind trailing spaces for no reason but laziness is bad programming habits, and it's also bad style for it increasingly affects commits diffs, and without enforced rules that ban this it can soon result in a full-fledged style war where each contributor's editor tries to enforce its own style-settings and take over the code, commit after commit, submerging them in meaningless whitespace changes.

Also, in the majority of languages and syntaxes, extra whitespace outside strings is meaningless, and with the exception of alignment adjustments (outside indentation) there's no valid reason to preserve trailing spaces. And when there are reasons to do so, usually it's done via spaces escaping (\ ) or SML entities (&nbsp;, etc.) if supported.

As that same PB Forum thread points out, macros are an exception to the rule, for PB is a line-oriented syntax that doesn't allow strings to carry on across multiple lines (unlike Python and others), and the idiomatic way of handling these edge cases seems to me the one suggested by @#NULL ( "+").

In any case, it seems to me that adding the trailing WS removal feature to the PD IDE is not going to harm anyone, for it wouldn't be enforced on anyone, and especially so if it was controllable via settings. On the contrary, I was quite surprised that the IDE didn't offer this natively.

Back to the point of this whole Issue, which came about since a user pointed out that we're asking contributors to reformat the PB sources using the PB IDE "Reformat indentation" feature, and at the same time expect all PB sources not to contain trailing spaces (which were slipping by before PR #57), except that the IDE doesn't allow that.

Maybe, the solution would be to create an IDE Add-On designed specifically for contributors to this project — which could handle trailing spaces and other checks as well, like if there are IDE settings at the end of the file, etc.

This last solution would serve our purpose, sparing us the nightmares of trailing WS proliferation, and end users from the troubles of having to find a way to pass all Travis checks — and at the same time not offend any of the edge-case paladins, who see them as a matter of purity-principle taking precedence over the practicalities of interacting with the wider echo system.

tajmone avatar Feb 08 '20 13:02 tajmone

I don't think it would be a problem. People could just turn it off and have the previous behavior if they really need it. And note that preceded whitespace isn't handled literally either, i.e. if you add more whitespace in macro b() it will be ignored already.

pb-null avatar Feb 08 '20 18:02 pb-null

Sorry, now I have totally confused you.

Here I was referring to the feature that you suggest in this issue:

I would very much welcome this functionality.

And here I was referring to the feature (trailing spaces within macro blocks [...] these should perhaps be kept untouched):

As you can see in the forum thread, I am also not a proponent of this feature, I just agreed to a compromise. But in the meantime I definitely oppose the support of this feature, as I have written above. Perhaps you misunderstood me, because I possibly formulated my post above wrong.

In summary:

  • I would very much welcome this functionality (the feature you suggest).
  • Within macro blocks there should be no rule exception — trailing spaces should be removed here as well

And note that preceded whitespace isn't handled literally either, i.e. if you add more whitespace in macro b() it will be ignored already.

Yes, only trailing spaces within macro blocks are taken into account by the PB compiler. Otherwise, change the indentation settings of the PB IDE would also change the content of the macro blocks, which would be fatal if in the overall code the preceding spaces within macro blocks are important.

SicroAtGit avatar Feb 09 '20 01:02 SicroAtGit

Stray Spaces Added by PB IDE?

It's also worth mentioning what Keya wrote in that PB Forum thread:

i dont know why but the PB IDE seems to add a lot of extra spaces after some lines for no reason (somewhat intermittently), which frustrates me a lot when im trying to quickly navigate to the end of a line (often) and have to waste time clicking over useless spaces i never added in the first place

I actually had the same impression too. It's not something I've actually tried to pin-down by trial and error, but rather something that I noticed but was never able to work out when and why it occurs, so I just thought I was imagining it, but now it seems that others have experienced this too.

Maybe it's worth trying to better understand when and why this happens in the IDE, and whether it's an undesirable side effect that shouldn't happen in the first place (and needs fixing) or the predictable result of end users carrying out erroneous typing of reformatting operations.

Visualizing Whitespace Entities in PD IDE

Various editors, e.g. Notepad++, have an option to show invisible or hidden characters like spaces, tabs and EOL sequences, which is very helpful to spot whitespace in source files.

Scintilla actually provides built-in options and functions to visualize spaces as dots, tabs as arrows, and EOLs as CR, LF or CRLF — if I remember correctly, it uses some built-in icons to do this, so that it doesn't depend on special fonts containing non-standard glyphs..

Currently the PB IDE does have an option to show whitespace:

  • Preferences > Editor > Indentation : Show whitespace characters.

But it only shows tabs and spaces as small dots, which are very hard to see, and doesn't show EOLs at all — nothing like Notepad++ way of showing these whitespace entities. I've tried changing fonts, but these dots are still far too small.

Enhancing this option in the PB IDE to fully leverage Scintilla's whitespace rendering (like in Npp) would make it easier to visually spot any problems with trailing whitespace and EOLs (e.g. mixing CRLF and LF in a same source).

tajmone avatar Feb 09 '20 13:02 tajmone