vhdl-style-guide icon indicating copy to clipboard operation
vhdl-style-guide copied to clipboard

Reactivate whitespace_001 and whitespace_002

Open staerz opened this issue 3 years ago • 49 comments

Is your feature request related to a problem? Please describe. VSG used to check if the code uses tabs vs. spaces and also removed trailing whitespace. This seems to have been removed (whitespace_001 and whitespace_002 are deprecated).

This was a very desired feature, I want it back!

Describe the solution you'd like Possibly simply re-activating is not adequate, but allowing whitespace_001 to have a configuration to either use tabs or spaces would be useful (which would be complementary to the global indentation setting).

The check for trailing whitespace (whitespace_002) should simply be re-activated.

Describe alternatives you've considered Implementing these sort of checks via stand-alone scripts (please no!)

Additional context VSG used to be great. I just discovered that a few features that made it great have been discontinued when preparing a talk for another community considering using VSG.

Not checking for these very basic things is a big bummer.

staerz avatar Apr 27 '21 23:04 staerz

Hey @staerz ,

VSG used to check if the code uses tabs vs. spaces and also removed trailing whitespace.

VSG automatically converts tabs to 2 spaces and removes trailing spaces when it reads in the file. So whitespace_001 and whitespace_002 never found anything.

I can work on implementing those rules again.

--Jeremy

jeremiah-c-leary avatar Apr 27 '21 23:04 jeremiah-c-leary

I would appreciate a rule that checks if indentation is done with tabs, not sure if the rules discussed cover that or if I should create a separate issue.

BasF0 avatar Apr 28 '21 15:04 BasF0

Hi @BasF0,

what you ask for is precisely what I was suggesting to have whitespace_001 to do.

@jeremiah-c-leary, I came to create the issue since I didn't find any info how VSG would treat that or allow a user to set/activate it.

As with any rules, I would highly encourage that no underlying assumption is done. Although I fully agree with you that trailing whitespace should be removed, no matter what, that is still our opinion and preconception but someone else might not want to bother.

The similar applies to tabs vs. spaces: There is an open war on that question in almost all programming languages, so again, the user should be free to decide.

Maybe it is somewhere documented, then please point me to it and then I'll likely make a suggestion how to improve the documentation to better find it. Otherwise, please consider reinstating whitespace_001 and whitespace_002.

staerz avatar Apr 28 '21 15:04 staerz

Ah, nice! There is some gray area with indentation vs alignment though, it would be best if that could be separated from alignment. I just found some examples in https://www.norbertdejonge.nl/pdf/Code_Indentation_Tabs_Versus_Spaces.pdf . I'm referring to "Smart Tabs", but I'm not sure if that's easily added too..

Although I fully agree with you that trailing whitespace should be removed, no matter what, that is still our opinion and preconception but someone else might not want to bother.

Counterpoint: it could be desirable to leave whitespace in place up to the current indentation-level. That way someone can just click an empty line and not have to start mashing space/tab until the indentation is right :)

BasF0 avatar Apr 28 '21 16:04 BasF0

oh, so I mis-understood. It sounds like whitespace_001 should insert or remove tabs. I do not think it would be too difficult to implement the "Smart Tabs". Maybe whitespace_001 goes away and we add an indent method with the following options: soft tab, hard tab, smart tab. Unfortunately I have individual indent rules, but the indent style could be set with a global assignment:

rules:
  global:
     indent_style: `smart tab`
     indent_size: 2

Then all of the alignment rules would understand which indent method is used and the size of the indent. I would then know how to implement the indent.

jeremiah-c-leary avatar Apr 28 '21 22:04 jeremiah-c-leary

Hi @jeremiah-c-leary,

yes, that is what I was looking up yesterday: I found the indent setting, but I didn't find any setting for using tags or spaces. Then I replaced an indent that was using 2 spaces by a tab and ran VSG again and to my surprise VSG didn't complain.

But certainly I would want the code style checker tool to be able to check for the ultimate IT question "tabs or spaces" and take my preference.

Hm, and now as I think of it as a rule, yes, I like it to be a global setting, but I could imagine that some projects could also take files from different code bases and have the idea to apply tabs in some, spaces in some others, and don't care for yet some others.

How could VSG represent that?

If that indent_style would be applicable to a given set of files only, then it would be fine (and all rules relying on that setting also active for that set of files would derive from that. And if I understand VSGs configuration correctly, that would be the case, right, @jeremiah-c-leary?

So then I agree, that sounds like a good solution and whitespace_001 can remain retired.

I though don't quite get the implications of soft tab vs. hard tab vs. smart tab, could you explain a bit what you have in mind? (For me it would be either tabs or spaces and the third options doesn't appear evident.)

staerz avatar Apr 28 '21 22:04 staerz

Hey @staerz ,

How could VSG represent that?

You could define it at the file level in a configuration:

rules:
   global:
      indent_style: `smart tab`
      indent_size: 2
file_list:
    - fifo.vhd:
       rule:
         global:
           indent_style: `hard tab`
    - ram.vhd:
       rule:
         global:
           indent_style: `soft_tab`

I though don't quite get the implications of soft tab vs. hard tab vs. smart tab, could you explain a bit what you have in mind?

The following paper makes the distinction between the three options: https://www.norbertdejonge.nl/pdf/Code_Indentation_Tabs_Versus_Spaces.pdf

I must admit the smart tab makes sense, but I think it would be hard to implement as I type.

jeremiah-c-leary avatar Apr 28 '21 23:04 jeremiah-c-leary

OK, I just learned something. Smart tab. Fancy idea, but seriously not for me. Then you might also consider elastic hard tabs (just another thing I learned today :wink: ) ... no, please not :rofl:

Thanks for the example, yes, that is what I had in mind.

So I would then say implement the 2 major and most obvious (and consistent) options of tabs and spaces (and I'd really not use the term soft_tab and hard_tab as I think it's still ambiguous: I still need to look up what it means (which is what) while if I read spaces as an option, then sure, I get spaces, as many as given by indent_size per tab while I'll get tabs when using tabs and the alignment will then also use those tab).

staerz avatar Apr 28 '21 23:04 staerz

Yeah smart tabs is nice because then everyone can configure their editor to display tabs as however many spaces they want.

I agree that the term soft_tab is going to be vague, I think spaces, tab, smart_tab (or plural for tabs) makes more sense. Maybe the term soft_tab makes more sense in an editor that inserts spaces when the user presses the tab-key, if so configured.

BasF0 avatar Apr 29 '21 08:04 BasF0

Hey @staerz and @BasF0,

I just pushed an update for whitespace_001 which will detect and fix trailing whitespaces.

When you get a chance, could you give it a spin and let me know if it works.

I need to re-read this thread on how we want to tackle the tabs. It looks like we decided to do spaces, tab and smart-tab. The last one may take a bit to figure out.

jeremiah-c-leary avatar Oct 02 '21 16:10 jeremiah-c-leary

Hi @jeremiah-c-leary,

unfortunately, I get the same behaviour as reported here: VSG is getting stuck in an endless loop.

I also get this when just cherry-picking this commit on top of 62a86502. Please do not just push stuff on master, this is a bad habit and makes it very difficult to disentangle problems.

staerz avatar Oct 04 '21 20:10 staerz

I found the issue. I was misclassifying the double quoted string in the comment as not a comment.

jeremiah-c-leary avatar Oct 06 '21 00:10 jeremiah-c-leary

Hey @staerz and @BasF0,

I am coming back to this issue to implement the whitespace_002 rule. I was looking at the three options: spaces, tabs, and smart tabs. It doesn't seem to be that tabs would be very useful if attempting to indent multi line statements when trying to align by parenthesis. I create the following example to try to understand how indenting would look like with the three options:

Image

The tabs formatting shows inconsistency in aligning closing parenthesis (line 70 and 72) and even the else option (line 73). Am I missing something? Why would there be a tabs option?

--Jeremy

jeremiah-c-leary avatar Jul 10 '22 01:07 jeremiah-c-leary

Hi @jeremiah-c-leary,

thanks for coming back to this!

I agree with your observation that the indent of the closing tabs doesn't seem consistent. I do though not agree with your explanation: If you ask me, then it's not the tabs which cause the problem, but the convention where to put the opening and closing parenthesis. On the other hand, your example is well chosen as it's super difficult to decide where they should go in the first place.

All this to say that I think what you have is good for the 3 cases.

For the smart tabs (which I personally find the worst), just to make sure that the tab size is a configuration value. (Things might look very different with a tab size of 8 for example.)

Note that this is what I would make this example look like with (hopefully) consistent indenting:

wr_en <=
	'0' when (
		read_en = '0' and fifo_status(
			fifo_full_status(
				read_address,
				write_address
			),
			fifo_empty_status(
				read_address,
				write_address
			)
		)
	) else
	'1';

(As you see I do not subscribe to the idea of aligning the content with the opening parenthesis. There is a line break and proper indenting already to show the structure ...)

staerz avatar Jul 11 '22 13:07 staerz

Hey @staerz,

Things might look very different with a tab size of 8 for example.

yes...my examples were incorrect. For the tab only it would look like this:

image

...and the smart tab would look like this assuming you wanted to align with parenthesis:

image

One take away is that you can not use tab and configure rules to align with parenthesis.

Another take away is the number of tabs is equal to the indent level when using just tab.

--Jeremy

jeremiah-c-leary avatar Jul 11 '22 17:07 jeremiah-c-leary

Hi @jeremiah-c-leary,

Another take away is the number of tabs is equal to the indent level when using just tab.

Yes, that's the idea of the entire indenting thing :rofl: (as one would use tabs to indent in the first place)

But let's not drift off here.

As our previous discussion on that matter dates back quite a while, what's the summary of the status for this issue?

I'd understand that whitespace_002 is meant to check for the proper usage of tabs or spaces (on the "left" side of the code). I also want a rule to check (and remove) trailing whitespace, do we have that?

staerz avatar Jul 11 '22 18:07 staerz

Hey @staerz,

I also want a rule to check (and remove) trailing whitespace, do we have that?

The rule whitespace_001 checks for trailing whitespace and will remove it.

--Jeremy

jeremiah-c-leary avatar Jul 11 '22 18:07 jeremiah-c-leary

Hi @jeremiah-c-leary,

The rule whitespace_001 checks for trailing whitespace and will remove it.

Ah, so it is back to life? I remember it was deprecated at some point...

staerz avatar Jul 11 '22 19:07 staerz

Hey @staerz,

Things might look very different with a tab size of 8 for example.

yes...my examples were incorrect. For the tab only it would look like this:

image

...and the smart tab would look like this assuming you wanted to align with parenthesis:

image

One take away is that you can not use tab and configure rules to align with parenthesis.

Another take away is the number of tabs is equal to the indent level when using just tab.

--Jeremy

Yeah this is the smart tab form that makes sense if you wanted to align it like that. Personally, I wouldn't align with parenthesis but that's a different discussion I guess. I see smart tabs mostly useful to align statements such as this one, which could be several indentation levels in:

b <= "1000" when a = "00" else 
     "0100" when a = "01" else 
     "0010" when a = "10" else 
     "0001" when a = "11";

By the way, maybe a useful rule generally is to check on every line that every tab only follows another tab (or beginning of the line). I don't think there is such a rule at the moment. That should be compatible with all three styles and be a nice style check.

BasF0 avatar Jul 13 '22 08:07 BasF0

Hi @BasF0,

thanks for your feedback!

maybe a useful rule generally is to check on every line that every tab only follows another tab

Yes, that's a great explicit check to add! I'm not sure if VSG currently does that check or not as we only use spaces in general, so we don't have that problem, but I totally see the use of it (and remember me facepalming myself when looking at other people's code which had intermixed spaces and tabs in the same line...

(And concerning your example, I guess how I would indent it, the "1000" would already go to the next line ...)

staerz avatar Jul 13 '22 14:07 staerz

Hey @BasF0,

maybe a useful rule generally is to check on every line that every tab only follows another tab

I was wondering what we should do with random tabs in the file. Should we expand the rule to state: A tab shall be preceded by another tab or a new line.

That would only allow series of tabs to exist at the beginning of a line.

--Jeremy

jeremiah-c-leary avatar Jul 13 '22 14:07 jeremiah-c-leary

Hi @jeremiah-c-leary,

A tab shall be preceded by another tab or a new line.

That might be too strict: Imagine somebody puts a table into comments and uses tabs to separate the columns (maybe on top of some markup character for the column separator), then this rule wouldn't allow that.

I think what the rule should check instead is that a tab is not preceded by a space.

staerz avatar Jul 13 '22 14:07 staerz

Hey @staerz,

That might be too strict: Imagine somebody puts a table into comments and uses tabs to separate the columns (maybe on top of some markup character for the column separator), then this rule wouldn't allow that.

I was thinking tabs in comments would be ignored. When the tokenizer hits -- then anything after it is a comment and combined into a single token.

I am concerned about cases like this:

case address is
  when "000"\t\t\t=>
  when "001110000"\t\t=>
  when others\t\t\t\t=>
end case;

a <= some_signal\twhen b = '0' else
     other_sig\twhen c = '1' else
     '0'\t\t\twhen d = '0' else
     '1';

The final column of the => and the when keyword is dependent on where the next tab stop is located.

So in the code itself, I was thinking tabs should only exist at the beginning of the line.

I guess the question is: should tabs only be used for indenting?

--Jeremy

jeremiah-c-leary avatar Jul 13 '22 14:07 jeremiah-c-leary

Hi @jeremiah-c-leary,

I would personally not distinguish between comments and code for this question. For my perspective, tabs vs. spaces applies on an even more basic level.

I think you'd only complicate your life ...

The cases you mention are totally fine for me: If people want to use tabs there, let them do so. I wouldn't. And yes, if VSG had rules to check the distance to the =>, then the rule should be obeyed.

I guess the question is: should tabs only be used for indenting?

I haven't read the details about smart tabs, but I would generally answer this question with a clear No!, they could also be used internally to align something as that's the nature of tabs, they come with a fixed grid with is what some developers may want to call to in their code. [And I think that would apply to hard tabs usage.]

So I propose to keep it simple and only forbid the following constructs:

case address is
  when "000"\t  \t=>
  when "001110000"  \t\t=>
  when others\t\t\t   =>
end case;

a <= some_signal \twhen b = '0' else
     other_sig\t when c = '1' else
\t    '0'\t\t \twhen d = '0' else
  \t  '1';

So simply forbidding sequences of "space-tab" and "tab-space". Don't care for the rest.

staerz avatar Jul 13 '22 15:07 staerz

Hey @BasF0,

maybe a useful rule generally is to check on every line that every tab only follows another tab

I was wondering what we should do with random tabs in the file. Should we expand the rule to state: A tab shall be preceded by another tab or a new line.

That would only allow series of tabs to exist at the beginning of a line.

--Jeremy

Yeah; I can't come up with a valid use case to use tabs anywhere other than the leading characters of a line right now.

I guess the question is: should tabs only be used for indenting?

I definitely think so. But rest assured there are people who disagree! and I think we have one! :)

@staerz Can you show an example of when you would want to keep non-leading tabs ? I'm having a hard time imagining when you would want to allow it. If you want your code to be read by people with different nr-of-spaces-per-tab, then using tabs to align always breaks alignment, no ?

I recall accidentally having a tab instead of one space in the middle of my code. It had the width of 1 space due to alignment, but definitely wasn't the right character, so I would be in favor of my own suggestion obviously.

BasF0 avatar Jul 13 '22 17:07 BasF0

Hi @BasF0,

sure, an easy example would be a table in the comments, or some concatenation that someone tries to align via tabs:

-- This is my documentation, see the table
-- | Head1\t| Head2\t|
-- | ---\t| ---\t|
-- | part1\t| part 2\t|

So here the tab would be used to align the columns vertically.

sig <= sig_a\t\t& sig_b\t\t& sig_c
         \tsignal_x\t& signal_y\tsignal_z;

Here, let's assume different lengths of signal names that are concatenated. Somebody decides to want a line break after a certain width (sig_c here) but then wants all the & characters aligned vertically.

Note that I'd be using any of these constructs, but I'm just letting my imagination run freely.

So there are your use cases.

I'm certainly not against adding rules to forbid those things, but let's please disentangle that from the very basic rule that tab and space should never be allowed to follow each other (which is the only consideration I'd make for whitespace_001!)

staerz avatar Jul 13 '22 17:07 staerz

I think both your examples are valid ones to want to align. But neither align correctly with tabs, not with a tab width of 2, 3, 4 nor 5 spaces. See figure for tab width=4: examples_tabwidth4

Seems there are too many spaces in the second line of the second example. So allow me to fix that by aligning the variables of the second line using spaces and removing the leading tab:

sig <= sig_a\t\t& sig_b\t\t& sig_c
       signal_x\t& signal_y\t& signal_z;

With a tab-width of 3, these are aligned. But if the tab width is changed to 4, alignment is broken again. See figure. example2

Now I'm not trying to make this a discussion about tabs vs spaces, but you'll understand I don't see the use-case of using tabs to align.

I'm certainly not against adding rules to forbid those things, but let's please disentangle that from the very basic rule that tab and space should never be allowed to follow each other (which is the only consideration I'd make for whitespace_001!)

Did you mean whitespace_002 ? I don't know what that did. 001 is trailing whitespace only. Sure there shouldn't be trailing spaces or tabs, but this already happens because VSG translates tabs. Spaces follow tabs in smart_tabs so unless I misunderstood I disagree :)

BasF0 avatar Jul 14 '22 07:07 BasF0

Hi @BasF0,

I didn't test how my 2 examples actually render, I just typed them up to give you a conceptual idea. We see that more tabs here and there would actually be needed, but that's a detail, the general statement still holds that tabs could be used for internal alignment.

Spaces follow tabs in smart_tabs so unless I misunderstood I disagree

OK, I was forgetting about smart tabs (things are so much simpler if you just use spaces only :rofl:), so then a tab shouldn't follow a space, that's I think we'd then agree on.

staerz avatar Jul 14 '22 13:07 staerz

Hey @staerz and @BasF0,

I would propose the following rule changes:

whitespace_002

This rule will check for the existence of tabs in the file. It will have the following options:

replace_with_spaces : all tabs are replaced with spaces. This includes tabs in comments. keep_tabs_at_beginning_of_line: only tabs which are at the beginning of a line are preserved. All other tabs are replaced with spaces.

More options could be added later if requested.

The default would be replace_with_spaces as this matches the current implementation.

whitespace_014

This rule would check for tabs after a space:

\t\t\ta <= \t'1';
\t\t  \tb <= '0';

and would remove preceding spaces:

\t\t\ta <=\t'1';
\t\t\tb <= '0';

The rules would execute in the following order:

  1. whitespace_002
  2. whitespace_014
  3. Indent rules
  4. Alignment rules

I think the indent rules are where we would convert leading whitespace to either:

  • spaces
  • tabs
  • smart tabs

which could be done on a rule by rule basis, although that could get messy.

Anyway, what do you think of this proposal?

Regards,

--Jeremy

jeremiah-c-leary avatar Jul 15 '22 12:07 jeremiah-c-leary

Hi @jeremiah-c-leary,

I'm generally OK with your proposal.

In order to make it more consistent with what we've discussed earlier, could the option names for whitespace_002 be closer to the idea of "tabs" vs. "smart tabs" vs. "spaces"?

Let's assume I have it configured to "tabs", then I would this rule to actually replace leading spaces by tabs ... in conjunction with the indenting rules. Or would you rather see the indenting rules than to make that conversion?

whitespace_014 looks good to me, but also here, depending on the setting "tabs" vs. "smart tabs" vs. "spaces", if "tabs" are chosen, one could additionally check for spaces after tabs (as that would be forbidden in a "tabs" configuration).

That's my 2 cents.

staerz avatar Jul 15 '22 13:07 staerz