vhdl-style-guide
vhdl-style-guide copied to clipboard
Reactivate whitespace_001 and whitespace_002
Is your feature request related to a problem? Please describe. VSG used to check if the code uses tabs vs. spaces and also removed trailing whitespace. This seems to have been removed (whitespace_001 and whitespace_002 are deprecated).
This was a very desired feature, I want it back!
Describe the solution you'd like Possibly simply re-activating is not adequate, but allowing whitespace_001 to have a configuration to either use tabs or spaces would be useful (which would be complementary to the global indentation setting).
The check for trailing whitespace (whitespace_002) should simply be re-activated.
Describe alternatives you've considered Implementing these sort of checks via stand-alone scripts (please no!)
Additional context VSG used to be great. I just discovered that a few features that made it great have been discontinued when preparing a talk for another community considering using VSG.
Not checking for these very basic things is a big bummer.
Hey @staerz ,
VSG used to check if the code uses tabs vs. spaces and also removed trailing whitespace.
VSG automatically converts tabs to 2 spaces and removes trailing spaces when it reads in the file. So whitespace_001
and whitespace_002
never found anything.
I can work on implementing those rules again.
--Jeremy
I would appreciate a rule that checks if indentation is done with tabs, not sure if the rules discussed cover that or if I should create a separate issue.
Hi @BasF0,
what you ask for is precisely what I was suggesting to have whitespace_001 to do.
@jeremiah-c-leary, I came to create the issue since I didn't find any info how VSG would treat that or allow a user to set/activate it.
As with any rules, I would highly encourage that no underlying assumption is done. Although I fully agree with you that trailing whitespace should be removed, no matter what, that is still our opinion and preconception but someone else might not want to bother.
The similar applies to tabs vs. spaces: There is an open war on that question in almost all programming languages, so again, the user should be free to decide.
Maybe it is somewhere documented, then please point me to it and then I'll likely make a suggestion how to improve the documentation to better find it. Otherwise, please consider reinstating whitespace_001 and whitespace_002.
Ah, nice! There is some gray area with indentation vs alignment though, it would be best if that could be separated from alignment. I just found some examples in https://www.norbertdejonge.nl/pdf/Code_Indentation_Tabs_Versus_Spaces.pdf . I'm referring to "Smart Tabs", but I'm not sure if that's easily added too..
Although I fully agree with you that trailing whitespace should be removed, no matter what, that is still our opinion and preconception but someone else might not want to bother.
Counterpoint: it could be desirable to leave whitespace in place up to the current indentation-level. That way someone can just click an empty line and not have to start mashing space/tab until the indentation is right :)
oh, so I mis-understood. It sounds like whitespace_001 should insert or remove tabs. I do not think it would be too difficult to implement the "Smart Tabs". Maybe whitespace_001 goes away and we add an indent method with the following options: soft tab
, hard tab
, smart tab
. Unfortunately I have individual indent rules, but the indent style could be set with a global assignment:
rules:
global:
indent_style: `smart tab`
indent_size: 2
Then all of the alignment rules would understand which indent method is used and the size of the indent. I would then know how to implement the indent.
Hi @jeremiah-c-leary,
yes, that is what I was looking up yesterday: I found the indent setting, but I didn't find any setting for using tags or spaces. Then I replaced an indent that was using 2 spaces by a tab and ran VSG again and to my surprise VSG didn't complain.
But certainly I would want the code style checker tool to be able to check for the ultimate IT question "tabs or spaces" and take my preference.
Hm, and now as I think of it as a rule, yes, I like it to be a global setting, but I could imagine that some projects could also take files from different code bases and have the idea to apply tabs in some, spaces in some others, and don't care for yet some others.
How could VSG represent that?
If that indent_style
would be applicable to a given set of files only, then it would be fine (and all rules relying on that setting also active for that set of files would derive from that.
And if I understand VSGs configuration correctly, that would be the case, right, @jeremiah-c-leary?
So then I agree, that sounds like a good solution and whitespace_001 can remain retired.
I though don't quite get the implications of soft tab
vs. hard tab
vs. smart tab
, could you explain a bit what you have in mind?
(For me it would be either tabs
or spaces
and the third options doesn't appear evident.)
Hey @staerz ,
How could VSG represent that?
You could define it at the file level in a configuration:
rules:
global:
indent_style: `smart tab`
indent_size: 2
file_list:
- fifo.vhd:
rule:
global:
indent_style: `hard tab`
- ram.vhd:
rule:
global:
indent_style: `soft_tab`
I though don't quite get the implications of soft tab vs. hard tab vs. smart tab, could you explain a bit what you have in mind?
The following paper makes the distinction between the three options: https://www.norbertdejonge.nl/pdf/Code_Indentation_Tabs_Versus_Spaces.pdf
I must admit the smart tab
makes sense, but I think it would be hard to implement as I type.
OK, I just learned something. Smart tab. Fancy idea, but seriously not for me. Then you might also consider elastic hard tabs (just another thing I learned today :wink: ) ... no, please not :rofl:
Thanks for the example, yes, that is what I had in mind.
So I would then say implement the 2 major and most obvious (and consistent) options of tabs
and spaces
(and I'd really not use the term soft_tab
and hard_tab
as I think it's still ambiguous: I still need to look up what it means (which is what) while if I read spaces
as an option, then sure, I get spaces, as many as given by indent_size
per tab while I'll get tabs when using tabs
and the alignment will then also use those tab).
Yeah smart tabs
is nice because then everyone can configure their editor to display tabs as however many spaces they want.
I agree that the term soft_tab
is going to be vague, I think spaces
, tab
, smart_tab
(or plural for tabs) makes more sense. Maybe the term soft_tab
makes more sense in an editor that inserts spaces when the user presses the tab-key, if so configured.
Hey @staerz and @BasF0,
I just pushed an update for whitespace_001
which will detect and fix trailing whitespaces.
When you get a chance, could you give it a spin and let me know if it works.
I need to re-read this thread on how we want to tackle the tabs. It looks like we decided to do spaces
, tab
and smart-tab
. The last one may take a bit to figure out.
Hi @jeremiah-c-leary,
unfortunately, I get the same behaviour as reported here: VSG is getting stuck in an endless loop.
I also get this when just cherry-picking this commit on top of 62a86502. Please do not just push stuff on master, this is a bad habit and makes it very difficult to disentangle problems.
I found the issue. I was misclassifying the double quoted string in the comment as not a comment.
Hey @staerz and @BasF0,
I am coming back to this issue to implement the whitespace_002
rule. I was looking at the three options: spaces
, tabs
, and smart tabs
. It doesn't seem to be that tabs
would be very useful if attempting to indent multi line statements when trying to align by parenthesis. I create the following example to try to understand how indenting would look like with the three options:
The tabs
formatting shows inconsistency in aligning closing parenthesis (line 70 and 72) and even the else
option (line 73). Am I missing something? Why would there be a tabs
option?
--Jeremy
Hi @jeremiah-c-leary,
thanks for coming back to this!
I agree with your observation that the indent of the closing tabs doesn't seem consistent. I do though not agree with your explanation: If you ask me, then it's not the tabs which cause the problem, but the convention where to put the opening and closing parenthesis. On the other hand, your example is well chosen as it's super difficult to decide where they should go in the first place.
All this to say that I think what you have is good for the 3 cases.
For the smart tabs (which I personally find the worst), just to make sure that the tab size is a configuration value. (Things might look very different with a tab size of 8 for example.)
Note that this is what I would make this example look like with (hopefully) consistent indenting:
wr_en <=
'0' when (
read_en = '0' and fifo_status(
fifo_full_status(
read_address,
write_address
),
fifo_empty_status(
read_address,
write_address
)
)
) else
'1';
(As you see I do not subscribe to the idea of aligning the content with the opening parenthesis. There is a line break and proper indenting already to show the structure ...)
Hey @staerz,
Things might look very different with a tab size of 8 for example.
yes...my examples were incorrect. For the tab
only it would look like this:
...and the smart tab
would look like this assuming you wanted to align with parenthesis:
One take away is that you can not use tab
and configure rules to align with parenthesis.
Another take away is the number of tabs is equal to the indent level when using just tab
.
--Jeremy
Hi @jeremiah-c-leary,
Another take away is the number of tabs is equal to the indent level when using just
tab
.
Yes, that's the idea of the entire indenting thing :rofl: (as one would use tabs to indent in the first place)
But let's not drift off here.
As our previous discussion on that matter dates back quite a while, what's the summary of the status for this issue?
I'd understand that whitespace_002
is meant to check for the proper usage of tabs or spaces (on the "left" side of the code).
I also want a rule to check (and remove) trailing whitespace, do we have that?
Hey @staerz,
I also want a rule to check (and remove) trailing whitespace, do we have that?
The rule whitespace_001
checks for trailing whitespace and will remove it.
--Jeremy
Hi @jeremiah-c-leary,
The rule
whitespace_001
checks for trailing whitespace and will remove it.
Ah, so it is back to life? I remember it was deprecated at some point...
Hey @staerz,
Things might look very different with a tab size of 8 for example.
yes...my examples were incorrect. For the
tab
only it would look like this:
...and the
smart tab
would look like this assuming you wanted to align with parenthesis:
One take away is that you can not use
tab
and configure rules to align with parenthesis.Another take away is the number of tabs is equal to the indent level when using just
tab
.--Jeremy
Yeah this is the smart tab form that makes sense if you wanted to align it like that. Personally, I wouldn't align with parenthesis but that's a different discussion I guess. I see smart tabs mostly useful to align statements such as this one, which could be several indentation levels in:
b <= "1000" when a = "00" else
"0100" when a = "01" else
"0010" when a = "10" else
"0001" when a = "11";
By the way, maybe a useful rule generally is to check on every line that every tab only follows another tab (or beginning of the line). I don't think there is such a rule at the moment. That should be compatible with all three styles and be a nice style check.
Hi @BasF0,
thanks for your feedback!
maybe a useful rule generally is to check on every line that every tab only follows another tab
Yes, that's a great explicit check to add! I'm not sure if VSG currently does that check or not as we only use spaces in general, so we don't have that problem, but I totally see the use of it (and remember me facepalming myself when looking at other people's code which had intermixed spaces and tabs in the same line...
(And concerning your example, I guess how I would indent it, the "1000"
would already go to the next line ...)
Hey @BasF0,
maybe a useful rule generally is to check on every line that every tab only follows another tab
I was wondering what we should do with random tabs in the file. Should we expand the rule to state: A tab shall be preceded by another tab or a new line.
That would only allow series of tabs to exist at the beginning of a line.
--Jeremy
Hi @jeremiah-c-leary,
A tab shall be preceded by another tab or a new line.
That might be too strict: Imagine somebody puts a table into comments and uses tabs to separate the columns (maybe on top of some markup character for the column separator), then this rule wouldn't allow that.
I think what the rule should check instead is that a tab is not preceded by a space.
Hey @staerz,
That might be too strict: Imagine somebody puts a table into comments and uses tabs to separate the columns (maybe on top of some markup character for the column separator), then this rule wouldn't allow that.
I was thinking tabs in comments would be ignored. When the tokenizer hits --
then anything after it is a comment and combined into a single token.
I am concerned about cases like this:
case address is
when "000"\t\t\t=>
when "001110000"\t\t=>
when others\t\t\t\t=>
end case;
a <= some_signal\twhen b = '0' else
other_sig\twhen c = '1' else
'0'\t\t\twhen d = '0' else
'1';
The final column of the =>
and the when
keyword is dependent on where the next tab stop is located.
So in the code itself, I was thinking tabs should only exist at the beginning of the line.
I guess the question is: should tabs only be used for indenting?
--Jeremy
Hi @jeremiah-c-leary,
I would personally not distinguish between comments and code for this question. For my perspective, tabs vs. spaces applies on an even more basic level.
I think you'd only complicate your life ...
The cases you mention are totally fine for me: If people want to use tabs there, let them do so. I wouldn't. And yes, if VSG had rules to check the distance to the =>
, then the rule should be obeyed.
I guess the question is: should tabs only be used for indenting?
I haven't read the details about smart tabs, but I would generally answer this question with a clear No!, they could also be used internally to align something as that's the nature of tabs, they come with a fixed grid with is what some developers may want to call to in their code. [And I think that would apply to hard tabs usage.]
So I propose to keep it simple and only forbid the following constructs:
case address is
when "000"\t \t=>
when "001110000" \t\t=>
when others\t\t\t =>
end case;
a <= some_signal \twhen b = '0' else
other_sig\t when c = '1' else
\t '0'\t\t \twhen d = '0' else
\t '1';
So simply forbidding sequences of "space-tab" and "tab-space". Don't care for the rest.
Hey @BasF0,
maybe a useful rule generally is to check on every line that every tab only follows another tab
I was wondering what we should do with random tabs in the file. Should we expand the rule to state: A tab shall be preceded by another tab or a new line.
That would only allow series of tabs to exist at the beginning of a line.
--Jeremy
Yeah; I can't come up with a valid use case to use tabs anywhere other than the leading characters of a line right now.
I guess the question is: should tabs only be used for indenting?
I definitely think so. But rest assured there are people who disagree! and I think we have one! :)
@staerz Can you show an example of when you would want to keep non-leading tabs ? I'm having a hard time imagining when you would want to allow it. If you want your code to be read by people with different nr-of-spaces-per-tab, then using tabs to align always breaks alignment, no ?
I recall accidentally having a tab instead of one space in the middle of my code. It had the width of 1 space due to alignment, but definitely wasn't the right character, so I would be in favor of my own suggestion obviously.
Hi @BasF0,
sure, an easy example would be a table in the comments, or some concatenation that someone tries to align via tabs:
-- This is my documentation, see the table
-- | Head1\t| Head2\t|
-- | ---\t| ---\t|
-- | part1\t| part 2\t|
So here the tab would be used to align the columns vertically.
sig <= sig_a\t\t& sig_b\t\t& sig_c
\tsignal_x\t& signal_y\tsignal_z;
Here, let's assume different lengths of signal names that are concatenated.
Somebody decides to want a line break after a certain width (sig_c
here) but then wants all the &
characters aligned vertically.
Note that I'd be using any of these constructs, but I'm just letting my imagination run freely.
So there are your use cases.
I'm certainly not against adding rules to forbid those things, but let's please disentangle that from the very basic rule that tab
and space
should never be allowed to follow each other (which is the only consideration I'd make for whitespace_001
!)
I think both your examples are valid ones to want to align. But neither align correctly with tabs, not with a tab width of 2, 3, 4 nor 5 spaces. See figure for tab width=4:
Seems there are too many spaces in the second line of the second example. So allow me to fix that by aligning the variables of the second line using spaces and removing the leading tab:
sig <= sig_a\t\t& sig_b\t\t& sig_c
signal_x\t& signal_y\t& signal_z;
With a tab-width of 3, these are aligned. But if the tab width is changed to 4, alignment is broken again. See figure.
Now I'm not trying to make this a discussion about tabs vs spaces, but you'll understand I don't see the use-case of using tabs to align.
I'm certainly not against adding rules to forbid those things, but let's please disentangle that from the very basic rule that tab and space should never be allowed to follow each other (which is the only consideration I'd make for whitespace_001!)
Did you mean whitespace_002 ? I don't know what that did. 001 is trailing whitespace only. Sure there shouldn't be trailing spaces or tabs, but this already happens because VSG translates tabs. Spaces follow tabs in smart_tabs
so unless I misunderstood I disagree :)
Hi @BasF0,
I didn't test how my 2 examples actually render, I just typed them up to give you a conceptual idea. We see that more tabs here and there would actually be needed, but that's a detail, the general statement still holds that tabs could be used for internal alignment.
Spaces follow tabs in smart_tabs so unless I misunderstood I disagree
OK, I was forgetting about smart tabs (things are so much simpler if you just use spaces only :rofl:), so then a tab shouldn't follow a space, that's I think we'd then agree on.
Hey @staerz and @BasF0,
I would propose the following rule changes:
whitespace_002
This rule will check for the existence of tabs in the file. It will have the following options:
replace_with_spaces
: all tabs are replaced with spaces. This includes tabs in comments.
keep_tabs_at_beginning_of_line
: only tabs which are at the beginning of a line are preserved. All other tabs are replaced with spaces.
More options could be added later if requested.
The default would be replace_with_spaces
as this matches the current implementation.
whitespace_014
This rule would check for tabs after a space:
\t\t\ta <= \t'1';
\t\t \tb <= '0';
and would remove preceding spaces:
\t\t\ta <=\t'1';
\t\t\tb <= '0';
The rules would execute in the following order:
- whitespace_002
- whitespace_014
- Indent rules
- Alignment rules
I think the indent rules are where we would convert leading whitespace to either:
- spaces
- tabs
- smart tabs
which could be done on a rule by rule basis, although that could get messy.
Anyway, what do you think of this proposal?
Regards,
--Jeremy
Hi @jeremiah-c-leary,
I'm generally OK with your proposal.
In order to make it more consistent with what we've discussed earlier, could the option names for whitespace_002
be closer to the idea of "tabs" vs. "smart tabs" vs. "spaces"?
Let's assume I have it configured to "tabs", then I would this rule to actually replace leading spaces by tabs ... in conjunction with the indenting rules. Or would you rather see the indenting rules than to make that conversion?
whitespace_014
looks good to me, but also here, depending on the setting "tabs" vs. "smart tabs" vs. "spaces", if "tabs" are chosen, one could additionally check for spaces after tabs (as that would be forbidden in a "tabs" configuration).
That's my 2 cents.