Rubberduck
Rubberduck copied to clipboard
Inspection for redundant instruction separators
Redundant instruction separators are a performance hit, and debugging impediment. They should be removed. A trailing instruction separator does not produce a trailing "blank" instruction, and is not a performance hit, but should also be removed.
Sub test()
'Line numbers shouldn't have trailing colons....
10: Beep 'There are 2 steppable instructions on this line, but should only be one
20: Beep: 'There are 2 steppable instructions on this line, but should only be one
'A leading instruction separator is a leading "blank" instruction"
: Beep '2 instructions on this line
: '1 instruction on this line, but it could be a blank line
'Consecutive instruction separators are just ugly...
:::: Beep:::: 'There are 8 steppable instructions on this line
'Randomize can't be a label...
Randomize: 'There is only 1 instruction on this line, but the ":" can still be removed
End Sub
So, a parse tree inspection that finds:
- instructions separator preceding the first instruction on a logical line of code (with or without a line number).
- instruction separator following the last instruction on a logical line of code.
The edge cases are interesting though. Exactly what identifiers aren't allowed as line labels? All keywords, obviously. But do we have a list somewhere?
all keywords, obviously
Nope. I haven't done a thorough search, but...
These are all valid statements:
Do:
Loop:
Wend:
Next:
Else:
End:
Close:
These are valid labels:
LongLong: 'probably only in a 32-bit host
Text:
Binary:
Database:
Compare:
Explicit:
Module:
Base:
Property:
And of course, Rem invalidates the label colon, and makes the entire line a comment:
Rem:
Here's a list of tokens (in their original order) extracted from an Excel 2013 32-bit memory footprint (using HxD)...
Presumably a 64-bit host would add LongLong, CLngLng and DefLngLng to the list.
IDK why 0 is a token...
EDIT
This list seems to exclude Format, Debug and Print
Enum
Object
\
.
#
Module
)
Error
Integer
Long
Single
Currency
Date
String
Boolean
Decimal
Array
(
,
"
Me
?
0
Abs
Access
AddressOf
Alias
And
Any
Append
As
Assert
B
Base
BF
Binary
ByRef
Byte
ByVal
Call
Case
CBool
CByte
CCur
CDate
CDec
CDbl
CDecl
ChDir
CInt
Circle
CLng
Close
Compare
Const
CSng
CStr
CurDir
CurDir$
CVar
CVDate
CVErr
Database
Date$
Declare
DefBool
DefByte
DefCur
DefDate
DefDec
DefDbl
DefInt
DefLng
DefObj
DefSng
DefStr
DefVar
Dim
Dir
Dir$
Do
DoEvents
Double
Each
Else
ElseIf
Empty
End
EndIf
Eqv
Erase
Error$
Event
Exit
Explicit
F
False
Fix
For
Format$
FreeFile
Friend
Function
Get
Global
Go
GoSub
GoTo
If
Imp
Implements
In
Input
Input$
InputB
InputB$
InStr
InStrB
Int
Is
LBound
Left
Len
LenB
Let
Lib
Like
Line
LINEINPUT
Load
Local
Lock
Loop
LSet
Mid
Mid$
MidB
MidB$
Mod
Name
New
Next
Not
Nothing
Null
On
Open
Option
Optional
Or
Output
ParamArray
Preserve
Private
Property
PSet
Public
Put
RaiseEvent
Random
Randomize
Read
ReDim
Rem
Resume
Return
RGB
RSet
Scale
Seek
Select
Set
Sgn
Shared
Spc
Static
Step
Stop
StrComp
String$
Sub
Tab
Text
Then
To
True
Type
TypeOf
UBound
Unload
Unlock
Unknown
Until
Variant
Wend
While
Width
With
WithEvents
Write
Xor
#Const
#Else
#ElseIf
#End
#If
Attribute
VB_Base
VB_Control
VB_Creatable
VB_Customizable
VB_Description
VB_Exposed
VB_Ext_KEY
VB_HelpID
VB_Invoke_Func
VB_Invoke_Property
VB_Invoke_PropertyPut
VB_Invoke_PropertyPutRef
VB_MemberFlags
VB_Name
VB_PredeclaredId
VB_ProcData
VB_TemplateDerived
VB_VarDescription
VB_VarHelpID
VB_VarMemberFlags
VB_VarProcData
VB_UserMemId
VB_VarUserMemId
VB_GlobalNameSpace
_
CLngPtr
DefLngPtr
PtrSafe
LongPtr
!
&
'
*
+
-
/
:
;
<
<=
<>
=
=<
=>
>
><
>=
^
:=
Maybe 0 is an offset? Or the least to find label?
Based on that list....
These are all valid statements (although some are compile-time errors depending upon context)
Close:
Do:
DoEvents:
Else:
End:
Loop:
Next:
Randomize:
Rem:
Resume:
Return:
Stop:
Wend:
And these are all valid labels:
Object:
Module:
Error:
Access:
Alias:
Append:
Assert:
B:
Base:
BF:
Binary:
ChDir:
Compare:
CurDir:
CVDate:
Database:
Dir:
Explicit:
F:
Format:
FreeFile:
Go:
InStr:
InStrB:
Left:
Lib:
Line:
Load:
Mid:
MidB:
Name:
Output:
Property:
Random:
Read:
RGB:
Step:
StrComp:
Text:
Unload:
Unknown:
Width:
PtrSafe:
It seems 0 is sometimes a token because when used in the statements On Error Goto 0 and Resume 0, the 0 is treated as a keyword, and not as a line-label or integer-literal. Whereas a 0 used in On 2 Goto 2, 0, the 0 is a line-label.