vblang
Option Infer Turbo (aka Smart Variable Typing)
I often get asked why, when you perform a type test, the variable doesn't magically get that type inside the If.
Dim obj As Object = "A string"
If TypeOf obj Is String AndAlso obj.Length > 0 Then
obj.StringStuff()
End If
There's a list of reasons why it's not that simple but I think I have a design that addresses them.
Back-compat
This is worth burning an Option statement on. When we added local type inference it would have been a breaking change, so we added Option Infer for back-compat reasons, leaving it Off on project upgrade but On for new projects.
Caveats
- Only works on non-static local variables and value (ByVal) parameters.
This avoids problems where a property would return a different object of a different type on subsequent invocations, or a field or ByRef parameter is mutated on a different thread or even the same thread. Both the current pattern (first TryCast the value into a local, then test it for null) and pattern matching require the same thing, so it's not a detractor vs. alternatives.
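For comparison, here is roughly what that current pattern looks like in today's VB. It's only a minimal sketch; the module and function names (TryCastPattern, NonEmptyLength) are made up for illustration and aren't part of the proposal.

```vb
Option Strict On
Imports System

Module TryCastPattern
    ' Hypothetical helper: returns the length when obj is a non-empty String, otherwise 0.
    Function NonEmptyLength(obj As Object) As Integer
        ' Copy the value into a strongly typed local first...
        Dim s As String = TryCast(obj, String)
        ' ...then test the local for Nothing before touching String members.
        If s IsNot Nothing AndAlso s.Length > 0 Then
            Return s.Length
        End If
        Return 0
    End Function

    Sub Main()
        Console.WriteLine(NonEmptyLength("A string")) ' prints 8
        Console.WriteLine(NonEmptyLength(42))         ' prints 0
    End Sub
End Module
```

Because the test happens on the local s, nothing that later changes obj can invalidate it, which is the same guarantee the caveat above is after.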
How does it work under the hood?
I think of it like a leaky binder. When you have constructs which have boolean conditionals there's an opportunity for a binder (context) to "leak" out either on the "true path" or the "false path" depending on the operators involved. In this context there's a sort of shadow variable with the same name as the variable being tested with the type of the type test.
So, for example, take the expression TypeOf obj Is String AndAlso obj.Length > 0 OrElse obj Is Nothing. The binder "leaks" into the right operand of the AndAlso operator, so in that context 'obj' refers to the String-typed 'obj' variable, not the original one. It doesn't leak into the right side of the OrElse because that's not on the "true path". By contrast, in the expression TypeOf obj IsNot String OrElse obj.Length = 0 the binder does leak into the right-hand operand of the OrElse because TypeOf ... IsNot ... leaks on the "false path".
This is what lets guard statements work:
If TypeOf obj IsNot String Then Throw New Exception()
' obj has type 'String' here.
The "scope" of the binder is everything after the If statement (within the same block). This means that within that scope overload resolution will always treat obj as a String.
This leaking has to apply to the short-circuiting logical operators, the ternary conditional operator, If, Do, and While statements, and maybe When clauses on exceptions. So, for example:
' This code has a bug in it, I know.
' Or maybe this should have been 'Do While TypeOf node IsNot StatementSyntax'
Do Until TypeOf node Is StatementSyntax
node = node.Parent
Loop
' At this point, node has the type StatementSyntax.
This all happens during "initial binding"; it's not based on flow-analysis.
What about Where clauses in queries?
We can go one of two ways.
- You only get the strong typing within the Where if the expression is joined with a Boolean operator or conditional, because we can't know if the Where clause actually executed the lambda, and the use of this feature should never result in exceptions.
- We could translate the Where into a Let, a Where, and then a Select. It's a bit of a stretch but we're already doing magic on this feature so...
Does it automagically upcast?
This doesn't happen if the type test would widen the type of the variable so:
Dim str As String = ""
If TypeOf str Is Object Then
' str is NOT reduced to 'Object' here.
End If
What if the same variable is tested multiple times?
The types are intersected. We actually support intersection types in generic methods today when a type parameter has multiple constraints. It's the one place in the language where you can say something is an IDisposable AND an IComparable, so we should follow all the same rules there.
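As a reminder of what that looks like in today's VB, here is a small sketch of a generic method whose type parameter carries both constraints. The names ConstraintDemo, CompareThenDispose and Ticket are invented for this example.

```vb
Option Strict On
Imports System

' Hypothetical type that is both IComparable and IDisposable.
Class Ticket
    Implements IComparable, IDisposable

    Public ReadOnly Number As Integer
    Public Disposed As Boolean

    Sub New(value As Integer)
        Number = value
    End Sub

    Public Function CompareTo(obj As Object) As Integer Implements IComparable.CompareTo
        Return Number.CompareTo(DirectCast(obj, Ticket).Number)
    End Function

    Public Sub Dispose() Implements IDisposable.Dispose
        Disposed = True
    End Sub
End Class

Module ConstraintDemo
    ' T is known to be an IComparable AND an IDisposable,
    ' so members of both interfaces are available without any cast.
    Function CompareThenDispose(Of T As {IComparable, IDisposable})(value As T, other As Object) As Integer
        Dim result As Integer = value.CompareTo(other)
        value.Dispose()
        Return result
    End Function
End Module
```

Calling CompareThenDispose(New Ticket(1), New Ticket(2)) compares the two tickets and then disposes the first one, all through the single parameter value.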
What about value types?
The idea is that this feature creates strongly typed aliases to objects. So the scenario for testing for a value type necessarily requires a boxed value type on the heap. Today when you unbox a value type from the heap we immediately copy it into a local variable so that any mutation to the copy doesn't change the value on the heap. For this feature we want to preserve the idea that it's just a strongly-typed reference, not a copy, and IL lets us do this. The unbox IL instruction actually pushes a managed reference on the stack. Instead of copying the value type we can copy this reference into a "ref local" (and this would be transparent to the user), so a mutation to that value, whether through, say, an interface method or the typed value, will be consistent. It's critical to preserve identity.
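To make the identity concern concrete, here is a sketch of today's behavior, where unboxing yields a copy while an interface call works on the boxed value itself. ICounter, Counter and BoxedCopyDemo are hypothetical names used only for this illustration.

```vb
Option Strict On
Imports System

Interface ICounter
    Sub Increment()
    ReadOnly Property Count As Integer
End Interface

Structure Counter
    Implements ICounter

    Private _count As Integer

    Public Sub Increment() Implements ICounter.Increment
        _count += 1
    End Sub

    Public ReadOnly Property Count As Integer Implements ICounter.Count
        Get
            Return _count
        End Get
    End Property
End Structure

Module BoxedCopyDemo
    ' Unbox into a local, mutate the local, then read the boxed value again.
    Function BoxedCountAfterCopyIncrement() As Integer
        Dim obj As Object = New Counter()                 ' boxed on the heap
        Dim copy As Counter = DirectCast(obj, Counter)    ' unboxing copies the value
        copy.Increment()                                  ' mutates only the copy
        Return DirectCast(obj, Counter).Count             ' boxed value is unchanged
    End Function

    ' Mutate through the interface, which operates on the boxed value in place.
    Function BoxedCountAfterInterfaceIncrement() As Integer
        Dim obj As Object = New Counter()
        DirectCast(obj, ICounter).Increment()
        Return DirectCast(obj, Counter).Count
    End Function
End Module
```

The first function returns 0 and the second returns 1. The proposal's "ref local" idea is about making the typed alias behave like the second case rather than the first.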
Are the variables immutable?
No. But here are the rules for mutation:
- Within that scope you can assign the variable a value of the same type or a more derived type, as long as the invariants at that point aren't broken. Under the hood we'd have to reassign every alias up to that point, I guess.
- You can also assign things of a wider type (anything assignable to the original variable). This does not cause an implicit narrowing conversion. Instead, from that point on it's illegal to use that variable in a manner which relies on the type guard having succeeded. That's where flow analysis comes in. So even if you re-assign an Object variable which has been promoted to a String variable with an Integer value, you can still use it like an Object. It's just that any code which used it like a String, including overload resolution, type inference, member accesses, etc., will report an error.
Dim obj As Object = ""
If TypeOf obj Is String Then
Console.WriteLine(obj) ' Calls String overload.
GC.KeepAlive(obj) ' Calls Object overload. No error.
obj = 1
GC.KeepAlive(obj) ' Calls Object overload. No error.
Console.WriteLine(obj) ' Still calls String overload but reports an error.
End If
This way flow analysis doesn't have to feed type information back into initial binding. It sort of works on the idea that the Object alias of obj gets re-assigned, but the String alias of obj becomes unassigned. So flow analysis just tracks reads of the String alias while it is unassigned. In theory one could reassign the String alias of obj to fix this. And any usage of obj as an Object (e.g. by calling members of Object or through an implicit widening conversion) really reads from the Object alias, so it doesn't count as a read from unassigned.
The solution in this situation is either to remove the write to the variable, re-guard the code that requires obj to be String, or explicitly cast obj to Object. While all of those workarounds seem ugly, they're also the only legitimate code to write in those situations.
This idea that flow analysis reports an error rather than "downgrading" the type is super important to avoid silently changing the meaning of code with shadowed members:
Class C
Public Shadows ToString As Integer = 5
End Class
Dim obj As Object = New C
If TypeOf obj Is C Then
Console.WriteLine(obj.ToString) ' Calls Integer overload.
obj = New Object
' Still calls Integer overload but reports an error.
' Doesn't silently start calling Object overload when you
' add the line of code above.
Console.WriteLine(obj.ToString)
End If
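The underlying hazard can already be observed in today's VB with explicit casts: the same member name binds to different things depending on the static type of the expression. This is a sketch of current behavior only, not of the proposed feature, and ShadowDemo, ViaC and ViaObject are invented names.

```vb
Option Strict On
Imports System

Class C
    Public Shadows ToString As Integer = 5
End Class

Module ShadowDemo
    ' Static type C: the name ToString binds to the shadowing Integer field.
    Function ViaC(obj As Object) As Integer
        Return DirectCast(obj, C).ToString
    End Function

    ' Static type Object: the same name binds to the Object.ToString() method.
    Function ViaObject(obj As Object) As String
        Return obj.ToString()
    End Function

    Sub Main()
        Dim obj As Object = New C
        Console.WriteLine(ViaC(obj))      ' prints 5
        Console.WriteLine(ViaObject(obj)) ' prints the type name of C
    End Sub
End Module
```

If the compiler quietly "downgraded" obj after a widening assignment, code like ViaC would turn into ViaObject without any visible change at the call site, which is why an error is preferable.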
What about Gotos?
The same assignment analysis applies. If the reference is reachable at a point where the alias is unassigned, an error is reported and the same solutions apply:
Dim str As Object = ""
If TypeOf str Is String Then
1:
Console.WriteLine(str) ' Error reported.
End If
Goto 1
Does an assignment cause re-inference if a narrower type is assigned?
That would be madness. We should discuss it!
What about Select Case on type?
I've always thought of the principal function of Select Case as being to use the same "left" operand for multiple tests without repeating the name over and over. So if TypeOf is the operator, the natural syntax for Select Case would look like applying it multiple times.
Select Case TypeOf obj
Case Is String
' obj has String type here.
Case Is Integer
' obj has Integer type here.
End Select
Or
Select Case obj
Case TypeOf Is String
' obj has String type here.
Case TypeOf Is Integer
' obj has Integer type here.
End Select
The advantage of the first form is it has a little less repetition of the TypeOf keyword and reads very straightforwardly: "What's the syntax in VB for doing a Select Case on the type of an object?" Select Case TypeOf obj.
The advantage of the second form is it doesn't put Select Case into any special mode, so you can still use all the other kinds of Case clauses in the same block. I don't know how often that's actually a scenario though.
Both forms reuse a concept already in the language (TypeOf) and don't add a whole new thing (Match) for a common scenario. In a lot of ways the Case s As String design was a consolation prize to semantics like this.
How would this work in the IDE?
I imagine we'd use a slightly different classification to indicate that the variable is "enhanced" at that point in the program. So let's say your identifiers are black by default; in a region where the type has been re-inferred they'll be purple. Then, if you lose the enhancement somehow, they'll go back to black. Maybe if you hover over it the quick info will say something like "This variable has been enhanced with String type and can be used like a string here." or something.
Summary
I think this is the most "Visual Basic" feature ever! It's very "Do what I mean" and is fairly intuitive. The last time a developer asked me why, when he checks the type, the variable doesn't automatically get that type, I sat down to write a whole blog essay about all the technical reasons that won't work. For VB, as much as we can, it's nice to avoid a first-time programmer needing to read an essay from some compiler nerd about threading and overload resolution and shadowing (like what are any of those things?) to explain why their very reasonable intuition doesn't work.
And this is nothing particularly innovative or out there; this is actually how TypeScript and other languages work already.
I also like the idea of rehabilitating the very readable TypeOf operator, which I've felt has suffered a lot since the introduction of TryCast. It's like TypeOf is so self-explanatory, but we have this sort of inside-baseball gotcha that "Ah-ha, FxCop will tell you that really TypeOf uses the isinst instruction, which pushes a casted value on the stack and checks it for null, so doing a castclass after that is really just casting twice, so you shouldn't do it and instead you should use the TryCast operator and check for null for performance, or FxCop and people on forums will laugh at you--THEY'RE ALL GOING TO LAUGH AT YOU!". From the same folks who brought you "Ah-ha! Lists start with 0 here because of pointer arithmetic :)"
@AnthonyDGreen Does this strategy work with multiple types? (eg #23)
If TypeOf obj Is T0 OrElse TypeOf obj Is T1 OrElse TypeOf obj Is T2 ... Then
' What's the type of obj ?
End If
No, because the only type obj could have would have to be a union type, which we don't have in VB/C# yet. And even then, interacting with it would require separate type checks.
I guess given two class types we could compute the most derived common ancestor. That could be pretty neat, actually.
I thought more about it. I think nearest common ancestor would be very complicated implementation-wise, and in that case it's enough to explicitly type obj as the nearest common ancestor to get the same effect. I would like nearest common ancestor for ternary If though...
Is this a real type change of the local variable (which implies either a change to the runtime, IL, or lots of injected type conversions each time it's referenced) or a new variable using the same name?
If it's a new variable, what happens if you assign to the variable inside the block and then reference it outside the block?
Dim obj As Object = "Initial Text"
If TypeOf obj Is String Then
obj = "Different Text"
End If
Console.WriteLine(obj)
What text gets written?
See the section "Are the variables immutable?". I think it answers your questions.
Thank you for an awesome long weekend holiday read. I really fancy that the way type inference works in TypeScript (and typeless JavaScript) has an influence on how we can resurrect an old keyword like TypeOf.
Q: Does intellisense not get confused when you hover over different variables and it decides that there's a more specific type in play within the block?
@johnnliu,
IntelliSense doesn't get confused so long as the compiler doesn't. It just asks the compiler for its understanding of the identifier under the cursor and displays the result. Sometimes there's some extra smarts to link up related but otherwise separate entities.
@AnthonyDGreen Greatest Common Type (GCT) is already used in array literals, so it should be possible to use it in the multiple possible type scenario.
@AnthonyDGreen Or am I thinking of Dominant Type?
That is dominant type.
Kind of reminds me of pattern matching, although it's not really the same thing. Is this supposed to be the "VB version" of pattern matching, or would VB eventually get both features?
@esentio
Pattern matching is a general term that can mean different things. What I've referred to in the past as "Type Case":
Select Case obj
Case t As T
End Select
could be described as a special case of pattern matching or it could not. For Visual Basic 2015 we were originally looking at doing it stand-alone (as a simple extension to Select Case). In Visual Basic 2017 we were thinking of implementing it as a special case of a broader pattern-matching infrastructure (which didn't exist) as C# 7 did. For VB 16 we're thinking of addressing the scenario of concisely checking and casting an object as a stand-alone scenario without subsuming it under the umbrella of pattern matching.
That said, there are still scenarios beyond type-checking for which pattern matching could add value. Proposals #101, #140, #139, #141, #160, and #124 discuss those scenarios.
So it's not so much that this is pattern matching or an alternative to all pattern matching. It is one approach to addressing a common programming scenario which could also be solved by some forms of pattern matching. Some languages take this approach and others rely solely on pattern matching, however they are not mutually exclusive.
That said, whatever pattern matching VB does get will depend on the merits of those other scenarios and right now #140, and to a lesser extent #139 and #101 are the only scenarios that I feel would significantly move the needle for VB users (myself included). The rest seems neat but uncommon. What do you think?
If TypeOf obj Is s As String AndAlso s.Length > 0 Then
End If
I think the perfect syntax can result from combining Anthony's proposal with mine, so that we don't need to declare any new variables to deal with the target type. The Select TypeOf will be the indication to the compiler to do this trick:
Select TypeOf O
Case Nothing
Console.WriteLine("Nothing")
Case String
Console.WriteLine(O(0))
Case Date
Console.WriteLine(O.ToShortDateString( ))
End Select
Which can be lowered to:
If O Is Nothing Then
Console.WriteLine("Nothing")
ElseIf TypeOf O Is String Then
Dim O1 = CType(O, String)
Console.WriteLine(O1(0))
ElseIf TypeOf O Is Date Then
Dim O2 = CType(O, Date)
Console.WriteLine(O2.ToShortDateString())
End If
which can avoid any complications in Anthony's proposal.
any complications in Anthony's proposal.
Please clarify what complications you are referring to. Your syntax is irrelevant to this proposal, which discusses aliasing the current variable to the type described by the TypeOf ... Is ... test.
This:
If O Is Nothing Then
Console.WriteLine("Nothing")
ElseIf TypeOf O Is String Then
Console.WriteLine(O(0))
ElseIf TypeOf O Is Date Then
Console.WriteLine(O.ToShortDateString( ))
End If
could also be lowered in the way you describe, without introducing any new syntax.
And if what's bothering you is the clunkiness of the TypeOf ... Is ... syntax, then that should certainly be addressed.
When my mental health allows, I am still investigating a TypeClauseSyntax for Select Case; I'm after seeing how compatible it is with the current IsClause.
Select Case obj
Case Is Nothing
Console.WriteLine("Nothing")
Case Is String Into S
Console.WriteLine($"Is String {S}")
Case Is Date Into D
Console.WriteLine($"Is Date {D}")
Case Else
End Select
It would be a great fit with When clauses as well, e.g.
Case Is String Into S When S.Length > 2
Console.WriteLine($"Is String {S}")