linter Make omit_local_variable_type permit non-trivial declared types

This issue is a proposal that omit_local_variable_types is mildened such that it only flags local variable declarations with a declared type in the case where the variable has an initializing expression with a type which is not 'obvious' (as defined below).

This is a non-breaking change, because it makes omit_local_variable_types flag a smaller set of situations.

It should still flag cases like int i = 1; (which should be var i = 1;), List<int> xs = [1, 2, 3]; (which should be var xs = [1, 2, 3];), C<num> c = C(1.5); where C is a class (it should be var c = C<num>(1.5);), C<double> c = C(1.5) (which should be var c = C(1.5), assuming that it infers as C<double>(1.5)),etc, because the initializing expression has an 'obvious' type.

But it should allow Foo<int> foo = someFunction(a, b()); even though that might also be expressible as var foo = someFunction<Whatever, It, Takes, To, Make, It, A, Foo_int>(a, b<Perhaps, More, Stuff>());. It should actually also allow it even in the case where the same thing is expressible as var foo = someFunction(a, b());, because the type of the initializing expression isn't obvious.

The motivation for doing this is the following:

The declared type of a local variable may be considered to be a simple, self-documenting, declarative approach to determine which actual type arguments to provide at various locations in the initializing expression. If, alternatively, we pass some actual type arguments in the initializing expression, we might infer the same declared type, but (1) it is not documenting the type of the variable (which means that it may be hard to understand how to use it correctly), (2) it is not simple (because we may need to look up various declarations in order to determine the type of the initializing expression), and (3) it is not declarative (because we provide feed-forward information in the initializing expression, rather than specifying the desired end result and relying on type inference to sort out the details). For example:

// Some other libraries contain various declarations that we're importing and using.
// The following declarations play that role.

Map<X, List<X>> f<X>(A<X> a) {
  var ax = a.x;
  return ax != null ? {ax: []} : {};
}

X g<X>(Map<X, List<X>> map) => map.keys.first;

class A<X> {
  X? x;
  A(this.x);
}

// We are using the above declarations.

void main() {
  // We can omit the variable type, because it is inferred.
  var v1 = g<int>(f(A(null)));

  // We can also declare the variable type.
  int v2 = g<int>(f(A(null)));
  ...
}

The lint omit_local_variable_types will flag v2, because the declared type is seen as unnecessary. The perspective that motivates this issue is that it is indeed possible to infer exactly the same declared type, but it may still be helpful for all of us to have that declared type when we're reading the code. In particular, we'll at least need to look up the declaration of g before we can even start guesstimating the type of v1. So the proposal here is to make omit_local_variable_types stop linting these non-obvious cases, thus allowing developers to document the type where it's actually needed.

(Interestingly, omit_local_variable_types does not flag int v2 = g(f(A(null)));, which seems to contradict the documentation of the lint, so maybe the lint is already doing some of the things which are proposed here. It clearly does not flag every local variable declaration that has a declared type which is also the inferred type of the initializing expression).

Note that it is assumed that omit_local_variable_types will never flag a declaration T v = e; where the declared type T differs from the type of e which is inferred with an empty context type, or where the type inference on e with an empty context type produces a different result (different actual type arguments, anywhere in e) than type inference on e with the context type T. This proposal is not intended to change anything about that, and it should not affect any of the issues where it is reported that omit_local_variable_types has a false positive, because it lints a case where it does make a difference. This proposal is only about not shouting at people when they have a declared type that may arguably be helpful.

Obvious Expression Types

In order to have a mechanically decidable notion of what it takes for a type to be 'obvious' we need a precise definition. The point is that it should recognize a number of types that are indeed obvious, and then all other types are considered non-obvious.

This means that omit_local_variable_types will allow (not force) developers to declare the type of a local variable with an initializing expression e when e has a non-obvious type, but it will continue to flag T v = e; when the type of e is obvious.

An expression e has an obvious type in the following cases:

e is a non-collection literal. For instance, 1, true, 'Hello, $name!'.
e is a collection literal with actual type arguments. For instance, <int, bool>{}.
e is a list literal or a set literal where at least one element has an obvious type, and all elements have the same type. For instance, [1, 2], { [true, false], [] }, but not [1, 1.5].
e is a map literal where at least one key-value pair has a key with an obvious type and a value with an obvious type, and all keys have the same type, and all values have the same type. For instance, { #a: <int>[] }, but not {1: 1, 2: true}.
e is an instance creation expression whose class part is not raw. For instance C(14) if C is a non-generic class, or C<int>(14) if C accepts one type argument, but not C(14) if C accepts one or more type arguments.

Discussion

We may need to add some more cases to the definition of an obvious type, because some other types may actually be obvious to a human observer. However, it is not a big problem that some types are obvious but are not recognized as such, because this means that developers have the freedom to declare a type when they want to do that. If the set of obvious types is enlarged later on, it will be a breaking change, but if the given type is actually obvious then there will be few locations where the new, more strict, version of omit_local_variable_types will flag a declaration, because few developers have actually declared the type in that case. So the changes that are actually breaking are also the ones that may be questionable, in which case the type might not be so obvious after all.

One outcome that we could hope for if omit_local_variable_types is mildened as proposed here is that this lint could become a broader standard (for example, it could be a member of the recommended set of lints), because a larger percentage of the community would be happy about the way it works.

Jun 29 '22 11:06 eernstg

I agree that the results of type inference can sometimes be difficult for a human reader to predict, and that this can significantly impact the user experience. And I agree that adding explicit type annotations is one way to solve that issue by removing the need for the reader to predict the results. My concern with this proposal is that this lint currently enforces the style guide, and I wouldn't want to change the lint without changing the style guide to match.

Jun 29 '22 17:06 bwilkerson

@bwilkerson wrote:

this lint currently enforces the style guide, and I wouldn't want to change the lint without changing the style guide to match.

That would indeed be nice! The whole point with this issue is that it can be counterproductive to ban all declared types on local variables, because some of them are actually helpful (good for code readability and maintainability). I'm just asking that developers are allowed to use good judgment in this matter.

By the way, the current version of omit_local_variable_types does not lint every declaration of the form T v = e; where var v = e; yields T as the inferred type and doesn't change the semantics of e, it only lints some of them:

// Define `f`, `g`, and `A` as before.

int v2 = g(f(A(null))); // No lint.

I don't know exactly where the threshold is, but my proposal is simply to omit the lint in a larger set of cases, such that we only push developers towards var v = e; when the type of e is obvious. The proposed change will never push them in the opposite direction, so anyone who wants to use var v = e; where the type is completely beyond comprehension is free to do so.

I'm afraid we can't say that this is a bug, because the lint is implemented in a way that corresponds pretty well to the documentation and the style guide. So I'll remove the 'bug' label for now. But I'll be very happy if we decide to change the recommended approach slightly as proposed here, and in that case it will become an implementation request.

Jun 29 '22 19:06 eernstg

Sounds good.

@munificent Just FYI.

Jun 29 '22 19:06 bwilkerson

I actually quite strongly prefer the current guidance in Effective Dart. I wouldn't support the lint deviating from it. My arguments are:

Billions of lines of correct code are written and maintained in dynamically typed languages where no variables are annotated. It can't be that necessary to allow redundant type annotations on local variables.
Most statically typed languages have or are moving to inferring locals. It has always been idiomatic in Scala, Kotlin, Go, Swift, and many other languages I'm not thinking of to omit the types on all locals. C#, Java, and even C++ (where type annotations actually change behavior) have moved decidedly towards it.
This has been the well established convention for many years in Dart with very few complaints aside from a small number of people (who seem to disproportionately be on the Dart team). All Google internal Dart code has been expected to omit type annotations for all correctly inferred local variables for years. It's a requirement to earn readability. This has never showed up as a major style complaint.
I have worked on several teams across multiple languages as they have gone from fully type annotated locals to type inferred locals. Every time, I have seen a number of users feel extremely uncomfortable with omitted types. They would eventually grudgingly accept omitting types on "obvious" initializers. Over time their definition of "obvious" would grow and grow until eventually, they get tired of remembering which initializers are obvious and just use it on all of them. I have never seen a team go backwards away from inferred locals towards more types.
My experience is that type annotations on locals tend to encourage worse names for local variables.
Scope of local variables almost always is and certainly should be small. If you can't understand a short function without redundantly type annotating a local variable, you likely have bigger readability problems and should focus on those.
Users will argue about the heuristics for which kind of names are "obvious" until the end of time. (This is why the annotation guideline for non-local variables is deliberately vague and leaves it up to individual taste.)
Trivial refactorings will often change an initializer from "obvious" to "non-obvious". For example, turning a named constructor into a static method with the same name should according to these rules require every local variable that calls that method in its initializer to be fixed to have a type annotation. Conversely, when refactoring a static method to a named constructor, this would require users to go back and remove all of the type annotations.
Allowing redundant type annotations on locals makes it harder to notice non-redundant type annotations. I think it's more valuable to have upcasts stand out visually.

I think we should just leave the lint and rule as they are.

Jul 14 '22 23:07 munificent

@munificent, this is of course a topic where we disagree substantially, so I'll have to comment on it. ;-)

Billions of lines of correct code are written and maintained in dynamically typed languages

But I wouldn't expect you to argue that we should drop all static typing? I thought an argument against this proposal should be something which is specific to local variables.

Most statically typed languages have or are moving to inferring locals

Sure, and I would certainly expect that to be the typical approach in Dart. I even think we should nudge people in that direction in as many cases as possible when the type doesn't provide useful information, that is, when the type is obvious in the initializing expression.

If you can't understand a short function without redundantly type annotating a local variable ..

So what's the type of var y = foo(x);?

No, I'm not going to tell you which type arguments foo accepts, and even the tools aren't going to tell you how type inference proceeded to infer types for that expression. The point is that you need to look it up, and foo is of course in some other package. (OK, you can hover in an IDE, but we do encounter code in other contexts as well.)

So the type is not redundant for a reader of the code, and I don't understand why we need to shout at people who prefer to inform readers about such non-obvious types, in some carefully selected particularly complex situations.

Users will argue about the heuristics for which kind of names are "obvious" until the end of time

Any user who wishes to omit the type of a local variable can freely do so because the lint will never ask for the type to be added, it will only ask to get the type removed (when it's obvious).

So the only case where anyone would want to change the heuristic is when they want to have a type annotation in a case where the lint says that the type is obvious.

I don't think that's very likely to happen, presumably the obvious types are obvious, and will remain so.

turning a named constructor into a static method with the same name should according to these rules require every local variable that calls that method in its initializer to be fixed to have a type annotation

No, we never require a type annotation, we just stop complaining about a type annotation which isn't obvious.

But it is true that the opposite change (static method turns into constructor) would cause the lint to ask for the type annotations to be removed, if any.

harder to notice non-redundant type annotations

That's an interesting point!

It could be useful to be able to detect the cases where the specification of a declared type changes the run-time semantics of the initializing expression (because type inference, coercion, generic function instantiation, etc. have a different outcome).

However, I don't see how a mere upcast would be more valuable to document than a non-obvious type. After all, if we seriously don't care about the type then why is there an upcast in the first place? And if we do care about the type then why must it be impossible to document that type just because a potentially very complex type inference, coercion etc. step produced exactly the same type as the one that we wish to document?

Jul 15 '22 15:07 eernstg

If you can't understand a short function without redundantly type annotating a local variable ..

So what's the type of var y = foo(x);?

In:

bar(foo(x))`;

What's the type of argument passed to bar()?

No, I'm not going to tell you which type arguments foo accepts, and even the tools aren't going to tell you how type inference proceeded to infer types for that expression. The point is that you need to look it up, and foo is of course in some other package. (OK, you can hover in an IDE, but we do encounter code in other contexts as well.)

All of this applies to my example too. We seem to be perfectly happy with not seeing type annotations on subexpressions, so I don't see a very compelling argument for them on local variables.

I don't understand why we need to shout at people who prefer to inform readers about such non-obvious types, in some carefully selected particularly complex situations.

The linter shouts at people for lots of reasons that absolutely don't matter at all when it comes to understanding their code. The answer for that is almost always that consistency has its own value. If we let people redundantly annotate local variables, then readers will be presented with some code that chooses to do that and some code that chooses to not. That difference will get their attention and distract them from other aspects of the code. They may wonder, "Is this an upcast?" (Or worse, it will be an upcast and they won't realize it because they think it's just a redundant annotation.) They may wonder why the author chose to annotate this one and not others nearby. Was there a reason for that, or was it just that one author had one preference and other authors of nearby code had others?

Overall, I haven't seen any compelling evidence that allowing variation here carries its weight.

Any user who wishes to omit the type of a local variable can freely do so because the lint will never ask for the type to be added, it will only ask to get the type removed (when it's obvious).

They can opt out of annotating in code they write, but they can't opt out of being presented with code written by others who made a different choice.

I don't think that's very likely to happen, presumably the obvious types are obvious, and will remain so.

I have literally watched this happen on large teams multiple times over my career, including in Dart.

I don't see how a mere upcast would be more valuable to document than a non-obvious type.

Because it tells the reader that the variable's type is not what they would intuit from the initializer expression.

After all, if we seriously don't care about the type then why is there an upcast in the first place?

So that the variable can be assigned a value later whose type isn't a subtype of the initializer's type. So seeing the type annotation on the local sends a useful signal of "hey this is going to be mutated later to a different type".

And if we do care about the type then why must it be impossible to document that type just because a potentially very complex type inference, coercion etc. step produced exactly the same type as the one that we wish to document?

It's not impossible. This is just a lint. But the lint does reflect what we think leads to the best, most consistently readable code across the ecosystem.

Jul 21 '22 18:07 munificent

In:
bar(foo(x));
What's the type of argument passed to bar()?

[it is not trivial to compute such types]

All of this applies to my example too.

Certainly, but the point is that software engineering quality includes readability and comprehensibility as core elements, and one of the foundational tools used to improve on the readability of a snippet of code is to split up large expressions by evaluating subexpressions and storing their value in local variables.

So let's say that bar(foo(x)) is sufficiently complex to justify some refactorings. The first one would be this:

var name = foo(x); // Where `name` is chosen such that it helps the reader.
... bar(name) ...

One difficulty with this kind of refactoring is that the type inference process works differently when foo(x) has no context type, so we may have to add a declared type to name in order to preserve the semantics:

SomeType name = foo(x);
... bar(name) ...

However, the declared type can be helpful for a reader of the code, because it helps them understand the meaning and purpose of name, and that could make it easier to read and understand expressions like bar(name).

I'm simply saying "let's stop shouting at the developer" in the case where SomeType has been declared for readability reasons.

If we let people redundantly annotate local variables, then readers will be presented with some code that chooses to do that and some code that chooses to not.

This would be a likely outcome if half of the community enables omit_local_variable_type and the other half enables always_specify_types.

What I'm proposing is that the people who agree with me that documenting local types can be good for readability in a few cases can enable omit_local_variable_type, because it won't shout at them when they do want a type annotation.

But in their daily work they'll have lots of little nudges in the direction of omitting that type, because there are so many cases where the type is obvious, and omit_local_variable_type will (rightfully) complain in those cases.

So the few local variables with a type will stand out, and it's a rather safe bet that the developer who wrote it wanted to have the type in that particular case because they needed to think hard about that particular type when they wrote the code, and then it's probably helpful to have it there when we read the code.

So the proposal will very specifically not nudge developers in the direction of writing the types just because that's a habit, each local variable with a type is there because someone actively made that declaration different from most others.

They can opt out of annotating in code they write, but they can't opt out of being presented with code written by others who made a different choice.

Today, those others would simply not enable omit_local_variable_type, and then there's no limit on the number of local variable types in their code.

With this proposal, they might accept enabling omit_local_variable_type because it allows them to use type declarations when they are helpful, and the net result would be that the community as a whole would move in the direction of omitting the declared type most of the time, rather than being more or less divided.

But the lint does reflect what we think leads to the best, most consistently readable code across the ecosystem.

We don't quite agree. ;-)

It just rubs me the wrong way to force developers into lowering the software engineering quality of their code, just because we refuse to give them a little bit of extra freedom to use human judgment when needed.

Jul 22 '22 12:07 eernstg

So far, omit_local_variable_type hasn't made the cut for the recommended set of lints in package:lints because some feel it is too strict and removes type annotations that are actually useful in understanding the code (see also https://github.com/dart-lang/lints/issues/24). I have to read a lot of code in environments (e.g. on github) where I don't have an IDE handy that can infer the types for me - and I certainly appreciate the inclusions of types there. I find that reviews of google-internal code are actually harder to do due to the missing types because I often have to context switch to look up the non-obvious return type of a function.

The proposal in this issue sounds like a nice compromise to me between never specifying types and always specifying types since the latter can get tedious in the obvious cases as well. If a middle ground is implemented, it would probably be easier to get that included in package:lints and push people towards removing type annotations from more lines of code, where they are not hindering readability.

Aug 01 '22 23:08 goderbauer

linter linter copied to clipboard

Make omit_local_variable_type permit non-trivial declared types

Obvious Expression Types

Discussion

linter
linter copied to clipboard