
proposal: Go 2: universal zero value with type inference

Open geraldss opened this issue 5 years ago • 98 comments

I propose a universal zero value with type inference. Currently nil is a zero value with type inference for pointers and built-in reference types. I propose extending this to structs and atomic types, as follows:

{} would represent a zero value when the type can be inferred, e.g. in assignments and function call sites. If I have a function:

func Foo(param SomeLongStructName)

and I wish to invoke Foo with a zero value, I currently have to write:

Foo(SomeLongStructName{})

With this proposal, I could alternatively write:

Foo({})

For assignments currently (not initializations; post-initialization updates):

myvar = SomeLongStructName{}

With this proposal:

myvar = {}

This proposal is analogous to how nil is used for pointers and reference types.

The syntax allows type names and variable types to be modified without inducing extraneous code changes. The syntax also conveys the intent "zero-value" or "default" or "reset", as opposed to the actual contents of the zero value. Thus the intent is more readable.
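For reference, here is a minimal sketch of how the two cases above are written in Go today. The fields of SomeLongStructName and the body of Foo are hypothetical, chosen only so the example runs; the proposed {} forms appear only in comments because they do not compile.

```go
package main

import "fmt"

// SomeLongStructName stands in for the struct in the proposal.
// Its fields are hypothetical.
type SomeLongStructName struct {
	ID   int
	Name string
}

// Foo reports whether it received the zero value (hypothetical body).
func Foo(param SomeLongStructName) bool {
	return param == SomeLongStructName{}
}

func main() {
	myvar := SomeLongStructName{ID: 7, Name: "x"}
	fmt.Println(Foo(myvar)) // false

	// Reset after initialization: the type name must be repeated.
	// Under the proposal this would read: myvar = {}
	myvar = SomeLongStructName{}
	fmt.Println(Foo(myvar)) // true

	// Passing a zero value at a call site.
	// Under the proposal this would read: Foo({})
	fmt.Println(Foo(SomeLongStructName{})) // true
}
```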

geraldss avatar Dec 04 '19 18:12 geraldss

Or perhaps underscore as the zero designator. Either would be readable.

geraldss avatar Dec 04 '19 19:12 geraldss

Related issues: #19642 (now closed), which proposed a universal zero value, and #12854, which would allow type names to be elided in composite literals, enabling all the examples in the first post.

jimmyfrasche avatar Dec 04 '19 19:12 jimmyfrasche

Interesting idea. Sorry to bike shed, but perhaps reusing the default keyword would be more readable?

Foo(default)
myvar = default

beoran avatar Dec 05 '19 14:12 beoran

Foo(default, default, default)
Foo({}, {}, {})
Foo(_, _, _)

I find the latter two more readable, but default or another keyword is also fine with a syntax highlighter. As jimmyfrasche pointed out, the {} and _ syntaxes have been proposed previously.

Here's another argument for the proposal. These calls highlight the values that are being passed, which is good:

Foo(10, "xyz", nil)
Foo({})

This call highlights the type that is being passed, including its fully qualified name. This shifts the cognitive effort.

Foo(SomeLongStructName{})

geraldss avatar Dec 05 '19 14:12 geraldss

I think Foo({}, {}, {}) is more readable than default, IMHO, because 1) default has a few more letters (and, you know, fewer letters -> better code 🙃), and 2) creating a new reserved word is not a good idea: I know many projects with default variables, so it could break a lot of code. Also, the _ symbol is for other things, like /dev/null in the Go universe, so using _ as the empty struct value is not a good idea, I think.

But as a concept for a language proposal, I like your idea.

quenbyako avatar Dec 08 '19 03:12 quenbyako

This seems to be a restatement of #19642 with a different spelling of the zero value. Given that the earlier proposal was not accepted, what has changed since then?

ianlancetaylor avatar Dec 10 '19 22:12 ianlancetaylor

It's not stated why the previous proposal was closed, and I had not seen it when I searched and filed this proposal.

I raised this proposal from direct and repeated experience. In addition to comments in this issue and the previous issues, I'll add another:

There are up to three items of information in a Go expression or assignment: name, type, and value.

  1. Names are inferred using uniform rules across all Go datatypes. That is, names are inferred in assignments, function calls, and return statements, and the inference behavior is uniform across all Go datatypes.

  2. Values are not inferred. This is also uniform.

  3. Type inference is not universal and uniform, and it's not clear to me why that is.

The function calls FooInt32(0), FooInt64(0), FooPtr(nil), FooChan(nil), FooMap(nil) will all infer the argument type correctly. Presumably Golang believes that type inference is beneficial or ergonomic. These could all require explicit typing, e.g. int32(0).

geraldss avatar Dec 11 '19 02:12 geraldss

0 is an untyped constant, as are "", true, and false, and, for that matter "abc", 100, and 1+2i. Untyped constants may be used with any compatible type. If there is no compatible type, as in a := 0, they have a default type.

nil is the zero value of pointer, slice, channel, map, and function types. It is not an untyped constant: a := nil is an error. nil is in effect an overloaded term for the zero value of certain types. This overloading is problematic; see #22729. Note that for the types with which nil can be used, there is no other way to write the zero value.
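A small sketch of the distinction described above, using only standard Go. The helper name typeNames is made up for this example. The same literal 0 takes on whatever compatible type the context asks for and defaults to int otherwise, while nil has no default type at all.

```go
package main

import (
	"fmt"
	"time"
)

// typeNames returns the types that the literals below end up with.
func typeNames() string {
	a := 0                     // no type context: untyped constant defaults to int
	var d time.Duration = 0    // same literal, converted to time.Duration
	var f float64 = 0          // same literal, converted to float64
	var p *int = nil           // nil takes its type from the context
	var m map[string]int = nil // nil again, different type
	// b := nil                // does not compile: nil has no default type
	return fmt.Sprintf("%T %T %T %T %T", a, d, f, p, m)
}

func main() {
	fmt.Println(typeNames()) // int time.Duration float64 *int map[string]int
}
```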

This proposal, and #19642, is something else again. It proposes a way of writing a value that can be converted to the zero value in a type context. Writing a := {} would be an error. But we could write F({}), a = {} (for an already defined a), a == {}, return {}. For ordered types we could write a > {}. And while {} could be used with any type, it would always be an alias for the actual zero value of that type (0, false, nil, S{}, etc.).

You could presumably write 0 == {}, which would always be true: the 0 would have no type context so it would default to int, at which point {} would default to the value 0 in type int. Maybe you could write {} == 0. I'm not sure. I'm also not sure about 1 + {} and {} + 1. Or "a" + {} and {} + "a".

So I don't agree with your suggestion that there is some missing aspect to type inference. Untyped constants, nil, and {} are three different kinds of things.

ianlancetaylor avatar Dec 11 '19 05:12 ianlancetaylor

Per your comment, untyped constants do support type inference, and overloaded nil does support type inference (issue with nil interfaces noted).

The net effect of this is that type inference is neither uniform nor universal across data types. This is the impetus for my proposal and the earlier proposals. I also like #12854, and would consider any of these a positive step.

geraldss avatar Dec 11 '19 05:12 geraldss

I think we must mean different things by "type inference". I tried to describe exactly how untyped constants and nil behave, to show that they are different from each other. I agree that if you describe both untyped constants and nil as "type inference", then "type inference" is neither uniform nor universal across data types. But I don't see how this proposal changes that fact.

ianlancetaylor avatar Dec 11 '19 05:12 ianlancetaylor

By type inference, I mean the omission of the type name in the text of the value.

Your example of a == {} is interesting. I write these all the time:

if ptr == nil
if v == 0

Would be useful to write

if v == {}

where type of v is SomeLongStruct.

This proposal says that {} is treated uniformly as the zero value in all contexts where type can be inferred / determined. That seems uniform and universal. The concept of "zero value" is already universal, i.e. defined for all types.

geraldss avatar Dec 11 '19 06:12 geraldss

OK, omitting the type in the text of the value is what I would call an implicit conversion. Untyped constants support an implicit conversion to a set of related types, and also support an implicit conversion to a default type. The value nil supports an implicit conversion to any pointer, slice, etc., type. This proposal is suggesting that the value {} support an implicit conversion to any type.

Another case where implicit conversion occurs in Go is that any type that implements an interface type may be implicitly converted to that interface type.
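To illustrate the interface case, here is a minimal sketch. The Celsius type and the describe function are invented for the example. The value c is passed where an fmt.Stringer is expected without any explicit conversion at the call site.

```go
package main

import "fmt"

// Celsius is a hypothetical type that implements fmt.Stringer.
type Celsius float64

func (c Celsius) String() string { return fmt.Sprintf("%.1f°C", float64(c)) }

// describe accepts any value that implements fmt.Stringer.
func describe(s fmt.Stringer) string { return s.String() }

func main() {
	var c Celsius = 0 // untyped constant 0 implicitly converted to Celsius

	// c is implicitly converted from Celsius to the interface type fmt.Stringer.
	fmt.Println(describe(c)) // 0.0°C
}
```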

ianlancetaylor avatar Dec 11 '19 15:12 ianlancetaylor

A better way to do this (in my opinion) would be to allow for constant struct expressions, which would hopefully include "untyped struct literals". #21130 gets close to this but isn't very specific; I might try to type up something a little more formal.

deanveloper avatar Dec 13 '19 19:12 deanveloper

Const-ness is orthogonal to type inference.

geraldss avatar Dec 13 '19 19:12 geraldss

Untyped constants are not, however. What I am proposing is that we should be able to do var x MyStruct = {} just as we can do var y time.Duration = 0

deanveloper avatar Dec 13 '19 19:12 deanveloper

I think #12854 and #21182 would fill most of the gaps where this hurts in most code. Comparing a struct to its zero would still be a little awkward with this proposal or #12854 since you'd need to write if v == ({}) {.

Generating code or, in the future, writing generic code that uses zero values is still going to be awkward, as you don't know which form the zero value takes, though #21182 would knock out the most painful case.

You can always do var zero T but that gets a little awkward if you need zeros for more than one type in the same scope. You can avoid naming the zeros and use the expression *new(T) but that's a bit weird looking, especially since new isn't used that much.
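The two spellings mentioned above can be sketched as follows. This uses the type parameter syntax that shipped in Go 1.18, which is an assumption relative to the thread (it predates that release); the names ZeroVar, ZeroNew, and point are made up for the example.

```go
package main

import "fmt"

// ZeroVar produces the zero value of T through a named variable.
func ZeroVar[T any]() T {
	var zero T
	return zero
}

// ZeroNew produces the zero value of T without naming it.
func ZeroNew[T any]() T {
	return *new(T)
}

// point is a hypothetical struct type used for illustration.
type point struct{ X, Y int }

func main() {
	fmt.Println(ZeroVar[int]())          // 0
	fmt.Println(ZeroNew[string]() == "") // true
	fmt.Println(ZeroVar[point]())        // {0 0}
	fmt.Println(ZeroNew[[]int]() == nil) // true
}
```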

In most cases, you could probably get away with generalizing and having the user pass in a value, zero or not: for example, writing Filter(type T)(s []T, v T) []T instead of RemoveZeros(type T)(s []T) []T.
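A rough sketch of that generalization, written with the Go 1.18 type parameter syntax rather than the draft (type T) syntax used in the thread. The assumed semantics, that Filter drops every element equal to v, are a guess at the intent and not spelled out in the comment.

```go
package main

import "fmt"

// Filter returns the elements of s that are not equal to v.
// The caller supplies the value to remove, so the function never
// needs to spell the zero value of T itself.
func Filter[T comparable](s []T, v T) []T {
	out := make([]T, 0, len(s))
	for _, e := range s {
		if e != v {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	fmt.Println(Filter([]int{0, 1, 0, 2}, 0))         // [1 2]
	fmt.Println(Filter([]string{"a", "", "b"}, ""))   // [a b]
}
```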

In generic code, comparing to zero also has a little wrinkle in that some incomparable types have a special case for comparing against zero that can't be matched in type constraints where you can only specify comparable or not. If there were some universal zero value, then #26842 could be accepted since there would always be a way to write a value statically guaranteed to be all-bytes zero. But, if that's the only major case left and it would still be awkward to see if comparable structs are zero, maybe it would suffice to have a predeclared func zero(type T)(T) bool that worked on comparable and incomparable types alike?

jimmyfrasche avatar Dec 22 '19 23:12 jimmyfrasche

Yes, it's possible to do less. But I haven't seen any argument for why less is more in this case, or any downside to the universal zero.

geraldss avatar Dec 25 '19 04:12 geraldss

Let's consider what we can do with a specific, typed zero value, var zero T:

  1. Reset a variable to zero: v = zero
  2. Define a new variable: u := zero
  3. Send it to a channel: c <- zero
  4. If T is comparable, compare another variable to it: v == zero
  5. If T has operators, use it as an operand: v < zero or v + zero
  6. Use it in a composite literal: []T{u, zero, v}
  7. Call a method on it: zero.M()
  8. Return it from a function: return zero, err

If we had a universal zero value, then defining a new variable and calling a method are out, as a specific type is required for each. Using it as an operand isn't really a problem since any type with operators already has a concise zero value.

That leaves:

  1. Reset a variable to zero: v = zero
  2. Send it to a channel: c <- zero
  3. If T is comparable, compare another variable to it: v == zero
  4. Use it in a composite literal: []T{u, zero, v}
  5. Return it from a function: return zero, err

For the majority of these, there's only really a problem if T is composite, as they have verbose zero values. Use in a composite literal is only sometimes an issue as the types of composite literals in composite literals can be elided in a number of cases. #12854 could expand elision to all the remaining cases and allow you to write return {}, f({}) for example. This would also allow quite a bit more since you could also write c <- {k: v} or f({}, {X: 1}, {2, 3}).

For comparable T, comparison against zero would still have the issue that we could write

p := v == {}
if p { // ...

but we couldn't write

if v == {} { // ...

due to the ambiguity and we would instead have to write

if v == ({}) { // ...

All of this assumed that we knew upfront what T is. That goes away when generating code or (hopefully soon) writing generic code. Even if every type has a concise zero value, we will not necessarily know which one to use, unless the contract of the type parameter is sufficiently strict.

The most common case would be returning some zero values and an error. #21182 would allow that and also improve the readability and editability of non-generated/generic code as a bonus.

That leaves us with a different set of possible problems:

  1. Reset a variable to zero: v = zero
  2. Send it to a channel: c <- zero
  3. If T is comparable, compare another variable to it: v == zero
  4. If T has operators, use it as an operand: v < zero or v + zero
  5. Use it in a composite literal: []T{u, zero, v}

A universal zero value would be useful here, but I think the majority of these will be relatively uncommon, though I could be wrong. A good way to make a case for this proposal would be to write reasonable generic code using the latest generics draft that is very awkward without a universal zero. Finding code generators that have a lot of special cases or past/known bugs because of this would be another.

The one that seems like it would be most likely to cause problems is the split between incomparable types that are totally incomparable versus those that can be compared against nil (#26842). If there were a universal zero value, all types, comparable or not, could be compared against it regardless of the specificity of the type constraints. It would also help to avoid the ambiguity when comparing struct values to their zero. But if it's just this one case that's left over, that predeclared zero predicate would suffice.

jimmyfrasche avatar Dec 25 '19 23:12 jimmyfrasche

As a detail, I don't see any ambiguity with

if v == {}

Every binary operator requires expressions on both sides, not statement blocks.

geraldss avatar Jan 04 '20 19:01 geraldss

That's true. I was thinking about how you have to write if v == (T{}) { but you have to do that because of the T not the {}.
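The parsing rule in question can be seen in current Go. In the sketch below, T is a hypothetical two-field struct: the composite literal needs parentheses in an if header, but not in an ordinary expression.

```go
package main

import "fmt"

// T is a hypothetical comparable struct type.
type T struct{ A, B int }

// isZero compares v against the zero value of T.
func isZero(v T) bool {
	// Outside an if/for/switch header the literal needs no parentheses.
	p := v == T{}

	// Writing `if v == T{} {` does not parse: the parser reads `T {`
	// as the start of the if body. Parentheses resolve it.
	if v == (T{}) {
		return p
	}
	return p
}

func main() {
	fmt.Println(isZero(T{}))     // true
	fmt.Println(isZero(T{A: 1})) // false
}
```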

jimmyfrasche avatar Jan 05 '20 01:01 jimmyfrasche

type T = func()

func Default(a, b T) T {
	var zero T
	if a != zero {
		return a
	}
	return b
}

This code doesn't compile because zero is a variable, not a constant, so it gives error "invalid operation: a != zero (func can only be compared to nil)". const zero T doesn't work because "const declaration cannot have type without expression". If default meant "zero value for type", you could write a != default and the code would work.

This doesn't matter much now, but in a world with generics, not being able to write (type T) IsZero(t T) bool would be a pain.
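For contrast, a version of the function above that compiles today has to spell the zero value as nil, which only works because T is known to be a function type:

```go
package main

import "fmt"

type T = func()

// Default returns a unless it is nil, in which case it returns b.
func Default(a, b T) T {
	// Function values may only be compared to the literal nil,
	// not to a variable holding the zero value.
	if a != nil {
		return a
	}
	return b
}

func main() {
	hello := func() { fmt.Println("hello") }
	fallback := func() { fmt.Println("fallback") }

	Default(nil, fallback)()   // prints "fallback"
	Default(hello, fallback)() // prints "hello"
}
```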

earthboundkid avatar Mar 01 '20 03:03 earthboundkid

@carlmjohnson there's also #26842. Consider type T = struct { f func() }. A universal zero value wouldn't help with that unless it was also allowed to be compared against universally. Another way to solve that problem would be to make a function like IsZero a builtin.
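Until something like that exists in the language, one runtime workaround is reflection. The sketch below wraps reflect.Value.IsZero (in the standard library since Go 1.13); the wrapper name IsZero is chosen here for illustration and is not the builtin being discussed. It is checked at run time rather than compile time, so it is not a substitute for the proposal.

```go
package main

import (
	"fmt"
	"reflect"
)

// T is not comparable with ==, because it contains a func field.
type T = struct{ f func() }

// IsZero reports whether v holds the zero value of its dynamic type.
// It panics if v is a nil interface, as reflect.Value.IsZero does
// for an invalid Value.
func IsZero(v any) bool {
	return reflect.ValueOf(v).IsZero()
}

func main() {
	fmt.Println(IsZero(T{}))               // true
	fmt.Println(IsZero(T{f: func() {}}))   // false
}
```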

jimmyfrasche avatar Mar 01 '20 16:03 jimmyfrasche

Currently the language permits writing a simple expression, without specifying a type, for the zero value of most types: 0 for numeric types, "" for string types, nil for function, pointer, interface, channel, slice, and map types, false for boolean types. The exception is structs and arrays.

This raises the possibility of, rather than inventing a generic zero value, extending nil to be usable with struct and array types. Then nil would be the zero value for any composite type, which could arguably be a simplification of the spec.

The idea here is that we could assign nil to a variable of struct or array type, which would mean to zero all the elements. And we could compare a value of struct or array type to nil, which would report whether the value were the zero value.
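For comparison, this is roughly how those two operations look for an array type in current Go. The grid type and helper names are invented for the example; the nil forms from the suggestion appear only in comments because they do not compile today.

```go
package main

import "fmt"

// grid is a hypothetical array type.
type grid [3]int

// reset zeroes every element of *g.
// Under the suggestion this could read: *g = nil
func reset(g *grid) {
	*g = grid{}
}

// isZeroGrid reports whether every element of g is zero.
// Under the suggestion this could read: return g == nil
func isZeroGrid(g grid) bool {
	return g == grid{}
}

func main() {
	g := grid{1, 2, 3}
	fmt.Println(isZeroGrid(g)) // false
	reset(&g)
	fmt.Println(isZeroGrid(g)) // true
}
```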

ianlancetaylor avatar Mar 31 '20 21:03 ianlancetaylor

Extending nil to all composites would address the convenience issues.

However, a universal zero would work for all types, and should have additional benefits for tooling, generics, etc. Also, a universal zero always compiles as an assigned value or an argument, hence callers / assigners are protected from type changes to variables they don't care about.

Either option would impose some cognitive change on golang developers, so the question is how desirable is universality / generality. As a reference point, I think the universal underscore serves golang 1.x very well.

geraldss avatar Mar 31 '20 22:03 geraldss

You could also extend nil to any type and lint against using it with a known type with a "better" zero value like numbers and strings. That would let generic/generated code use nil for the zeroes of unknown types.

jimmyfrasche avatar Mar 31 '20 22:03 jimmyfrasche

Yes, the spelling of the universal zero is mostly aesthetic, unless we want the zero for structs to coincide with partially valued structs.

geraldss avatar Mar 31 '20 22:03 geraldss

For basic types Go already has special syntax for their values: We can write numbers (incl. 0) for numeric types, strings (incl. "") for string types, true and false for boolean types. With the exception of false, the respective zero values for these types tend to be short (shorter than nil or zero) and evocative.

Thus, at least with existing Go I don't see a good reason for introducing an alternative way of writing those zero values differently. That may change when we have generics where we may want to introduce a zero value that can be written in a type-independent way.

But I do like @ianlancetaylor's idea of generalizing nil to all composite types. It will take a bit of getting used to, but as @bradfitz has pointed out (verbally, during the proposal review mtg), we already use nil as the zero value for a slice, and a slice is basically a struct with three fields (pointer to underlying array, length, and capacity). It's really a small step to generalize this and it would be nice to be able to write in the spec that nil is simply the zero value for all composite types.
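The slice precedent can be observed directly. In this sketch (the describe helper is made up), nil is the zero value of a slice: all three conceptual fields are zero, and the result differs from an empty but non-nil slice.

```go
package main

import "fmt"

// describe summarizes whether s is nil, plus its length and capacity.
func describe(s []int) string {
	return fmt.Sprintf("nil=%t len=%d cap=%d", s == nil, len(s), cap(s))
}

func main() {
	var s []int
	fmt.Println(describe(s)) // nil=true len=0 cap=0

	s = append(s, 1)
	fmt.Println(s == nil) // false

	s = nil // reset to the zero value
	fmt.Println(describe(s)) // nil=true len=0 cap=0

	fmt.Println(describe([]int{})) // nil=false len=0 cap=0
}
```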

This should be a backward-compatible change. In generic code we might go the extra step and permit nil as the zero value for all variables of generic type.

griesemer avatar Mar 31 '20 23:03 griesemer

Generated code exists today, will exist after generics, and can get annoying when zeroes are involved. You either need to write it in an unnatural manner or figure out which zero value to use based on type analysis, but at least the latter would be reduced to selecting from {0, "", false, nil}. Allowing nil universally would fix that and the corresponding future issues with generic code. I would trust authors to avoid using nil when there was a better candidate just because it's simpler to write and reads better. And it is easily machine-checkable so the occasional slip up would be trivial to lint without issue as generated code is not linted.

Still, just allowing nil for composites would be a very nice improvement in non-generated/non-generic code and would allow addressing #26842 so :+1: even if I think it should go further.

jimmyfrasche avatar Apr 01 '20 00:04 jimmyfrasche

Extending nil to be assignable to more types (or even all types) sounds reasonable to me.

mdempsky avatar Apr 01 '20 01:04 mdempsky

0 for numeric types, "" for string types, nil for function, pointer, interface, channel, slice, and map types, false for boolean types. The exception is structs and arrays.

The fact that these all vary depending on the underlying type suggests to me that structs and arrays deserve their own different zero expression instead of overloading the meaning of nil. Or you could double down and allow nil to be assigned to any type, setting numerics to zero, strings to the empty string, etc.

fogleman avatar Apr 01 '20 01:04 fogleman