csharplang
Champion: "Partial Type Inference"
- [ ] Proposal added
- [ ] Discussed in LDM
- [ ] Decision in LDM
- [ ] Finalized (done, rejected, inactive)
- [ ] Spec'ed
See #1348
Currently, in an invocation of a generic method, you either have to specify all of the type arguments or none of them, in the latter case depending on the compiler to infer them for you. The proposal is to permit the programmer to provide some of the type arguments, and have the compiler infer only those that were not provided.
There are (at least) three possible forms this could take:
- Named type arguments (#280, #1348), e.g.
M<TElement: int>(args)
- Omitted type arguments separated by commas, e.g.
M<int, >(args)
- Types to be inferred specified by `var` (#1348), e.g.
M<int, var>(args)
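To make the starting point concrete, here is a minimal sketch of the all-or-nothing rule as it stands today. `ConvertTo` is a hypothetical helper invented for illustration: `TFrom` could be inferred from the argument, but `TTo` appears only in the return type, so neither the fully inferred nor a partially specified call compiles. The three proposed spellings appear only in comments because none of them is valid C# today.

```csharp
using System;

public static class PartialInferenceToday
{
    // Hypothetical helper: TFrom is inferable from the argument, TTo is not.
    public static TTo ConvertTo<TFrom, TTo>(TFrom value)
        => (TTo)Convert.ChangeType(value, typeof(TTo));

    public static void Main()
    {
        // Today every type argument must be written, even the obvious TFrom.
        long a = ConvertTo<int, long>(42);
        Console.WriteLine(a);

        // Neither of these compiles today:
        // ConvertTo(42);        // TTo cannot be inferred from the usage
        // ConvertTo<long>(42);  // wrong number of type arguments

        // Spellings discussed in this proposal (not valid C# today):
        // ConvertTo<TTo: long>(42);   // form 1: named type argument
        // ConvertTo<, long>(42);      // form 2: omitted type argument
        // ConvertTo<var, long>(42);   // form 3: var placeholder
    }
}
```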
Design Meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-02-07.md#partial-type-inference
Pros/cons for 2 vs 3?
Both are DRY but I prefer 2... you don't buy much visually with 3.
1 is okay but verbose. I prefer 2 because I would like this potential progression in levels of needed disambiguation for `Foo<TKey, TValue>`, assuming `Foo<>` or `Foo` also existing or not existing: `Foo<TKey,>`, `Foo<,>`, `Foo`.
I'm pretty sure there was an issue regarding recursive generics, but I can't find it. Without recursive type parameters, I think 2 is better because it is less verbose, and verbosity is part of the issue here. But with recursive type parameters, I think 1 would fit better.
I think that either 2 or 3 is the best one. 2 is certainly the least verbose (which is nice), but 3 looks more complete and full. I don't think the slight verbosity of 3 would really be an issue because it still requires no thinking of what the type name is, and is only three keystrokes.
I slightly prefer 3, but 2 is also perfectly fine. 1 is a bit too verbose.
I support 1 even with the inclusion of 2 or 3. It helps to document the parameters.
I believe F# uses `_`:
> open System.Collections.Generic;;
> let s = [ 1; 2; 3 ] :> IEnumerable<_>;;
val s : IEnumerable<int> = [1; 2; 3]
> let l = List<_>([1; 2; 3]);;
val l : List<int>
(So we could consider using it too.)
Isn't 2 already used in other constructs?
typeof(IDictionary<,>)
@qrli `typeof(IDictionary<,>)` gets you an open generic type. However, this proposal is about inferring fixed type parameters, so I think symmetry is not necessary.
@zippec `typeof(IDictionary<string,>)` is also a valid open generic.
@qrli Say what? `Type` objects cannot have partially-open generics.
@jnm2 IIRC `typeof` expressions cannot have partially-open generics, but you can construct such `Type`s via reflection.
@federicoce-moravia I don't believe either is possible. Can you show me?
Btw, closing over `string, TValue` is still 100% closed, not partially closed. There is an important difference between an open generic and a generic that is closed over `TKey, TValue`.
@jnm2
You can do it by grabbing the generic type parameter from the open generic type:
Type type = typeof(Dictionary<,>);
Type[] args = type.GetGenericArguments();
Type type2 = type.MakeGenericType(typeof(string), args[1]);
@HaloFour That's `typeof(Dictionary<string, TValue>)`, which is completely closed over those two types. For example.
@jnm2
It's something that cannot be expressed in C#. It is `typeof(Dictionary<string,>)` because the second type parameter is still the open generic type argument from the open generic type `Dictionary<,>`. The `Type.ContainsGenericParameters` property returns `true` (as it does with open generic types) and you cannot instantiate that type.
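A self-contained sketch of the reflection snippet above, for anyone who wants to check the claim locally. The class and method names are made up for illustration. The exception type is printed, not assumed, so the program reports whatever the runtime actually throws when asked to instantiate the type.

```csharp
using System;
using System.Collections.Generic;

public static class PartiallyConstructedDemo
{
    // Builds Dictionary<string, TValue>, where TValue is the type parameter
    // taken from the generic type definition Dictionary<,>.
    public static Type MakePartiallyConstructed()
    {
        Type open = typeof(Dictionary<,>);
        Type[] parameters = open.GetGenericArguments(); // [TKey, TValue]
        return open.MakeGenericType(typeof(string), parameters[1]);
    }

    public static void Main()
    {
        Type partial = MakePartiallyConstructed();

        Console.WriteLine(partial);
        Console.WriteLine(partial.IsGenericTypeDefinition);
        Console.WriteLine(partial.ContainsGenericParameters);

        try
        {
            Activator.CreateInstance(partial);
            Console.WriteLine("instantiated");
        }
        catch (Exception ex)
        {
            // Report the failure instead of assuming a specific exception type.
            Console.WriteLine("cannot instantiate: " + ex.GetType().Name);
        }
    }
}
```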
@HaloFour I understand that it is inexpressible in C#. Here's the dilemma: it is fully constructed to close over `typeof(Foo<TValue, TKey>)`. Why should it not also be fully constructed closing over `typeof(Foo<TValue, TValue>)` or `typeof(Foo<TKey, TValue>)`?
using System;

public static class Program
{
    public static void Main()
    {
        new Foo<string, int>().CloseForwards();
        new Foo<string, int>().CloseBackwards();
    }
}

public class Foo<TKey, TValue>
{
    public void CloseForwards()
    {
        // This does the equivalent of:
        _ = typeof(Foo<TKey, TValue>);

        var openGeneric = typeof(Foo<,>);
        var fooTypeArguments = openGeneric.GetGenericArguments();
        var closeOverFooTypeArguments = openGeneric.MakeGenericType(fooTypeArguments[0], fooTypeArguments[1]);

        Console.WriteLine(openGeneric == closeOverFooTypeArguments); // true (!)
        Console.WriteLine(closeOverFooTypeArguments.IsConstructedGenericType); // false (!)
    }

    public void CloseBackwards()
    {
        // This does the equivalent of:
        _ = typeof(Foo<TValue, TKey>);

        var openGeneric = typeof(Foo<,>);
        var fooTypeArguments = openGeneric.GetGenericArguments();
        var closeOverFooTypeArguments = openGeneric.MakeGenericType(fooTypeArguments[1], fooTypeArguments[0]);

        Console.WriteLine(openGeneric == closeOverFooTypeArguments); // false
        Console.WriteLine(closeOverFooTypeArguments.IsConstructedGenericType); // true
    }
}
So I'm not talking about being able to instantiate from a separate class. I'm talking a level more abstract: what can you specify using `typeof` inside the definition of the type which declares `TKey` and `TValue`?
@jnm2
I think we're walking in a weird gray area not well handled by the BCL and more academic than practical.
The documentation for `IsConstructedGenericType` states the following:
Gets a value that indicates whether this object represents a constructed generic type. You can create instances of a constructed generic type.
Whereas the remarks for `ContainsGenericParameters` state the following:
Since types can be arbitrarily complex, making this determination is difficult. For convenience and to reduce the chance of error, the ContainsGenericParameters property provides a standard way to distinguish between closed constructed types, which can be instantiated, and open constructed types, which cannot. If the ContainsGenericParameters property returns true, the type cannot be instantiated.
Per the documentation it would seem that it wouldn't be legal for those two properties to both return `true`. I'd almost call it a bug.
Either way, I don't think this should hold up any conversation about a placeholder syntax (or lack thereof) to denote a generic type argument that is inferred.
> Either way, I don't think this should hold up any conversation about a placeholder syntax (or lack thereof) to denote a generic type argument that is inferred.
Certainly not. 😁
@gafter Is omitting the type args altogether another option? F# supports the syntax `let d = Dictionary()` for a generic dictionary.
@FunctionalFirst We infer the type arguments to methods, not to types. This is about method type argument inference.
@gafter You mean this doesn't cover constructors? I think there are a few other proposals about that and type inference in general (for instance: https://github.com/dotnet/csharplang/issues/92). I hope this can lead to considering those as well.
Any thoughts on methods that have more than 2 type parameters and omitting multiple trailing ones? My personal feeling: no, since they're not optional like default parameters.
With `M<T1, T2, T3, T4>(...)`, is any of the following permitted?
- `M(...)` (no angle brackets at all; we can use this today if the type parameters can be fully inferred)
- `M<>(...)` (no mention of any type parameters; not sure if this has any use whatsoever)
- `M<int>(...)` (`T1` is `int`, infer all the rest. Chances are we meant the first parameter, but did we really?)
- `M<int, >(...)` (`T1` is `int`, infer all the rest. Looks more like we meant the first parameter, but would probably be ambiguous if there are multiple overloads with a different number of type parameters)
- `M<T1: int>(...)` (`T1` is `int`, infer all the rest. Specific, concise, but again the potential overload issue)
- `M<int,,,>(...)` (`T1` is `int`, infer all the rest, empty space with a comma for each of them. Number and placement of parameters is clear)
- `M<int, var, var, var>(...)` (`T1` is `int`, infer all the rest, `var` for non-specified ones. Number and placement of parameters is clear)
Also:
- `M<T1: int, T4: string>(...)` (`T1` is `int`, `T4` is `string`, infer all the rest. Potential overload issue for multiple candidates)
- `M<int,,, string>(...)` (`T1` is `int`, `T4` is `string`, infer all the rest, empty space with commas. Probably won't work without the commas for position indication)
- `M<int, var, var, string>(...)` (`T1` is `int`, `T4` is `string`, infer all the rest, `var` for non-specified ones. More verbose than the previous, but nicer than just "empty" commas)
For that, I'm leaning more towards 3 than 2 (while 2 is the current way of specifying an open generic with `typeof`). Multiple commas next to each other with no real content don't resonate with me at all, even though the `var` in there makes it pretty verbose.
I'm personally not a friend of named arguments (mostly because I'm not a friend of default parameters; they've far too often bit me in the behind in shared code bases, but that's not the discussion here), but I can see merit for 1 when the number of generic parameters reaches a certain number (which might as well be some other indication of code-smell or similar, with very few exceptions)
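The overload concern raised for `M<int>(...)` can be seen with today's rules. In the sketch below (the two `M` overloads are hypothetical), an explicit type argument list restricts the candidates to methods with that many type parameters, so `M<int>(...)` already has a meaning. Reading it as "T1 is int, infer the rest" would therefore compete with existing overload resolution.

```csharp
using System;

public static class OverloadArity
{
    // Two hypothetical overloads that differ in the number of type parameters.
    public static string M<T1>(T1 a, object b) => "M<T1>";
    public static string M<T1, T2>(T1 a, T2 b) => "M<T1, T2>";

    public static void Main()
    {
        // No type arguments: both overloads are candidates after inference,
        // and the exact match on the second argument makes M<T1, T2> better.
        Console.WriteLine(M(1, "x"));      // prints "M<T1, T2>"

        // One explicit type argument: only the one-parameter overload applies.
        Console.WriteLine(M<int>(1, "x")); // prints "M<T1>"
    }
}
```

A positional placeholder such as `M<int, >(...)` or `M<int, var>(...)` keeps the number of type arguments explicit, which is what lets it pick out the two-parameter overload without ambiguity.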
It's great that the language team is considering improving type inference in C#. This is something I've been waiting on for a long time.
Basically I think C# should ~~just copy~~ get inspiration from F# type inference. I mean local variable inference within a function's body, not full Hindley-Milner inference of function argument types.
Type inference in F# was quite intuitive for me; I just discovered it naturally. C# only recently got the `_` wildcard, so it hasn't been used that often, and this might not be as easily discoverable as in F#, where you expect things to just get inferred and sometimes put `_` to help the compiler.
Anyway, I like F# syntax a lot, and for me it's good if similar features have similar syntax in both languages.
Partial type inference
F#:
let testGeneric<'T1,'T2>(arg1 : 'T1) =
    printfn "T1: %s T2: %s" (typeof<'T1>).FullName (typeof<'T2>).FullName
    () //returns nothing
testGeneric<_, string>(1)
result printed:
T1: System.Int32 T2: System.String
The inspired syntax in C# could look like that:
class Program
{
    static void testGeneric<T1, T2>(T1 arg1)
    {
        Console.WriteLine("T1: {0} T2: {1}", typeof(T1).FullName, typeof(T2).FullName);
    }

    static void Main(string[] args)
    {
        //doesn't work but could (now I need to write testGeneric<int, string>(1);)
        testGeneric<_, string>(1);
    }
}
Because some people mentioned other possible type inference features I'll try to show how F# type inference could work in C#.
Initialization based on later methods call
F#:
let list = new System.Collections.Generic.List<_>()
list.Add(1)
list.Add(2)
let list2 = System.Collections.Generic.List()
list2.Add(1)
list2.Add(2)
//list2.Add("foo")
// error if uncommented:
// This expression was expected to have type 'int' but here has type 'string'
let list2 = System.Collections.Generic.List<_>()
// Error:
// Value restriction. The value 'list2' has been inferred to have generic type
// val list2 : Collections.Generic.List<'_a>
// Either define 'list2' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.
`list` is inferred to be `Collections.Generic.List<int>`.
Notice the errors when you call generic methods with an object of a different type, or don't call any method that helps the compiler know what to infer.
Hypothetical C# syntax:
// Should be the same syntax (just use `var` instead of let and don't skip `new` keyword)
Constructors type inference
Type inference for calling constructors should be the same as for calling a function.
F# syntax:
let arrayOfNumbers = [| 1;2;3;4|]
let listOfNumbers = new System.Collections.Generic.List<_>(arrayOfNumbers);
//or don't use "new" (more common in F#) and skip passing wildcard ("_") as type parameter
let listOfNumbers2 = System.Collections.Generic.List(arrayOfNumbers);
`listOfNumbers` and `listOfNumbers2` have type `List<int>`.
Hypothetical C# syntax:
var arrayOfNumbers = new[] { 1, 2, 3, 4 };
var listOfNumbers = new System.Collections.Generic.List(arrayOfNumbers);
It's very annoying to have to specify this type in the constructor. For example, why can't I write this?
var listOfAnonymousTypes = new[] { 1, 2, 3, 4 }.Select((number, index) => new { number, index });
var hashSetOfAnonymousTypes = new HashSet(listOfAnonymousTypes);
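Because C# infers type arguments for methods but not for constructors, the usual workaround today is to route construction through a generic method. The sketch below uses a hypothetical `SetFactory.Create` helper (not a BCL API) to build a `HashSet` of an anonymous type, which cannot be named in a `new HashSet<...>(...)` expression.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetFactory
{
    // Hypothetical helper: T is inferred from the argument, unlike a constructor call.
    public static HashSet<T> Create<T>(IEnumerable<T> items) => new HashSet<T>(items);
}

public static class Program
{
    public static void Main()
    {
        var items = new[] { 1, 2, 3, 4 }.Select((number, index) => new { number, index });

        // T is the anonymous type; method type inference fills it in.
        var set = SetFactory.Create(items);

        Console.WriteLine(set.Count); // 4
        // Anonymous types compare by value, so an equal instance is found.
        Console.WriteLine(set.Contains(new { number = 1, index = 0 })); // True
    }
}
```

For this particular case the `ToHashSet()` extension method in `System.Linq` does the same job. The constructor inference asked for here would make such helpers unnecessary.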
Inference from Object and Collection Initializers
F# doesn't have the same Object and Collection Initializers syntax as C#.
However, F# has similar syntax to object initializer when initializing immutable record objects and we can also initialize mutable properties of class similar to how we set properties for attributes in C# (link)
F# syntax (record):
type SomeTypeWithImmutableProperties<'T1,'T2> = {
    Foo1 : 'T1;
    Bar2 : 'T2;
}
let immutableObject = { Foo1 = 1; Bar2 = "bar" }
The type of `immutableObject` is `SomeTypeWithImmutableProperties<int,string>`.
F# syntax (class):
type SomeClassWithMutableProperties<'T1,'T2>() =
    member val Foo3 : 'T1 = Unchecked.defaultof<'T1> with get, set
    member val Bar3 : 'T2 = Unchecked.defaultof<'T2> with get, set
let someMutableObject1 = SomeClassWithMutableProperties(Foo3 = 1, Bar3 = "bar");
let someMutableObject2 = SomeClassWithMutableProperties<_,_>(Foo3 = 1, Bar3 = "bar");
`someMutableObject1` and `someMutableObject2` are both of type `SomeClassWithMutableProperties<int,string>`.
Hypothetical C# syntax:
class SomeClassWithMutableProperties<T1,T2>
{
    public T1 Foo { get; set; }
    public T2 Bar { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        var someObject1 = new SomeClassWithMutableProperties
        {
            Foo = 1,
            Bar = "bar"
        };
        var someObject2 = new SomeClassWithMutableProperties<_,_>
        {
            Foo = 1,
            Bar = "bar"
        };
    }
}
C# also has nice syntax for collection initializers. Why not allow this code to work?
var list1 = new List { 1, 2, 3 };
var list2 = new List() { 1, 2, 3 };
var list3 = new List<_> { 1, 2, 3 };
var list4 = new List<_>() { 1, 2, 3 };
Or with index initializers it could look like this (and infer the variables to be `Dictionary<int,string>`):
var dict1 = new Dictionary()
{
    [1] = "Please",
    [2] = "implement",
    [3] = "this :)"
};
var dict2 = new Dictionary
{
    [1] = "It",
    [2] = "would",
    [3] = "be awesome!"
};
var dict3 = new Dictionary<_, _>()
{
    [1] = "Let's",
    [2] = "make",
    [3] = "C#"
};
var dict4 = new Dictionary<_, _>
{
    [1] = "great",
    [2] = "again!",
    [3] = ";-)"
};
This one should be an error:
var dict5 = new Dictionary
{
    [1] = "",
    ["foo"] = 1,
};
Type inference should probably work similarly to type inference for implicitly-typed arrays (`new[]{}` syntax).
Return Type Inference
Mentioned in #92
F#:
let row<'T1>(name : string) : 'T1 =
    let objToReturn = Unchecked.defaultof<'T1>
    objToReturn
let id: int = row("id")
let name: string = row("name")
let birthDate: DateTime = row("birthDate")
why not allow the same in C#: (example from #92)
int id = row.Field("id"); // legal, generic type argument inferred to be int
var id = row.Field("id"); // not legal, circular inference
row.Field("id"); // not legal, no assignment to a type
All of these examples work in F#, and not because it is a fundamentally different language whose features won't fit C# style. I believe all of these proposed features would fit nicely in C# and greatly improve the day-to-day work of C# programmers.
Would this partially mitigate https://github.com/dotnet/csharplang/issues/129? I've been running up against this one a lot.
I propose the following implementation:
- When the generic type arguments can be fully inferred, skip the angle brackets.
- When the compiler can infer some of the arguments, specify only the needed ones, with no special syntax for the skipped places (no additional commas).
- When the compiler sees a conflict, use `_` as a placeholder for an inferred type, or specify the concrete type.
- When the compiler infers a type, it looks at the generic type constraints and analyzes the types visible in scope to find a concrete type; otherwise it requires the complete definition. Sample (for all):
public interface IAdd<T>
{
    T Zero { get; }
    T Add(T a, T b);
}

public static class SumAdded
{
    // Folds the sequence using the Zero and Add operations supplied by TAdd.
    public static T Fold<TAdd, T>(this IEnumerable<T> en)
        where TAdd : struct, IAdd<T>
        => en.Aggregate(default(TAdd).Zero, (a, p) => default(TAdd).Add(a, p));
}
Simple inference:
public struct IntAdd : IAdd<int>
{
    public int Zero => 0;
    public int Add(int a, int b) => a + b;
}

public struct LongAdd : IAdd<long>
{
    public long Zero => 0;
    public long Add(long a, long b) => a + b;
}

// Assume these are initialized elsewhere; T is a type parameter in scope.
IEnumerable<int> a;
IEnumerable<long> b;
IEnumerable<T> c;

// This can be inferred (proposed behavior, not valid C# today)
var fa = a.Fold(); // Take <IntAdd, int>
var fb = b.Fold(); // Take <LongAdd, long>
var fc = c.Fold<IntAdd>(); // Take <IntAdd, int>
var fd = c.Fold<long>(); // Take <LongAdd, long>
Complex inference:
public struct IntAdd : IAdd<int>
{
    public int Zero => 0;
    public int Add(int a, int b) => a + b;
}

public struct IntMul : IAdd<int>
{
    public int Zero => 1;
    public int Add(int a, int b) => a * b;
}

public struct LongAdd : IAdd<long>
{
    public long Zero => 0;
    public long Add(long a, long b) => a + b;
}

// Assume these are initialized elsewhere; T is a type parameter in scope.
IEnumerable<int> a;
IEnumerable<long> b;
IEnumerable<T> c;

// Proposed behavior, not valid C# today
var fa1 = a.Fold(); // error: both IntAdd and IntMul implement IAdd<int>
var fa2 = a.Fold<IntAdd>(); // Take <IntAdd, int>
var fa3 = a.Fold<IntMul>(); // Take <IntMul, int>
var fb = b.Fold(); // Take <LongAdd, long>
var fc = c.Fold<IntAdd>(); // Take <IntAdd, int>
var fd = c.Fold<long>(); // Take <LongAdd, long>
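For comparison, here is a runnable sketch of how the `Fold` sample has to be called today: both type arguments are spelled out, because `TAdd` does not appear in the parameter list and so cannot be inferred. The declarations mirror the sample above; `Program` and the input arrays are added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IAdd<T>
{
    T Zero { get; }
    T Add(T a, T b);
}

public struct IntAdd : IAdd<int>
{
    public int Zero => 0;
    public int Add(int a, int b) => a + b;
}

public struct IntMul : IAdd<int>
{
    public int Zero => 1;
    public int Add(int a, int b) => a * b;
}

public static class SumAdded
{
    // TAdd supplies the operation; it never appears in the parameter list,
    // so today's inference has nothing to infer it from.
    public static T Fold<TAdd, T>(this IEnumerable<T> en)
        where TAdd : struct, IAdd<T>
        => en.Aggregate(default(TAdd).Zero, (a, p) => default(TAdd).Add(a, p));
}

public static class Program
{
    public static void Main()
    {
        var numbers = new[] { 1, 2, 3, 4 };

        // Both type arguments are required today.
        Console.WriteLine(numbers.Fold<IntAdd, int>()); // 10
        Console.WriteLine(numbers.Fold<IntMul, int>()); // 24

        // With partial inference only TAdd would be written, e.g.
        // numbers.Fold<IntAdd, var>() or numbers.Fold<TAdd: IntAdd>().
    }
}
```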
I'm in favor of 1 and 3, but 3 only if something like F#'s `_` is used instead of `var`. `var` does not seem to be an appropriate name for this given that (a) nearly everyone will read that as "variable" and (b) type arguments are not necessarily used for variables. I know that's bikeshedding a little bit, but when a suitable alternative like underscore is available and straightforward I'd suggest adopting that before creating a proposal. Underscore has the additional benefits of being fewer characters than `var` (better reflecting the benefit of type argument inference) as well as having no English meaning.
The only reason I suggested `var` was (in my head) its similarity to using `var` in a variable declaration, and the low cognitive change: "'var' means the inferred type".
var myInt = 3;
// potentially
IEnumerable<var> myEnumerable = new [] { myInt };
var list = new List<var>(myEnumerable);
var casted = list.Select<var, long>(x => x);
@rcmdh I totally get that, and it was a fine starting point for discussion. And perhaps the name `var` for variable type inference was a little short-sighted initially (i.e. C++, with this hindsight, went with `auto`).
Given your example, I would prefer Java-style syntax when all of the type parameters can be inferred, but `_` for partial type inference (which is the subject of this proposal). Example:
var myInt = 3;
// potentially
IEnumerable<> myEnumerable = new [] { myInt };
var list = new List<>(myEnumerable);
var casted = list.Select<_, long>(x => x);
Even though the syntax of the first two examples is ambiguous with an open generic type, it should not be a breaking change because (AFAIK) an open generic type can only be used with `typeof`.
Java-style wildcards could be another option: `<?>`
There are (at least) three possible forms this could take:
- Named type arguments (#280, #1348), e.g.
M<TElement: int>(args)
- Omitted type arguments separated by commas, e.g.
M<int, >(args)
- Types to be inferred specified by var (#1348), e.g.
M<int, var>(args)
In my view, the key differences between 1 and (2, 3) are:
- With 1, the position of a specified type argument and the number of type arguments of a method do not matter. It requires specifying only the type arguments that cannot be inferred, but it is sensitive to the names of the type parameters (when a name changes, invocations must be updated).
- (2, 3) may be useful to (additionally) state the number of generic arguments explicitly, in order to select the proper overload, and they are not tied to the names of the type parameters.
Sometimes I would like to have one, and sometimes the other.
Between 2 and 3 I prefer 3, because a chain of commas (`M<,,int,,,,,>(args)`), I think, does not look good.