
Champion: "Partial Type Inference"

Open gafter opened this issue 6 years ago • 86 comments

  • [ ] Proposal added
  • [ ] Discussed in LDM
  • [ ] Decision in LDM
  • [ ] Finalized (done, rejected, inactive)
  • [ ] Spec'ed

See #1348

Currently, in an invocation of a generic method, you either have to specify all of the type arguments or none of them, in the latter case depending on the compiler to infer them for you. The proposal is to permit the programmer to provide some of the type arguments, and have the compiler infer only those that were not provided.
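To illustrate the all-or-nothing rule described above, here is a minimal sketch. `ConvertTo` is a hypothetical helper invented for this example (it is not from the proposal), and the three commented-out calls only show how the proposed forms might look; none of them compiles today.

```csharp
using System;

// Hypothetical helper (not from the thread): TSource could be inferred from the
// argument, but TResult appears only in the return type, so it cannot be.
TResult ConvertTo<TSource, TResult>(TSource value) =>
    (TResult)Convert.ChangeType(value, typeof(TResult));

// Legal today: every type argument is spelled out, including the inferable TSource.
long widened = ConvertTo<int, long>(42);
Console.WriteLine(widened); // prints 42

// Not legal today. Under the proposal, one of these forms might let the
// compiler infer TSource while TResult is given explicitly:
// ConvertTo<TResult: long>(42);   // form 1, named type argument
// ConvertTo<, long>(42);          // form 2, omitted type argument
// ConvertTo<var, long>(42);       // form 3, var placeholder
```

Calling `ConvertTo(42)` with no type arguments fails today because nothing in the argument list determines `TResult`.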

There are (at least) three possible forms this could take:

  1. Named type arguments (#280, #1348), e.g. M<TElement: int>(args)
  2. Omitted type arguments separated by commas, e.g. M<int, >(args)
  3. Types to be inferred specified by var (#1348), e.g. M<int, var>(args)

Design Meetings

https://github.com/dotnet/csharplang/blob/main/meetings/2024/LDM-2024-02-07.md#partial-type-inference

gafter avatar Mar 01 '18 15:03 gafter

Pros/cons for 2 vs 3?

Both are DRY but I prefer 2... you don't buy much visually with 3.

bondsbw avatar Mar 01 '18 16:03 bondsbw

1 is okay but verbose. I prefer 2 because I would like this potential progression in levels of needed disambiguation for Foo<TKey, TValue>, assuming Foo<> or Foo also existing or not existing: Foo<TKey,>, Foo<,>, Foo.

jnm2 avatar Mar 01 '18 16:03 jnm2

I'm pretty sure there was an issue about recursive generics, but I can't find it. Without recursive type parameters, I think 2 is better because it is less verbose, and verbosity is part of the issue here. But with recursive type parameters, I think 1 would fit better.

Logerfo avatar Mar 01 '18 17:03 Logerfo

I think that either 2 or 3 is the best one. 2 is certainly the least verbose (which is nice), but 3 looks more complete and full. I don't think the slight verbosity of 3 would really be an issue because it still requires no thinking of what the type name is, and is only three keystrokes.

I slightly prefer 3, but 2 is also perfectly fine. 1 is a bit too verbose.

TheUnlocked avatar Mar 01 '18 18:03 TheUnlocked

I support 1 even with the inclusion of 2 or 3. It helps to document the parameters.

bondsbw avatar Mar 01 '18 19:03 bondsbw

I believe F# uses _ for this:

> open System.Collections.Generic;;
> let s = [ 1; 2; 3 ] :> IEnumerable<_>;;
val s : IEnumerable<int> = [1; 2; 3]
> let l = List<_>([1; 2; 3]);;
val l : List<int>

(So we could consider using it too.)

vladd avatar Mar 02 '18 01:03 vladd

Isn't 2 already used in other constructs?

typeof(IDictionary<,>)

qrli avatar Mar 02 '18 01:03 qrli

@qrli typeof(IDictionary<,>) gets you an open generic type. However, this proposal is about inferring fixed type parameters, so I think symmetry is not necessary.

jveselka avatar Mar 02 '18 09:03 jveselka

@zippec typeof(IDictionary<string,>) is also a valid open generic.

qrli avatar Mar 02 '18 13:03 qrli

@qrli Say what? Type objects cannot have partially-open generics.

jnm2 avatar Mar 02 '18 13:03 jnm2

@jnm2 IIRC typeof expressions cannot have partially-open generics but you can construct such Types via reflection.

federicoce-moravia avatar Mar 02 '18 13:03 federicoce-moravia

@federicoce-moravia I don't believe either is possible. Can you show me?

Btw, closing over string, TValue is still 100% closed, not partially closed. There is an important difference between an open generic and a generic that is closed over TKey, TValue.

jnm2 avatar Mar 02 '18 13:03 jnm2

@jnm2

You can do it by grabbing the generic type parameter from the open generic type:

Type type = typeof(Dictionary<,>);
Type[] args = type.GetGenericArguments();
Type type2 = type.MakeGenericType(typeof(string), args[1]);

HaloFour avatar Mar 02 '18 13:03 HaloFour

@HaloFour That's typeof(Dictionary<string, TValue>) which is completely closed over those two types. For example.

jnm2 avatar Mar 02 '18 13:03 jnm2

@jnm2

It's something that cannot be expressed in C#. It is typeof(Dictionary<string,>) because the second type parameter is still the open generic type argument from the open generic type Dictionary<,>. The Type.ContainsGenericParameters property returns true (as it does with open generic types) and you cannot instantiate that type.
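The behaviour described here can be checked with a short sketch. It uses only reflection calls that exist in the BCL (`MakeGenericType`, `ContainsGenericParameters`, `Activator.CreateInstance`); the variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;

// Start from the open generic type definition Dictionary<,>.
Type open = typeof(Dictionary<,>);

// Take the TValue type parameter from the definition itself.
Type tValue = open.GetGenericArguments()[1];

// Close the first argument over string and leave the second as TValue.
Type partial = open.MakeGenericType(typeof(string), tValue);

Console.WriteLine(partial.ContainsGenericParameters); // True
Console.WriteLine(partial == open);                   // False

// A type that still contains generic parameters cannot be instantiated.
bool threw = false;
try
{
    Activator.CreateInstance(partial);
}
catch (ArgumentException)
{
    threw = true;
}
Console.WriteLine(threw); // True
```

The resulting `Type` is distinct from the generic type definition, yet it still cannot be instantiated, which is the gray area the following replies discuss.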

HaloFour avatar Mar 02 '18 14:03 HaloFour

@HaloFour I understand that it is inexpressible in C#. Here's the dilemma: it is fully constructed to close over typeof(Foo<TValue, TKey>). Why should it not also be fully constructed closing over typeof(Foo<TValue, TValue>) or typeof(Foo<TKey, TValue>)?

public static class Program
{
    public static void Main()
    {
        new Foo<string, int>().CloseForwards();
        new Foo<string, int>().CloseBackwards();
    }
}

public class Foo<TKey, TValue>
{
    public void CloseForwards()
    {
        // This does the equivalent of:
        _ = typeof(Foo<TKey, TValue>);

        var openGeneric = typeof(Foo<,>);
        var fooTypeArguments = openGeneric.GetGenericArguments();
        var closeOverFooTypeArguments = openGeneric.MakeGenericType(fooTypeArguments[0], fooTypeArguments[1]);

        Console.WriteLine(openGeneric == closeOverFooTypeArguments); // true (!)
        Console.WriteLine(closeOverFooTypeArguments.IsConstructedGenericType); // false (!)
    }
    
    public void CloseBackwards() 
    {
        // This does the equivalent of:
        _ = typeof(Foo<TValue, TKey>);
        
        var openGeneric = typeof(Foo<,>);
        var fooTypeArguments = openGeneric.GetGenericArguments();
        var closeOverFooTypeArguments = openGeneric.MakeGenericType(fooTypeArguments[1], fooTypeArguments[0]);

        Console.WriteLine(openGeneric == closeOverFooTypeArguments); // false
        Console.WriteLine(closeOverFooTypeArguments.IsConstructedGenericType); // true
    }
}

So I'm not talking about being able to instantiate from a separate class. I'm talking a level more abstract: what can you specify using typeof inside the definition of the type which declares TKey and TValue?

jnm2 avatar Mar 02 '18 14:03 jnm2

@jnm2

I think we're walking in a weird gray area not well handled by the BCL and more academic than practical.

The documentation for IsConstructedGenericType states the following:

Gets a value that indicates whether this object represents a constructed generic type. You can create instances of a constructed generic type.

Whereas the remarks of ContainsGenericParameters states the following:

Since types can be arbitrarily complex, making this determination is difficult. For convenience and to reduce the chance of error, the ContainsGenericParameters property provides a standard way to distinguish between closed constructed types, which can be instantiated, and open constructed types, which cannot. If the ContainsGenericParameters property returns true, the type cannot be instantiated.

Per the documentation it would seem that it wouldn't be legal for those two properties to both return true. I'd almost call it a bug.

Either way, I don't think this should hold up any conversation about a placeholder syntax (or lack thereof) to denote a generic type argument that is inferred.

HaloFour avatar Mar 02 '18 14:03 HaloFour

Either way, I don't think this should hold up any conversation about a placeholder syntax (or lack thereof) to denote a generic type argument that is inferred.

Certainly not. 😁

jnm2 avatar Mar 02 '18 14:03 jnm2

@gafter Is omitting the type args altogether another option? F# supports the syntax let d = Dictionary() for generic dictionary.

FunctionalFirst avatar Mar 02 '18 16:03 FunctionalFirst

@FunctionalFirst We infer the type arguments to methods, not to types. This is about method type argument inference.
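For readers following along, the distinction shows up in the familiar factory-method workaround: a generic method can have its type arguments inferred, while a constructor call on a generic type cannot. A minimal sketch, where `ListOf` is a hypothetical helper and `Tuple.Create` is the existing BCL factory:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical factory method: T is inferred from the argument at the call site.
List<T> ListOf<T>(IEnumerable<T> items) => new List<T>(items);

var numbers = ListOf(new[] { 1, 2, 3 });   // method inference: List<int>
var pair = Tuple.Create(1, "one");         // BCL factory: Tuple<int, string>

Console.WriteLine(numbers.GetType() == typeof(List<int>));        // True
Console.WriteLine(pair.GetType() == typeof(Tuple<int, string>));  // True

// The constructor form has no inference today; the type argument is required.
var explicitList = new List<int>(new[] { 1, 2, 3 });
Console.WriteLine(explicitList.Count); // 3
```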

gafter avatar Mar 02 '18 17:03 gafter

@gafter You mean this doesn't cover constructors? I think there are a few other proposals about that and type inference in general (for instance: https://github.com/dotnet/csharplang/issues/92). I hope this can lead to considering those as well.

alrz avatar Mar 02 '18 19:03 alrz

Any thoughts on methods that have more than 2 type parameters and omitting multiple trailing ones? My personal feeling: no, since they're not optional like default parameters. With M<T1, T2, T3, T4>(...), is any of the following permitted?

  • M(...) (no angle-brackets at all, we can use this today if type parameters can be fully inferred)
  • M<>(...) (no mention of any type parameters, not sure if this has any use whatsoever)
  • M<int>(...) (T1 is int, infer all the rest. Chances are we meant the first parameter, but did we really?)
  • M<int, >(...) (T1 is int, infer all the rest. Looks more like we meant the first parameter, but would probably be ambiguous if there are multiple overloads with a different number of type parameters)
  • M<T1: int>(...) (T1 is int, infer all the rest. Specific, concise, but again the potential overload issue)
  • M<int,,,>(...) (T1 is int, infer all the rest, empty space with comma for each of them. Number and placement of parameters is clear)
  • M<int, var, var, var>(...) (T1 is int, infer all the rest, var for non-specified ones. Number and placement of parameters is clear)

Also:

  • M<T1: int, T4: string>(...) (T1 is int, T4 is string, infer all the rest. Potential overload issue for multiple candidates)
  • M<int,,, string>(...) (T1 is int, T4 is string, infer all the rest, empty space with commas. Probably won't work without the commas for position indication)
  • M<int, var, var, string>(...) (T1 is int, T4 is string, infer all the rest, var for non-specified ones. More verbose than the previous, but nicer than just "empty" commas)

For that, I'm leaning more towards 3 than 2 (while 2 is the current way of specifying an open generic with typeof). Multiple commas next to each other, with no real content don't resonate with me at all, even when the var in there makes it pretty verbose. I'm personally not a friend of named arguments (mostly because I'm not a friend of default parameters; they've far too often bit me in the behind in shared code bases, but that's not the discussion here), but I can see merit for 1 when the number of generic parameters reaches a certain number (which might as well be some other indication of code-smell or similar, with very few exceptions)

BhaaLseN avatar Mar 03 '18 14:03 BhaaLseN

Great that the language team is considering improving type inference in C#. This is something I've been waiting on for a long time.

Basically I think C# should ~~just copy~~ get inspiration from F# type inference. I mean inference for local variables within a function body, not full Hindley-Milner inference of function argument types.

Type inference in F# was quite intuitive for me; I just discovered it naturally. C# only recently got the _ wildcard, so it isn't used that often yet, and this might not be as easily discoverable as in F#, where you expect things to just get inferred and sometimes put _ to help the compiler.

Anyway, I like F# syntax a lot and for me it's good if similar features have similar syntax in both languages.

Partial type inference

F#:

let testGeneric<'T1,'T2>(arg1 : 'T1) =     
    printfn "T1: %s T2: %s" (typeof<'T1>).FullName (typeof<'T2>).FullName
    () //returns nothing

testGeneric<_, string>(1)

result printed: T1: System.Int32 T2: System.String

The inspired syntax in C# could look like that:

    class Program
    {
        static void testGeneric<T1, T2>(T1 arg1)
        {
            Console.WriteLine("T1: {0} T2: {1}", typeof(T1).FullName, typeof(T2).FullName);
        }

        static void Main(string[] args)
        {            
            //doesn't work but could (now I need to write testGeneric<int, string>(1);)
            testGeneric<_, string>(1); 
        }
    }

Because some people mentioned other possible type inference features I'll try to show how F# type inference could work in C#.

Initialization based on later methods call

F#:

let list = new System.Collections.Generic.List<_>()
list.Add(1)
list.Add(2)

let list2 = System.Collections.Generic.List()
list2.Add(1)
list2.Add(2)
//list2.Add("foo") 
//error if uncommented:
//This expression was expected to have type 'int' but here has type 'string'
let list2 = System.Collections.Generic.List<_>()
// Error:
// Value restriction. The value 'list2' has been inferred to have generic type
//    val list2 : Collections.Generic.List<'_a>    
// Either define 'list2' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.

list is inferred to be Collections.Generic.List<int>.

Notice the errors when you call a generic method with an object of a different type, or when you don't call any method that tells the compiler what to infer.

Hypothetical C# syntax:

// Should be the same syntax (just use `var` instead of let and don't skip `new` keyword)

Constructors type inference

Type inference for constructor calls should work the same way as for function calls.

F# syntax:

let arrayOfNumbers = [| 1;2;3;4|]
let listOfNumbers = new System.Collections.Generic.List<_>(arrayOfNumbers);
//or don't use "new" (more common in F#) and skip passing wildcard ("_") as type parameter
let listOfNumbers2 = System.Collections.Generic.List(arrayOfNumbers);

listOfNumbers and listOfNumbers2 have type List<int>

Hypothetical C# syntax:

var arrayOfNumbers = new[] { 1, 2, 3, 4 };
var listOfNumbers = new System.Collections.Generic.List(arrayOfNumbers);

It's very annoying to have to specify this type in the constructor. For example, why can't I write this?

var listOfAnonymousTypes = new[] { 1, 2, 3, 4 }.Select((number, index) => new { number, index });
var hashSetOfAnonymousTypes = new HashSet(listOfAnonymousTypes);
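For comparison, the usual way to get this result today is to route the call through a generic method, where inference does apply. A minimal sketch: `Enumerable.ToHashSet` is an existing LINQ method (available in .NET Core 2.0+ and .NET Framework 4.7.2+), while `HashSetOf` is a hypothetical helper for older targets.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listOfAnonymousTypes =
    new[] { 1, 2, 3, 4 }.Select((number, index) => new { number, index });

// Method type inference supplies the anonymous type that cannot be named.
var hashSetOfAnonymousTypes = listOfAnonymousTypes.ToHashSet();
Console.WriteLine(hashSetOfAnonymousTypes.Count); // 4

// Hypothetical helper with the same effect, written as a generic method.
HashSet<T> HashSetOf<T>(IEnumerable<T> source) => new HashSet<T>(source);

var viaHelper = HashSetOf(listOfAnonymousTypes);
Console.WriteLine(viaHelper.SetEquals(hashSetOfAnonymousTypes)); // True
```

The constructor form in the comment above would remove the need for such helpers, which is the point being argued.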

Inference from Object and Collection Initializers

F# doesn't have the same Object and Collection Initializers syntax as C#.

However, F# has similar syntax to object initializer when initializing immutable record objects and we can also initialize mutable properties of class similar to how we set properties for attributes in C# (link)

F# syntax (record):

type SomeTypeWithImmutableProperties<'T1,'T2> = {
    Foo1 : 'T1;
    Bar2 : 'T2;
}

let immutableObject = { Foo1 = 1; Bar2 = "bar" }

type of immutableObject is SomeTypeWithImmutableProperties<int,string>

F# syntax (class):

type SomeClassWithMutableProperties<'T1,'T2>() = 
    member val Foo3 : 'T1 = Unchecked.defaultof<'T1> with get, set
    member val Bar3 : 'T2 = Unchecked.defaultof<'T2> with get, set

let someMutableObject1 = SomeClassWithMutableProperties(Foo3 = 1, Bar3 = "bar");
let someMutableObject2 = SomeClassWithMutableProperties<_,_>(Foo3 = 1, Bar3 = "bar");

someMutableObject1 and someMutableObject2 are both of type SomeClassWithMutableProperties<int,string>

Hypothetical C# syntax:

class SomeClassWithMutableProperties<T1,T2>
{
    public T1 Foo { get; set; }
    public T2 Bar { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        var someObject1 = new SomeClassWithMutableProperties
        {
            Foo = 1,
            Bar = "bar"
        };

        var someObject2 = new SomeClassWithMutableProperties<_,_>
        {
            Foo = 1,
            Bar = "bar"
        };
    }
}


C# also has nice syntax for collection initializers. Why not allow this code to work?

var list1 = new List { 1, 2, 3 };
var list2 = new List() { 1, 2, 3 };
var list3 = new List<_> { 1, 2, 3 };
var list4 = new List<_>() { 1, 2, 3 };

Or with index initializers it could look like this (and infer variables to be Dictionary<int,string>):

var dict1= new Dictionary()
{
    [1] = "Please",
    [2] = "implement",
    [3] = "this :)"
};
var dict2 = new Dictionary
{
    [1] = "It",
    [2] = "would",
    [3] = "be awesome!"
};
var dict3 = new Dictionary<_, _>()
{
    [1] = "Let's",
    [2] = "make",
    [3] = "C#"
};
var dict4 = new Dictionary<_, _>
{
    [1] = "great",
    [2] = "again!",
    [3] = ";-)"
};

This one should be an error:

var dict5 = new Dictionary
{
    [1] = "",
    ["foo"] = 1,
};

Type inference here should probably work similarly to type inference for implicitly-typed arrays (the new[]{} syntax).

Return Type Inference

Mentioned in #92

F#:

let row<'T1>(name : string) : 'T1 = 
    let objToReturn = Unchecked.defaultof<'T1>
    objToReturn
    
let id: int = row("id")
let name: string = row("name")
let birthDate: DateTime = row("birthDate")

why not allow the same in C#: (example from #92)

int id = row.Field("id"); // legal, generic type argument inferred to be int
var id = row.Field("id"); // not legal, circular inference
row.Field("id"); // not legal, no assignment to a type
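For contrast, here is what the same calls require in C# today. This is a minimal sketch: `Field` and the dictionary-backed `row` are hypothetical stand-ins for this example, not the real DataRow API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a data row.
var row = new Dictionary<string, object> { ["id"] = 7, ["name"] = "Ada" };

// T appears only in the return type, so the caller must always supply it.
T Field<T>(IDictionary<string, object> source, string column) => (T)source[column];

int id = Field<int>(row, "id");
string name = Field<string>(row, "name");
Console.WriteLine($"{id} {name}"); // prints "7 Ada"

// Not legal today: the declared type of the target is not used for inference.
// int id2 = Field(row, "id"); // error CS0411: type arguments cannot be inferred
```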

All of these examples work in F#, and not because it's a fundamentally different language whose features wouldn't fit C# style. I believe all of these proposed features would fit nicely in C# and greatly improve the day-to-day work of C# programmers.

mpawelski avatar Mar 06 '18 13:03 mpawelski

Would this partially mitigate https://github.com/dotnet/csharplang/issues/129? I've been running up against this one a lot.

jnm2 avatar Mar 06 '18 13:03 jnm2

I propose the following implementation:

  1. When the generic type arguments can be fully inferred, omit the angle brackets.
  2. When the compiler can infer only some type arguments, specify only the ones it needs, with no special syntax for the skipped positions (no additional commas).
  3. When the compiler sees a conflict, use _ as a placeholder for an inferred type, or specify the concrete type.
  4. When the compiler infers a type, it looks at the generic type constraints and searches the types visible in scope for a concrete type that satisfies them; otherwise it requires the complete type argument list. Sample (for all of the above):
public interface IAdd<T> 
{
    T Zero { get; }
    T Add(T a, T b);
}

public static class SumAddedd
{
    public static T Fold<TAdd, T>(this IEnumerable<T> en) 
        where TAdd : struct, IAdd<T>
      => en.Aggregate(default(TAdd).Zero, (a, p) => default(TAdd).Add(a, p));
}

Simple inference:

public struct IntAdd : IAdd<int>
{
      public int Zero => 0;
      public int Add(int a, int b) => a + b;
}  

public struct LongAdd : IAdd<long>
{
      public long Zero => 0;
      public long Add(long a, long b) => a + b;
}  

IEnumerable<int> a;
IEnumerable<long> b;
IEnumerable<T> c;

// This can be inferred
var fa = a.Fold(); // Take <IntAdd, int> 
var fb = b.Fold(); // Take <LongAdd, long> 
var fc = c.Fold<IntAdd>(); // Take <IntAdd, int> 
var fd = c.Fold<long>(); // Take <LongAdd, long> 

Complex inference:

public struct IntAdd : IAdd<int>
{
      public int Zero => 0;
      public int Add(int a, int b) => a + b;
}  

public struct IntMul : IAdd<int>
{
      public int Zero => 1;
      public int Add(int a, int b) => a * b;
}  


public struct LongAdd : IAdd<long>
{
      public long Zero => 0;
      public long Add(long a, long b) => a + b;
}  

IEnumerable<int> a;
IEnumerable<long> b;
IEnumerable<T> c;

// With two IAdd<int> implementations in scope, TAdd must be named for int sequences
var fa = a.Fold(); // error: IntAdd and IntMul both implement IAdd<int>
var fa1 = a.Fold<IntAdd>(); // Take <IntAdd, int> 
var fa2 = a.Fold<IntMul>(); // Take <IntMul, int> 
var fb = b.Fold(); // Take <LongAdd, long> 
var fc = c.Fold<IntAdd>(); // Take <IntAdd, int> 
var fd = c.Fold<long>(); // Take <LongAdd, long> 

ijsgaus avatar Mar 06 '18 19:03 ijsgaus

I'm in favor of 1 and 3, but 3 only if something like F#'s _ is used instead of var. var does not seem to be an appropriate name for this given that (a) nearly everyone will read that as "variable" and (b) type arguments are not necessarily used for variables. I know that's bikeshedding a little bit, but when a suitable alternative like underscore is available and straightforward I'd suggest adopting that before creating a proposal. Underscore has the additional benefits of being fewer characters than var (better reflecting the benefit of type argument inference) as well as having no English meaning.

paulirwin avatar Mar 14 '18 19:03 paulirwin

The only reason I suggested var was (in my head) its similarity to using var in a variable declaration, and the low cognitive change. "'var' means the inferred type"

var myInt = 3;
// potentially
IEnumerable<var> myEnumerable = new [] { myInt };
var list = new List<var>(myEnumerable);
var casted = list.Select<var, long>(x => x);
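For reference, here is the same snippet as it has to be written today. This is only a sketch of current behaviour, with every `var` type argument replaced by the concrete type; it says nothing about how the proposed syntax would be implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var myInt = 3;
IEnumerable<int> myEnumerable = new[] { myInt };
var list = new List<int>(myEnumerable);

// Today both type arguments must be given if either one is.
IEnumerable<long> casted = list.Select<int, long>(x => x);

// Alternative today: move the target type into the lambda so that
// both type arguments can be inferred.
IEnumerable<long> inferred = list.Select(x => (long)x);

Console.WriteLine(casted.SequenceEqual(inferred)); // True
```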

rcmdh avatar Mar 15 '18 07:03 rcmdh

@rcmdh I totally get that, and it was a fine starting point for discussion. And perhaps the name var for variable type inference was a little short-sighted initially (i.e. C++ with this hindsight went with auto).

Given your example, I would prefer Java-style syntax when all of the type parameters can be inferred, but _ for partial type inference (which is the subject of this proposal). Example:

var myInt = 3;
// potentially
IEnumerable<> myEnumerable = new [] { myInt };
var list = new List<>(myEnumerable);
var casted = list.Select<_, long>(x => x);

Even though the syntax of the first two examples is ambiguous with an open generic type, it should not be a breaking change because (AFAIK) an open generic type can only be used with typeof.

paulirwin avatar Mar 15 '18 16:03 paulirwin

Java-style wildcards could be another option: <?>

gulshan avatar Mar 15 '18 18:03 gulshan

There are (at least) three possible forms this could take:

  • Named type arguments (#280, #1348), e.g. M<TElement: int>(args)
  • Omitted type arguments separated by commas, e.g. M<int, >(args)
  • Types to be inferred specified by var (#1348), e.g. M<int, var>(args)

From my point of view, the key differences between 1 and (2, 3) are:

  • With 1, the position of a specified type argument and the number of type arguments on the method do not matter. It requires specifying only the type arguments that cannot be inferred, but it is sensitive to the names of the type parameters (when a name changes, invocations must be updated).
  • (2, 3) may additionally be useful for explicitly stating the number of generic arguments, in order to select the proper overload, and they are not tied to the names of the type parameters.

Sometimes I would like to have one, and sometimes - another.

Between 2 and 3 I prefer 3, because I think a chain of commas (M<,,int,,,,,>(args)) does not look good.

ViIvanov avatar May 17 '18 04:05 ViIvanov