[RFC FS-1043] Extension members visible to trait constraints
Continuation of #6286 from a feature branch
RFC https://github.com/fsharp/fslang-design/blob/master/RFCs/FS-1043-extension-members-for-operators-and-srtp-constraints.md
This is work by @tobyshaw and myself to implement RFC FS-1043. This PR brings https://github.com/Microsoft/visualfsharp/pull/3582 up-to-date with master
Design points:
- [x] determine any necessary/desired changes to SRTP resolution
Things to test
- [x] Complete FSharpPlus acceptance testing without --langversion:preview
- [x] Complete FSharpPlus acceptance testing with --langversion:preview
- [x] Continue to expand systematic usage testing
Related bugs to investigate:
- [ ] https://github.com/dotnet/fsharp/issues/9382
- [ ] https://github.com/dotnet/fsharp/issues/5973
- [ ] https://github.com/dotnet/fsharp/issues/7384
- [ ] https://github.com/dotnet/fsharp/issues/4924
- [ ] https://github.com/dotnet/fsharp/issues/8098
- [ ] https://github.com/dotnet/fsharp/issues/8523
- [ ] https://github.com/dotnet/fsharp/issues/8690
- [ ] https://github.com/dotnet/fsharp/issues/9416
- [ ] https://github.com/dotnet/fsharp/issues/9633
- [ ] https://github.com/fsprojects/FSharpPlus/issues/331
- [ ] https://github.com/dotnet/fsharp/issues/7917
- [ ] https://github.com/dotnet/fsharp/issues/8794
I'd love to see this in, but I'm wondering about the rfc status. It appears empty, and points to the previous PR. Is there another location that has a summary of scope and features?
@abelbraaksma this won't be merged until an RFC is written. The current one has no design.
Thanks, I'd love to help, but I may not know enough of the internals to write a quality rfc pr
Anything we can do about the RFC?
@dsyme is it possible that this will be merged for F# 5 after you were able to fix some bugs with the SRTP's?
Anything we can do about the RFC?
A sketch draft of the RFC is complete. It's not actually a very complicated change, but there are some subtle points in the interaction with other elements of logic in the compiler
@dsyme is it possible that this will be merged for F# 5 after you were able to fix some bugs with the SRTP's?
A key thing is to continue to improve testing, for
- Common scenarios of expected positive usage (cases where this construct makes code simpler and better). Roughly speaking this is where instances are provided to augment existing types with witnesses for existing SRTP-constraints implied by FSharp.Core (mainly arithmetic operators).
- Cases related to the technical interactions described in the RFC and any other technical interactions we expect
- Cases covering the "whackier edge cases" where we have previously observed subtle change in SRTP resolutions.
Note I'm not particularly interested in this change supporting the SRTP-laden code that does (largely pointless) over-abstraction of code in the name of "maximal sharing" or simulating category theory - this tends to serve little actual value and makes code highly obscure, and SRTP was never designed for these purposes. Indeed if possible I'd prefer to disallow or deprecate that kind of code.
@dsyme
over-abstraction of code in the name of "maximal sharing" or simulating category theory
I can share the feeling, but at the same time, the type of code you are referring to is coming into existence (even in an upcoming PR to FSharp.Core) because F# currently lacks the constructs that would let such code be written with more adequate idioms than SRTP.
see usage of ^TaskLike and Priority1, 2 and 3 in this code from the Task Support PR:
https://github.com/dotnet/fsharp/blob/b7f0199d1ff33b724cc33c02404320167b38a9a8/src/fsharp/FSharp.Core/tasks.fs#L453-L455
As for the usefulness of SRTP, it is very useful and critical, here is a single SRTP function that I've defined and is critical in my day work codebase:
let inline CreateCommand (createCommand: unit -> SqlCommand) : 'a =
    let dummyCommand = createCommand()
    let cx = dummyCommand.Connection
    let tx = dummyCommand.Transaction
    let timeout = dummyCommand.CommandTimeout
    let command = (^a : (new : SqlConnection * SqlTransaction * int -> ^a) (cx, tx, timeout)) // SRTP abstracting over
    // possibly some more stuff
    command
(NB: I don't use SRTP constructs casually in my work codebase)
This is to work around design choices made in a TP library where I don't have control over the constructor and definitely don't want useless boilerplate code to show up at every instantiation of hundreds of disparate types.
The normal "legit" ways to handle this with vanilla OO polymorphism just don't work (especially for type constructors), the use of reflection comes with lots of issues, and the zero-cost abstraction and safety that SRTP enables strongly support the great aspects of F# design, even if SRTP is better buried deep down in the implementation of a library.
My point is to show that the line between SRTP enabling critical abstractions (like the numeric operators you mentioned or the function I showed) and over-abstraction really depends on the context.
Other languages like C++ and Rust have similar features (with worse or better syntax / idioms) that enable compile-time monomorphization, and F# having some support for it is a great thing, despite all the bad rep of SRTP syntax.
AFAIU, C++ will someday have concepts, and Rust uses traits; those approaches enable a better user story in terms of error messages, whereas C++ template code and F# SRTP tend to produce error messages that don't convey the likely cause of an issue.
All that being said, this very PR, AFAIU, is going to enable a Haskell type-class-like approach where the instance can be defined outside of the type or the abstraction, so I won't be surprised if this leads to new idioms developing until there is support for "Extend everything" from C# land, or new and more elaborate type constructs land in the CLR.
As for the usefulness of SRTP, it is very useful and critical, here is a single SRTP function that I've defined and is critical in my day work codebase...
Yes, I'm aware of complex new SRTP constraints being used to capture the TaskLike pattern in RFC FS-1072 #6811 - this is something I'm not totally happy about and likely wouldn't have authored, but has been brought through the initial desire to base the work on TaskBuilder.fs.
Regarding this code:
let inline CreateCommand (createCommand: unit -> SqlCommand) : 'a =
    let dummyCommand = createCommand()
    let cx = dummyCommand.Connection
    let tx = dummyCommand.Transaction
    let timeout = dummyCommand.CommandTimeout
    let command = (^a : (new : SqlConnection * SqlTransaction * int -> ^a) (cx, tx, timeout)) // SRTP abstracting over
    // possibly some more stuff
    command
Could you show some callsites? I presume they need type annotations, e.g.
let x : SomeCommandType = CreateCommand (fun () -> ...)
But how bad is it if you pass the constructor in explicitly?
let x = CreateCommand SomeCommandType (fun () -> ...)
Anyway, this RFC is not removing the ability to write user-defined SRTP constraints, I'm just trying to scope intended use cases for the whole mechanism if we need to make tradeoffs for compat etc. It would be a separate RFC to make some adjustment (e.g. add a warning for any user-defined SRTP constraint, or add a warning for user-defined two-type-parameter SRTP constraints, or something). I'm far from deciding what the principles for anything like this would be.
Right now I'm trying to determine when/if this whole RFC is a breaking change.
- I'm assuming it must be a breaking change, because additional methods are taken into account in the overload resolution used in SRTP constraint resolution. That must surely cause it to fail where it would have succeeded before.
- However, all the new methods are extension methods, which are lower priority in overload resolution.
- Also, it's hardly the worst kind of breaking change, because it's a lot like an addition to the .NET libraries causing overload resolution to fail. We don't really consider that a breaking change (partly because this is measured differently for C# and F#, as they have different sensitivities to adding overloads).
- Also, it's not the worst breaking change because we give warnings when users try to add extension members for operators like +
- Also, nearly all SRTP constraints are on static members, and C# code can't introduce extension members that are static - just instance ones. So for C# extension members to cause a problem to arise we'd have to have F# code introducing SRTP constraints on instance members.
Still, I'm sure this must be a breaking change. I would actually appreciate help trying to construct a case where it is if anyone has a spare evening :)
Added notes on the compat concerns here: https://github.com/fsharp/fslang-design/pull/430
I started to add some compatibility testing here: https://github.com/dotnet/fsharp/pull/6805/commits/f3901d07b5e5dc0e76122bf2b852c577f4afe938
I would like some contributions of the following if you can help...
Please list the best examples (if any) that you know of for these:
- Cases where a .NET Framework type (other than a primitive) doesn't support a particular method corresponding to an existing SRTP constraint used in FSharp.Core. For example, one that is missing op_Addition or some other operator or method (Sin, Cos, Zero, One, etc.)
- Cases of fresh SRTP code (i.e. using user-defined SRTP constraint calls) that captures common existing patterns in .NET libraries like TryGetValue or the like (and especially cases where it would be highly awkward to pass the witnesses to these patterns manually).
- Cases where fresh SRTP code can define reasonable and simple new patterns that you might like to retrofit onto a range of existing or new .NET or F# types. For example, patterns you might reasonably try to retrofit over all collections, or over all code that undergoes JSON serialization etc. (I'm especially interested in cases where it would be highly awkward to pass the witnesses to these patterns manually).
I'm after examples of good SRTP code capturing useful patterns, used in a way that makes code simpler end to end (convince me!), if this feature were supported.
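To make the request above concrete, here is a minimal sketch of what a "user-defined SRTP constraint call" and its manual witness-passing alternative look like. The names lengthOf and lengthWith are hypothetical and chosen only for illustration; the pattern captured (an instance Length property) is deliberately trivial and is not one of the examples being asked for.

```fsharp
// SRTP version: any type with an instance property Length : int qualifies.
let inline lengthOf (x: ^a) : int =
    (^a : (member Length : int) x)

// Manual version: the caller passes the "witness" (the accessor) explicitly.
let lengthWith (getLength: 'a -> int) (x: 'a) : int =
    getLength x

let viaConstraintOnString = lengthOf "hello"        // string exposes Length
let viaConstraintOnList = lengthOf [1; 2; 3]        // F# list exposes Length
let viaWitness = lengthWith String.length "hello"   // witness passed by hand
```

For a single accessor the manual version is barely longer; the cases of interest are those where several witnesses would have to be threaded through many call sites.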
Could you show some callsites? I presume they need type annotations, e.g.
Right, it looks like this
let command : MyCommandType = CreateCommand createCommand
But how bad is it if you pass the constructor in explicitly?
let command = CreateCommandWithCtor createCommand (fun (a,b,c) -> new MyCommandType(a,b,c))
In my case I find it would be prohibitive to fall back to writing this (it would rather discourage use of the function I used as an example); the only workaround is reflection (not sure that even works with the erased provided types I'm using here), which would offer no compile-time guarantees as to which type argument is passed.
There is also the risk, with that SRTP-less code, that if I need to call another constructor later on, I have to go over all those lambdas and shuffle things around. That is many call sites, instead of a change to that single infrastructure function; also prohibitive.
With the SRTP approach, I enforce that only proper types are constructed, along with the invariants in my CreateCommand, with practically zero noise and zero cost besides compile-time verification of the constraint.
This is, for me, a compelling use of SRTP: I don't have to review code that might misuse or miss critical setter calls on the object and the invariants baked into the function. Calling the function is the smallest thing a developer can write, and no more convenient alternative exists that could be faulty or need more review time.
(This is code using http://fsprojects.github.io/FSharp.Data.SqlClient/)
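The two call-site styles compared above can be sketched side by side without a database. Everything here is a hypothetical stand-in: Connection, GetUsers, GetOrders and the fixed timeout of 30 are invented for illustration and are not the FSharp.Data.SqlClient types; only the shape of the constructor constraint follows the CreateCommand function in this thread.

```fsharp
// Hypothetical stand-in for a connection type.
type Connection = { Name: string }

// Two unrelated command types that happen to share a constructor shape.
type GetUsers(cx: Connection, timeout: int) =
    member this.Describe() = sprintf "GetUsers on %s, timeout %d" cx.Name timeout

type GetOrders(cx: Connection, timeout: int) =
    member this.Describe() = sprintf "GetOrders on %s, timeout %d" cx.Name timeout

// SRTP version: the required constructor is a constraint on ^a,
// so the call site only needs a type annotation.
let inline createCommand (cx: Connection) : ^a =
    let timeout = 30
    (^a : (new : Connection * int -> ^a) (cx, timeout))

// Explicit version: every call site passes the constructor as a function.
let createCommandWithCtor (cx: Connection) (ctor: Connection * int -> 'a) : 'a =
    ctor (cx, 30)

let db = { Name = "db" }
let users : GetUsers = createCommand db
let orders = createCommandWithCtor db (fun (c, t) -> GetOrders(c, t))
```

If the constructor shape changes, the SRTP version is edited in one place, while the explicit version requires touching each lambda, which is the maintenance cost described above.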
Thanks a lot for the clarifications and inviting to contribute valuable SRTP samples!
Sorry I have no code here - but I will do some blog posts when this becomes available - comparing the different approaches.
I have a deeply nested object graph (8 - 10 nested types before the whole thing becomes recursive) that I need to (de)JSONify - however I have usually between 4-5 representations of JSON for the same data structure.
The most common approach for this would be to encode each JSON representation and the corresponding types as JSON types. However, this would yield an explosion of similar types and the corresponding FromType and ToType methods/functions.
The benefit is of course that I could use Newtonsoft's auto serializer. However, that is not wanted, as I need fine-grained (non-automatic) conversions and I need to be able to read older versions of JSON without breaking the system. This is really important as we are handling tax data and everybody is very sensitive to schema changes.
At the moment I have settled for a solution like the new System.Text.Json from .NET Core. That is, I build a big registry of JsonReaders and JsonWriters with a lookup on typeof<_> and some additional key. It works and it is stable, but basically all type safety was thrown out of the window. Therefore all calling code needs to permanently check whether we got an error (usually because a converter couldn't be found),
when in reality the compiler should be perfectly able to determine whether, for a given type (and that includes nested types in an object graph), there are converters in scope or not.
The code that exists atm is roughly 7 KLOC and I am sure I could strip away 30% if not 50% if I had something akin to TCs.
The downside would be some potentially scary error messages from the compiler (usually something like a screen full of overload messages that do not fit). However, having used and abused SRTPs (and also extension methods in general) for the last 3 1/2 years, I don't see that as a real problem. After encountering such intimidating messages 2-3 times, one gets an intuition for what to do about them.
Plus nobody is forced to use it. Which is great.
IMO the level of breaking change here is worth taking:
- It makes a material, positive change towards making F# more expressive for DSLs and scenarios where you do arithmetic on custom types
- It is roughly the same as .NET adding a new overload, as was done with Span<'T> when it was introduced across the entire framework
- Cases where a .NET Framework type (other than a primitive) doesn't support a particular method corresponding to an existing SRTP constraint used in FSharp.Core. For example, is missing op_Addition or some other operator or method (Sin, Cos, Zero, One, etc.)
If BigIntegers count as not a primitive, looks like fsharp/fsharp#457 remains unfixed as of VS 2019 16.4.2--I provided an SRTP workaround in the thread. I can't think of any others I've stumbled upon for this particular category.
@drvink Yes, dotnet/fsharp#457 is now fsharp/fslang-suggestions#472.
My example was a bit more suited to having a full-featured type class feature in addition to this. I have added the use case to the type classes suggestion instead, here: https://github.com/fsharp/fslang-suggestions/issues/243#issuecomment-580061426
Compiling FSharpPlus with this language feature enabled causes one compilation error - I'll look at it now
https://github.com/fsprojects/FSharpPlus/issues/261#issuecomment-580253710
I've looked through the problem with FSharpPlus and produced a minimal case. It appears to be related to a more general problem with processing SRTP constraints, which is that the presence of any witnesses at all (as opposed to no witnesses) can affect the processing of code that is inline and intended to be generic.
A repro based on the above is as follows:
module Test
[<AutoOpen>]
module Extensions =
    type Async<'T> with
        static member Quack (x:seq<Async<'T>>) : Async<seq<'T>> = failwith ""
    type Option<'T> with
        static member Quack (x: seq<option<'T>>) : option<seq<'T>> = failwith ""
let inline CallQuack (x: ^a) : ^Output = (^a : (static member Quack : ^a -> ^Output) x)
type Witnesses =
    static member inline QuackWitness (x: ^a, _output: ^Output, _impl: Witnesses) : ^Output = CallQuack x
    static member inline QuackWitness (x: ref<_>, _output: ^Output, _impl: Witnesses) : ^Output = Unchecked.defaultof<_>
let inline CallQuackWitness (x: ^a, output: ^Output, witnesses: ^Witnesses) =
    ((^a or ^Output or ^Witnesses) : (static member QuackWitness : _*_*_ -> _) (x, output, witnesses))
let inline call (x: seq< ^b > ) : ^Output =
    CallQuackWitness (x, Unchecked.defaultof< ^Output >, Unchecked.defaultof<Witnesses>)
The problem occurs in the last line, which is generic code.
- If there are no witnesses in scope, the code compiles with both the old and new compilers, and the signature of the last function is fully generic:
    val inline call :
        x:seq< ^b> -> ^Output
            when (seq< ^b> or ^Output or Witnesses) : (static member QuackWitness : seq< ^b> * ^Output * Witnesses -> ^Output)
- However, if the witnesses are introduced, then they are now "seen" by the new compiler. Overload resolution is incorrectly attempted for the QuackWitness and Quack constraints in the code example, even though the code is generic - the presence of the seq type seems to be enough to force some overload resolution to happen.
I need to understand this better as it is a general problem with SRTP processing.
@dsyme If I'm understanding the problem correctly, this is not entirely new per se, or is at least related to the long-standing issue that has always thrown folks for a loop in writing SRTP code (i.e. the fact that one has to use term-level dummy/null values in order to get the compiler to "see" the witnesses in the way that is desired). I don't know if fixing that in general is within the scope of this issue, but it would do wonders for simplifying not just complex scenarios as seen in FSharpPlus but nearly all use of SRTP.
[edit] an example where things work fine:
type unsigned = class
    static member inline unsigned (x : sbyte) = uint8 x
    static member inline unsigned (x : byte) = x
    static member inline unsigned (x : int16) = uint16 x
    static member inline unsigned (x : uint16) = x
    static member inline unsigned (x : int32) = uint32 x
    static member inline unsigned (x : uint32) = x
    static member inline unsigned (x : int64) = uint64 x
    static member inline unsigned (x : uint64) = x
end
let inline unsigned_instance< ^a, ^b, ^c when
                               (^a or ^b) : (static member unsigned : ^b -> ^c)>
        (x : ^b) =
    ((^a or ^b) : (static member unsigned : ^b -> ^c) x)
let inline unsigned num = unsigned_instance<unsigned, _, _> num
and one where the indirection is needed:
open System
open System.Numerics
type from_bigint = class
    static member _uint8max = bigint (uint32 Byte.MaxValue)
    static member _uint16max = bigint (uint32 UInt16.MaxValue)
    static member _uint32max = bigint UInt32.MaxValue
    static member _uint64max = bigint UInt64.MaxValue
    static member inline from_bigint (x : bigint, _ : int32) =
        int (uint32 (x &&& from_bigint._uint32max))
    static member inline from_bigint (x : bigint, _ : int64) =
        int64 (uint64 (x &&& from_bigint._uint64max))
    //static member inline from_bigint (x : bigint, _ : nativeint) =
    //    to_native (x &&& from_bigint._uint64max)
    //static member inline from_bigint (x : bigint, _ : unativeint) =
    //    to_unative (x &&& from_bigint._uint64max)
    static member inline from_bigint (x : bigint, _ : bigint) = x
    static member inline from_bigint (x : bigint, _ : float) = float x
    static member inline from_bigint (x : bigint, _ : sbyte) =
        sbyte (byte (x &&& from_bigint._uint8max))
    static member inline from_bigint (x : bigint, _ : int16) =
        int16 (uint16 (x &&& from_bigint._uint16max))
    static member inline from_bigint (x : bigint, _ : byte) =
        byte (x &&& from_bigint._uint8max)
    static member inline from_bigint (x : bigint, _ : uint16) =
        uint16 (x &&& from_bigint._uint16max)
    static member inline from_bigint (x : bigint, _ : uint32) =
        uint32 (x &&& from_bigint._uint32max)
    static member inline from_bigint (x : bigint, _ : uint64) =
        uint64 (x &&& from_bigint._uint64max)
    static member inline from_bigint (x : bigint, _ : float32) = float32 x
    static member inline from_bigint (x : bigint, _ : decimal) = decimal x
    static member inline from_bigint (x : bigint, _ : Complex) =
        Complex(float x, 0.0)
end
let inline from_bigint_instance< ^a, ^b, ^c when
                                  (^a or ^b or ^c) : (static member from_bigint : ^b * ^c -> ^c)>
        (b : ^b, c : ^c) =
    ((^a or ^b or ^c) : (static member from_bigint : ^b * ^c -> ^c) (b, c))
let inline from_bigint num =
    from_bigint_instance<from_bigint, _, _> (num, Unchecked.defaultof<'b>)
and one where things get really nasty, merely trying to provide a common form of map operation in FSharpPlus.
@drvink Let me write some notes on a very core design point in how we process SRTP constraints, summarized as: in generic inline code we apply weak resolution to constraints that could otherwise be generalised.
Consider this:
let inline f1 (x: System.DateTime) y = x + y;;
let inline f2 (x: System.DateTime) y = x - y;;
The first is generalized to non-generic code, the second to generic code:
val inline f1 : x:System.DateTime -> y:System.TimeSpan -> System.DateTime
val inline f2 : x:System.DateTime -> y: ^a -> ^b when (System.DateTime or ^a) : (static member ( - ) : System.DateTime * ^a -> ^b)
Why? Well, prior to generalization we invoke "weak resolution" for both inline and non-inline code. This proceeds with overload resolution even though the second parameter type of "y" is not known.
- In the first case, overload resolution succeeds because there is only one overload (DateTime + TimeSpan -> DateTime)
- In the second case it fails (there are two overloads, DateTime - DateTime and DateTime - TimeSpan). The failure is ignored, and the code is left generic.
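The practical consequence of the two cases can be seen at the call sites. This is a small sketch assuming the inferred signatures shown above; the value names start, later, gap and earlier are made up for the example.

```fsharp
open System

let inline f1 (x: DateTime) y = x + y   // weak resolution fixes y : TimeSpan
let inline f2 (x: DateTime) y = x - y   // left generic: two candidate overloads

let start = DateTime(2020, 1, 2)

// f1 accepts only a TimeSpan as its second argument.
let later = f1 start (TimeSpan.FromDays 1.0)

// f2 picks the overload at each call site.
let gap : TimeSpan = f2 start (DateTime(2020, 1, 1))        // DateTime - DateTime
let earlier : DateTime = f2 start (TimeSpan.FromDays 1.0)   // DateTime - TimeSpan
```

Because f1 was resolved eagerly, a witness for + on DateTime and some other type added later by an extension member could not be used through f1, whereas f2 would pick it up.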
Now, for non-inline code this "weak resolution" process is pretty reasonable. But for inline code it is dubious, especially in the context of this RFC - because future extension methods may provide additional witnesses for + on DateTime and some other type. The code has not been made as generic as it could be. Further, if no witnesses at all were available then the code would actually be made generic, e.g. for some other operator like *. for which there are no overloads at all available:
> let inline f5 (x: System.DateTime) y = x *. y;;
val inline f5 : x:DateTime -> y: ^a -> ^_arg3 when (DateTime or ^a) : (static member ( *. ) : DateTime * ^a -> ^_arg3)
The reason we apply weak resolution even to inline code is that without it some inline code becomes very, very generic, e.g. consider
> let inline f3 (x:double) y z a b c d = x + y + z + a + b + c + d;;
val inline f3 : x:double -> y:double -> z:double -> a:double -> b:double -> c:double -> d:double -> double
Here the weak resolution process simplifies the types to double rather than giving some massive generic signature. It was this kind of case that made us add weak resolution in the first place and also apply it to inline code.
Putting aside backward compat, we could perhaps turn off weak resolution for inline code for cases that involve true overload/witness resolution, perhaps a change like this: https://github.com/dsyme/fsharp/commit/e439cc81f31eb1e3d123655594c91ab0938bb652 though only applied to inline code. This would change inferred types, e.g.
> let inline f (x: System.DateTime) y = x + y;;
val inline f : x:DateTime -> y: ^a -> ^b when (DateTime or ^a) : (static member ( + ) : DateTime * ^a -> ^b)
This would be sound. However, the compat problem is real, though perhaps not a disastrous one given that the code is becoming strictly more general. Signature files may need to be updated, for example.
As an aside, just to note that any tweak like this should only be applied to suppressing weak resolution at inline functions. If we apply it more generally, the compat problem becomes much worse - e.g. consider the non-inline function:
let f (x: System.DateTime) y = x + y
or this kind of code
let mutable x = (fun (x: DateTime) y -> x + y)
In each case a compat problem would occur if weak resolution is disabled.
Just so we're on the same page, some questions/discussion:
- Since generic arithmetic is one of the major driving forces of this RFC, would you like to elaborate on the semantics of overload resolution in non-inline circumstances?[^1] For example, let add x y = x + y is resolved to int in the absence of annotations, but int isn't necessarily a better choice than any other visible overload.
- Related: let add1 (x : float) = x + 1 is a type error due to non-overloaded literals; I don't think this really needs discussion but people seem to ask about it from time to time (they can always go implement NumericLiteralG and/or use the LanguagePrimitives stuff).
- Now, for non-inline code this "weak resolution" process is pretty reasonable. But for inline code it is dubious, especially in the context of this RFC - because future extension methods may provide additional witnesses for + on DateTime and some other type.
  Indeed, we get into hairy territory here, and the extension method visibility issue is directly analogous to orphan instances.
- Putting aside backward compat, we could perhaps turn off weak resolution for inline code for cases that involve true overload/witness resolution, perhaps a change like this: dsyme@e439cc8 though only applied to inline code. This would change inferred types, e.g. [...snip...] However, the compat problem is real, though perhaps not a disastrous one given that the code is becoming strictly more general. Signature files may need to be updated, for example.
  My gut feeling is that it makes inline behave more in the way that people actually expect, but I don't have good visibility into how common code that uses inline actually is, nor do I know the best way to survey how much grumbling might occur if this change were to be introduced.
[^1]: IMO the work you've done on bringing "principled overloading" to ML even in the absence of (direct) typeclass-style functionality is great; the way arithmetic operators (including the behavior introduced with inline) and casts work in F# is substantially friendlier than e.g. SML's pseudo-overloads on op+ or OCaml's separation of + and +..
The notes you've made here are valuable; this should be put into the F# documentation or an FAQ somewhere, maybe, or at least a summary of the behavior once it's established what semantics are actually desired.
Since generic arithmetic is one of the major driving forces of this RFC, would you like to elaborate on the semantics of overload resolution in non-inline circumstances? For example, let add x y = x + y is resolved to int in the absence of annotations, but int isn't necessarily a better choice than any other visible overload.
Related: let add1 (x : float) = x + 1 is a type error due to non-overloaded literals; I don't think this really needs discussion but people seem to ask about it from time to time (they can always go implement NumericLiteralG and/or use the LanguagePrimitives stuff).
These both seem too off topic TBH. Let's keep this thread focused?
I'll work on this further tomorrow to see if there's a reasonable change we can make
Now that I've read your explanation, I think it's also weak resolution that forces me to write the "impossible" overload https://github.com/fsprojects/FSharpPlus/blob/eb79954c0c3bc95d71c8dc84670a02f5523705bc/src/FSharpPlus/Control/Functor.fs#L112
to avoid the (intended) generic function resolving prematurely to the possible one (right above it).
Now that I've read your explanation, I think it's also weak resolution that forces me to write the "impossible" overload https://github.com/fsprojects/FSharpPlus/blob/eb79954c0c3bc95d71c8dc84670a02f5523705bc/src/FSharpPlus/Control/Functor.fs#L112
Yes, this is highly likely. This is almost certainly the cause any time the presence or absence of witnesses impacts the checking of generic inline code.
@gusty @drvink See https://github.com/dotnet/fsharp/pull/8435 for more testing around some of the SRTP techniques used in FSharpPlus. Not all of them though :)
@dsyme I'm not sure if this example falls under number 2 https://github.com/jackfoxy/DependentTypes/blob/0a36775d572a4863e8c4cebf11e39998fd0b80e5/src/DependentTypes/DependentTypes.fs#L25
See https://github.com/dotnet/fsharp/pull/6805/commits/5603467ffe69339cc45cbe3aa60dc837578c953e for extensive additional testing, and there is a feature adjustment which I will write into the RFC