fslang-suggestions
fslang-suggestions copied to clipboard
Ease conversion between Units of Measure (UoM) and undecorated numerals and simplify casting
Ease of use for Units of Measure (UoM)
After being asked 3-4x in just as many weeks on Slack by different newcomers and experienced programmers alike, I propose we introduce a set of intrinsics related to UoM to make it less painful to convert to undecorated numerals and back, and to simplify conversion between numeric types.
I suggest to do that by adding a few sensibly-named functions to the LanguagePrimitives
.
Remove UoM
Pair each LanguagePrimitives.XXXWithMeasure
with a LanguagePrimitives.XXXWithoutMeasure
.
Remove UoM generically
Add LanguagePrimitives.RemoveMeasure: 'T<UoM> -> 'T
.
Add UoM generically
It's already possible to add UoM for specific types, but not generically for any type. We could add LanguagePrimitives.WithMeasure: 'T -> 'T<UoM>
, where 'T
is one of the signed integral types.
Support conversion
If at all possible, I suggest to make all cast operators sensitive to the target, to make this seamless. That means, allowing:
let x: float<cm> = float 45<cm>
See also SO question: https://stackoverflow.com/questions/1894250/f-unit-of-measure-casting-without-losing-the-measure-type
Allow no-op cast on collections for UoM
It's currently impossible to remove Array<int<cm>>
from its cm
UoM, unless you loop over it. Same is true for any other collection.
EDIT: the retype
trick below can actually be adopted for this, thanks to @BillHally (comment).
let myArray = [|3<cm>; 42<cm>|]
let x: int array = myArray |> Array.map LanguagePrimitives.WithoutMeasure // this should be optimized away
Existing way of doing this
The existing way of approaching this problem in F# by multiplying by 1<_>
or 1.0<_>
, but this is type-specific. For generics, you can only do so by using unsafe compiler-specific code, like this (courtesy @gusty):
// use compiler specific code
let inline retype (x: 'T) : 'U = (# "" x: 'U #)
// remove UoM decoration
let inline removeUoM (x: '``T<'M>``) =
let _ = x * (LanguagePrimitives.GenericOne : 'T)
retype x :'T
Conversion from float<cm>
to int<cm>
etc is currently possible by reinterpreting the cast, but hard to discover even for experienced users:
x |> float |> LanguagePrimitives.FloatWithMeasure
Pros and Cons
The advantages of making this adjustment to F# are numerous, just to name a few:
- Better adoption of F# and simpler use, and therefor more use, of UoM
- Solving a hard, and often-asked real-life problem, that really should be trivial to answer
- Prevent users to ubiquitously use box/unbox tricks to reach the same
- Allow coding above scenarios without resorting to deprecated code
- Improve overall code generation by erasing the types when requested
- Prevent looping over entire collections to just remove UoM
The disadvantages of making this adjustment to F# are:
- It's a surface area change
- Leaving this stuff inside
LanguagePrimitives
may still be hard to discover, but at least it gives a "single, good way" of doing things
Extra information
Estimated cost (XS, S, M, L, XL, XXL): S
I think this isn't very hard to do, except for some optimizations perhaps. The compiler mostly already has ways of coding this internally. Ultimately, we should try to keep the balance of type-safety with UoM, but really shouldn't make it too hard to work with them.
Related library is FSharp.UMX: https://github.com/fsprojects/FSharp.UMX
Related language suggestions:
- https://github.com/fsharp/fslang-suggestions/issues/911 (suggestion refused on the grounds the up-scale conversion between UoM's is explicitly not supported)
- https://github.com/fsharp/fslang-suggestions/issues/901 (approved, implemented)
- https://github.com/fsharp/fslang-suggestions/issues/452 (suggestion not approved as UoM cannot be adopted to any type without it being a breaking change)
Related suggestions (and SO questions):
Just a bunch of the many roaming questions related to UoM, illustrating the trouble people perceive with it:
- SO: casting without loosing UoM
- SO: strip UoM
- SO: strip UoM generically
- SO: add UoM in lists or arrays
- SO: strip UoM in lists or arrays
- SO: convert between types having the same UoM
- SO: using external non-UoM-aware functions and get back UoM type you put in
- Most recent slack discussion started here: https://fsharp.slack.com/archives/C1R50TKEU/p1593558127321300, scroll down to @gusty's suggestion (my solution didn't work in his case).
Affidavit (please submit!)
Please tick this by placing a cross in the box:
- [x] This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
- [x] I have searched both open and closed suggestions on this site and believe this is not a duplicate
- [x] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.
Please tick all that apply:
- [x] This is not a breaking change to the F# language design
- [x] I or my company would be willing to help implement and/or test this: I've looked at the code a bit and would like to tackle this, if an RFC were to be accepted :).
I would love this. Would greatly simplify some of the work that I am doing. UoM are great but there are some painful edge cases that need to be sanded down and smoothed over.
@dsyme, there were several discussions on slack recently, which led to this suggestion, and with the awesome work done by @pblasucci on widening the UoM types to all primitives, I'd like to add this as well to the (relatively painless) improvements for UoM's.
Could you consider this "approved-in-principle"? If so, I'll gladly write the RFC for this.
So, recapping, there are 5 different suggestions here, each of which improves quality of life for working with units of measure:
- Remove UoM
- Remove UoM generically
- Add UoM generically
- Support conversion
- Allow no-op cast on collections for UoM
Of the five, 1 is trivial. 2, 3 seem like they'd be straight-forward -- but I've not done much analysis. 4, 5 are more concerning -- but only in the case of ensuring well-optimized code.
As an aside, for my own work, item 4 (conversions) is the one I smack up against most frequently.
@abelbraaksma @dsyme Might it be worth taking decisions on 1, 2+3, 4+5 separately (as I could see the merits of carving up the RFCs, and the associated work, into those three somewhat more digestible chunks)?
I've no problem splitting them in two RFCs, but ideally, I (or you ;) would implement the chunk in one go. I think only (5) is slightly problematic from an implementation standpoint.
The easiest ones are those that only need library functions.
This is maybe not the best place to ask, but... I have a process question:
Would we "feature gate" these behind multiple different "language features"? Would we reuse one of the existing gates? Does it really matter?
A bunch of these will end up in FSharp.Core, which complicates "feature gates" due to it being a compiled lib, and afaik, apart from complex features like string interpolation, a new function is just a new (isolated) function in FSharp.Core, meaning that no "gate" is necessary to enable it: people can opt-in by calling that function.
For the part that requires changes in the compiler, I'd assume a single feature gate for like uom-improvements
or something would be more than sufficient, but ultimately that's up to @cartermp and @dsyme, to whom I'd like to kindly ask:
Could you consider this "approved-in-principle"? If so, I'll gladly write the RFC(s) for this.
@abelbraaksma wrote:
I've no problem splitting them in two RFCs, but ideally, I (or you ;) would implement the chunk in one go.
I'd be happy to work on this (with Abel or solo)... assuming @cartermp or @dsyme will go ahead an mark it as approved :wink:
@abelbraaksma wrote:
- Allow no-op cast on collections for UoM
and:
I think only (5) is slightly problematic from an implementation standpoint.
I may be misunderstanding something, but the retype
function you provide will also work for collections.
We've been using essentially the same code for a couple of years to convert e.g. int<a>[]
to int<b>[]
without any problems.
Here's some example code in sharplab.
@BillHally, thanks for pointing that out. Retyping indeed works, but we may need to ensure only primitives are targeted to prevent illegal conversions. I haven't looked that deep in potential corner cases, but as you said, it should be a noop as long as the primitive types match.
I'd be happy to work on this (with Abel or solo)... assuming @cartermp or @dsyme will go ahead an mark it as approved :wink:
Ping?
I apologise for missing this despite being prompted several times. Put it down to the pandemic or something.
I can see the issues here. When UoM were originally designed, the aim was for "unit soundness" that is quite a strict criteria. Over time this has been weakened allowing conversions, integers etc.
Concretely I agree with addressing each of the above in LanguagePrimitives but am concerned about the actual details. Specifically anything involving the conversion operators may be problematic, since these are integrated with constraint solving, but perhaps it's doable if a unitized target type is known.
(2) and (3) require 'T<'m>
as a construct, or a new kind of constraint, or a special type-checking rule. Thus they look complex.
I'll belatedly mark this as approved but I'm concerned that the details won't satisfy when it comes to an RFC
Thanks @dsyme to approve this in principle. I'll set out to write an RFC for this, or perhaps two, so we can split the feature (i.e. to have the complexities that 'T<'m>
brings in only one of them).
For what it's worth, I was recently playing around with this for something that came up at my day job. I now fully understand what @dsyme means about items 2 and 3. I would definitely split the two of them into a separate RFC.
But stepping back a moment, and thinking about the larger "request", which I understand as "make it easier to do conversions involving units of measure", I'd like to suggest that part of adding items 1, 4, and 5 could be to surface them somewhere more discoverable than LanguagePrimitives
. In fact, it probably makes sense to also alias (move?) the existing ___WithMeasure
functions. Everything could be neatly packaged into a static type called Units
or UnitsOfMeasure
or Whatever-is-deemed-most-appropriate
. But why a static type? Because something like Units.clear(number : float<'M>) : float
, with ~13 overloads (one per numeric primitive) has arguably better "developer ergonomics" than 13 separate ___WithoutMeasure
functions.
At least, that's what's come out of the things I've been doing at work lately.
@pblasucci, I like those suggestions. Ease-of-discovery would certainly help. But I'm not sure what @dsyme, @KevinRansom et. al. would think about adding "another way of doing the same thing". Perhaps LanguagePrimitives
in hind-sight was a bad location for any of this, and we should fix that, or we are just stuck with it.