Literals as types

Open jbeeko opened this issue 7 years ago • 9 comments

Add literals as types, as in TypeScript. See also literal types in TypeScript: https://www.typescriptlang.org/docs/handbook/2/everyday-types.html#literal-types

Also relates to erased type-tagged unions #538


This language suggestion proposes adding Enumerate Unions. These are union types where, rather than each case defining its own set of items, the union is an enumeration in which every case carries the same data fields.

One example domain is currencies. There is a fixed list of these, and a handful of data elements are associated with each currency:

  • name
  • number
  • minor units

Syntax

A proposed syntax for EnumerateUnions is:

type Currency of Name: string; Number: int =
| CAD Name = "CAD"; Number = 123
| USD Name = "USD"; Number = 223
| GBP Name = "GBP"; Number = 444

printfn "The number of currency: '%s' is: %i" CAD.Name CAD.Number

This is very succinct (4 lines vs about 16 using the current language), easy to understand, models all the data and prevents the construction of invalid currencies.

Here is a more complex example modeling ISO currency https://en.wikipedia.org/wiki/ISO_4217 and country codes https://en.wikipedia.org/wiki/ISO_3166-1.

type Country of Name: string; Code: string; LongCode: string; Number: int; Independant: bool; Currencies: Currency list =
| CA Name = "Canada"; Code = "CA"; LongCode = "CAN"; Number = 124; Independant = true; Currencies = [CAD]
| GB Name = "United Kingdom of Great Britain and Northern Ireland"; Code = "GB"; LongCode = "GBR"; Number = 826; Independant = true; Currencies = [GBP]
| US Name = "United States of America"; Code = "US"; LongCode = "USA"; Number = 840; Independant = true; Currencies = [USD]
| VG Name = "British Virgin Islands"; Code = "VG"; LongCode = "VGB"; Number = 92; Independant = false; Currencies = [USD]
| IO Name = "British Indian Ocean Territory"; Code = "IO"; LongCode = "IOT"; Number = 86; Independant = false; Currencies = [USD; GBP]

and Currency of Code: string; Name: string; Number: int; Digits: int; UsedIn: Country list =
| CAD Code = "CAD"; Name = "Canadian Dollar"; Number = 124; Digits = 2; UsedIn = [CA]
| USD Code = "USD"; Name = "United States Dollar"; Number = 840; Digits = 2; UsedIn = [US; VG; IO]
| GBP Code = "GBP"; Name = "Pound sterling"; Number = 826; Digits = 2; UsedIn = [GB; IO]

printfn "The currency: '%s' is used in: %A" USD.Name USD.UsedIn
printfn "The Country: '%s' uses the currencies: %A" IO.Name IO.Currencies

Again a lot of information is represented very clearly and compactly in a typesafe way.

Uses

Where would this be useful? In all cases where a domain contains an infrequently changing enumeration of items with richer data than just strings or integers. These may be globally defined lists (currencies) or specific to the application. Examples are:

  • Chemical elements (possibly including isotopes or isomers)
  • Countries
  • Other geographic elements such as mountains, states or cities
  • Planets
  • Currencies
  • Blood types
  • Languages
  • Slowly changing lists of application elements such as suppliers or parts
  • Positions in organizations
  • As an alternative to databases where the data set is small (1000 records) and changes infrequently, or on a controlled release schedule.
  • Defining menu items or actions

Related Proposals

Open Proposal https://github.com/fsharp/fslang-suggestions/issues/564

This is a proposal to ease access to data elements common to all cases. Those elements are then accessible as properties using dot notation.

type ShoppingCart = 
| EmptyCart of AppliedDiscountCode: string
| CartWithItems of AppliedDiscountCode: string * Items: string list
| CompletedCart of AppliedDiscountCode: string * Items: string list * CalculatedTotal: decimal

v.AppliedDiscountCode // where v is of type ShoppingCart

This comment https://github.com/fsharp/fslang-suggestions/issues/564#issuecomment-298170856 suggests that common fields be defined explicitly like this:

type ShoppingCart =
    val DiscountCode: string
    | EmptyCart
    | CartWithItems of Items: string list
    | CompletedCart of Items: string list * CalculatedTotal: decimal

with a constructor call like this: EmptyCart(DiscountCode=code).

The two proposals could perhaps be merged like this:

type ShoppingCart =
    val DiscountCode: string
    | EmptyCart of DiscountCode = "Free"
    | CartWithItems of DiscountCode = "Moderate"; Items: string list
    | CompletedCart of DiscountCode = "Full"; Items: string list * CalculatedTotal: decimal

In this case, if the value is given in the definition, that value is fixed for all "instantiations" of the case. One issue with this (other than being more complex) is that it blurs the line between "constant values" and dynamic values.

In this syntax the simple currency example is as follows:

type Currency =
val Name: string
val Number: int 
| CAD Name = "CAD"; Number = 123
| USD Name = "USD"; Number = 223
| GBP Name = "GBP"; Number = 444

printfn "The number of currency: '%s' is: %i" CAD.Name CAD.Number

Closed Proposal https://github.com/fsharp/fslang-suggestions/issues/558

Suggests being able to define constant data for DU like so:

type Planets =
| MERCURY of {Mass=3.303e+23; Radius=2.4397e6}
| VENUS of {Mass=4.869e+24; Radius=6.0518e6}
| EARTH of {Mass=5.976e+24; Radius=6.37814e6}
| MARS of {Mass=6.421e+23; Radius=3.3972e6}
| JUPITER of {Mass=1.9e+27; Radius=7.1492e7}
| SATURN of {Mass=5.688e+26; Radius=6.0268e7}
| URANUS of {Mass=8.686e+25; Radius=2.5559e7}
| NEPTUNE of {Mass=1.024e+26; Radius=2.4746e7}

However it does not suggest a way to make it easier to access the common set of data.

Existing approaches to this

Unions without Data

type Currency = 
| CAD
| USD
| GBP

This meets the requirement that all currencies are enumerated, but the data associated with each currency is not modeled. The code can be obtained as a string using reflection, but that is all.
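As a rough sketch of the reflection route mentioned above, the case name can be recovered with the FSharp.Core reflection helpers. The function name caseName and the value allCurrencies are illustrative names, not part of the proposal.

```fsharp
open Microsoft.FSharp.Reflection

type Currency =
    | CAD
    | USD
    | GBP

// Recover the name of a single case via reflection
let caseName (c: Currency) =
    let case, _ = FSharpValue.GetUnionFields(box c, typeof<Currency>)
    case.Name

// Enumerate every case of the union and materialise it as a value
let allCurrencies =
    FSharpType.GetUnionCases(typeof<Currency>)
    |> Array.map (fun case -> FSharpValue.MakeUnion(case, [||]) :?> Currency)
    |> Array.toList

printfn "%s" (caseName CAD)
printfn "%A" allCurrencies
```

This yields the code as a string and the full list of cases, but any further data (number, minor units) still has to live somewhere else.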

Enums

type Currency = 
| CAD=123
| USD=223
| GBP=444

In this case the integer value is available as the data representing the Currency but again not all the needed data.
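A small sketch of what the enum approach gives and where it falls short. Note that, unlike a union, an enum value can be created from any integer, so invalid currencies are not ruled out. The names cadNumber, bogus and isKnown are illustrative.

```fsharp
type Currency =
    | CAD = 123
    | USD = 223
    | GBP = 444

// The underlying integer is the only data an enum case carries
let cadNumber = int Currency.CAD

// Any integer converts to the enum type, whether or not a case is defined for it
let bogus : Currency = enum 999

// A runtime check is needed to tell defined cases from arbitrary integers
let isKnown (c: Currency) = System.Enum.IsDefined(typeof<Currency>, box c)

printfn "%i %b %b" cadNumber (isKnown Currency.USD) (isKnown bogus)
```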

Link union cases with records via let

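type Details = {
    Name : string
    Number : int
  }
type Currency =
    | Currency of Details
    // Expose the wrapped record's fields so that CAD.Name works
    member this.Name = match this with Currency d -> d.Name
    member this.Number = match this with Currency d -> d.Number

let CAD = Currency {Name = "CAD"; Number = 123}
let USD = Currency {Name = "USD"; Number = 223}
let GBP = Currency {Name = "GBP"; Number = 444}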

printfn "The number of currency: '%s' is: %i" CAD.Name CAD.Number

This supports all the data but is verbose, and hiding the Currency constructor to prevent the creation of invalid currencies makes it even more verbose.
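For illustration, hiding the constructor might look roughly like the following, using a private union representation inside a module. The module name Currencies is an assumption made for this sketch.

```fsharp
module Currencies =
    type Details = { Name : string; Number : int }

    type Currency =
        private
        | Currency of Details
        // Members stay public even though the case constructor is private
        member this.Name = match this with Currency d -> d.Name
        member this.Number = match this with Currency d -> d.Number

    // Only code inside this module can construct Currency values
    let CAD = Currency { Name = "CAD"; Number = 123 }
    let USD = Currency { Name = "USD"; Number = 223 }
    let GBP = Currency { Name = "GBP"; Number = 444 }

printfn "The number of currency: '%s' is: %i" Currencies.CAD.Name Currencies.CAD.Number
```

Code outside the module can read Name and Number but cannot build a new Currency, at the cost of the extra module and member boilerplate.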

Link union cases with records via static members

type Currency =
    | CAD
    | USD
    | GBP
    // One details record per case; must be kept in sync with the cases above
    static member private AllDetails =
        [ {Code = CAD; Name = "CAD"; Number = 123}
          {Code = USD; Name = "USD"; Number = 223}
          {Code = GBP; Name = "GBP"; Number = 444} ]
    member this.Details =
        Currency.AllDetails |> List.find (fun e -> e.Code = this)
    member this.Name = this.Details.Name
    member this.Number = this.Details.Number

and Details = {
    Code : Currency
    Name : string
    Number : int
  }

printfn "The number of currency: '%s' is: %i" CAD.Name CAD.Number

Again, this is a bit verbose and tedious. It also suffers from the disadvantage of needing to keep the union of currencies in sync with the list of details.

This could also be done using a module to hold a function with AllDetails.

Pros and Cons

The advantages of making this adjustment to F# are that modeling complex static data in a typesafe way becomes very straightforward. See the example modeling Countries and Currencies above. Also see the examples of modeling the same domain with the current language definition.

The disadvantages of making this adjustment to F# are ...

Extra information

Estimated cost (XS, S, M, L, XL, XXL): no clue

Related suggestions: https://github.com/fsharp/fslang-suggestions/issues/564, https://github.com/fsharp/fslang-suggestions/issues/558

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • [x] This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • [x] I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • [x] This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • [x] This is not a breaking change to the F# language design (at least not as far as I can tell, which is not far)
  • [x] I or my company would be willing to help implement and/or test this (though my skills are limited)

jbeeko Apr 05 '18 18:04

An alternative approach could be:

type CurrencyCode = CAD | USD | GBP
type CurrencyDetails = { Name: string; Number: int; Digits: int }
module Currency =
    let details = function
    | CAD -> { Name = "Canadian Dollar"; Number = 124; Digits = 2 }
    | USD -> { Name = "United States Dollar"; Number = 840; Digits = 2 }
    | GBP -> { Name = "Pound sterling"; Number = 826; Digits = 2 }
let { Name = name; Number = number } = CAD |> Currency.details
printfn "The number of currency: '%s' is: %i" name number

type CountryCode = CA | GB | US | VG | IO
type CountryDetails = { Name: string; Code: string; LongCode: string; Number: int; Independant: bool; Currencies : CurrencyCode list }
module Country =
    let details = function
    | CA -> { Name = "Canada"; Code = "CA"; LongCode = "CAN"; Number = 124; Independant = true; Currencies = [CAD] }
    | GB -> { Name = "United Kingdom of Great Britain and Northern Ireland"; Code = "GB"; LongCode = "GBR"; Number = 826; Independant = true; Currencies = [GBP] }
    | US -> { Name = "United States of America"; Code = "US"; LongCode = "USA"; Number = 840; Independant = true; Currencies = [USD] }
    | VG -> { Name = "British Virgin Islands"; Code = "VG"; LongCode = "VGB"; Number = 92; Independant = false; Currencies = [USD] }
    | IO -> { Name = "British Indian Ocean Territory"; Code = "IO"; LongCode = "IOT"; Number = 86; Independant = false; Currencies = [USD; GBP] }

let { Name = countryName; Currencies = currencies } = IO |> Country.details
printfn "The '%s' country uses the currencies: %A" countryName currencies

eugbaranov Apr 05 '18 21:04

This proposal isn't needed.

type Currency =
    CAD | USD | GBP
    member t.Name   = match t with CAD -> "CAD" | USD -> "USD" | GBP -> "GBP"
    member t.Number = match t with CAD -> 123   | USD -> 223   | GBP -> 444

charlesroddie Apr 10 '18 15:04

@charlesroddie that's true, but there could be value in providing the suggested (or similar) syntactic sugar. If it would get used enough, I think it could be a valuable shorthand. Though I suggest we also add the "with" and "and" keywords and make it a block, which would let it be a bit more syntactically flexible. For example, you'd be able to unambiguously break it out on separate lines like so:

type Currency =
    | CAD
        with Name = "CAD"
        and Number = 123
    | ...

It reads kinda nicely, too. My question then would be, would we also want to be able to have the associated value depend on values contained by the DU case? For example, I'd find the following valuable:

type Shape =
    | Rectangle of width:int * height: int
        with Area = width * height
    | RightTriangle of width:int * height:int
        with Area = width * height / 2

This would imply that this feature is actually a shorthand for properties (or maybe methods?).
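For comparison, a minimal sketch of the property this shorthand would stand for in today's F#. The triangle's first field is named width here, since base is a reserved word in F#.

```fsharp
type Shape =
    | Rectangle of width: int * height: int
    | RightTriangle of width: int * height: int
    // One property computed from the fields of whichever case is present
    member this.Area =
        match this with
        | Rectangle (w, h) -> w * h
        | RightTriangle (w, h) -> w * h / 2

printfn "%i" (Rectangle(3, 4)).Area
printfn "%i" (RightTriangle(3, 4)).Area
```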

jwosty Apr 10 '18 21:04

Continuing on my last comment, these shorthand properties could instead take the associated arguments as parameters, to handle things like nested tuples nicely, at the expense of slightly cluttering simpler examples (like Shape):

type Geometry =
| Line of vec1:(float * float) * vec2:(float * float)
    with Length((x1,y1),(x2,y2)) = sqrt (((x1 - x2) ** 2.) + ((y1 - y2) ** 2.))

That much said, I think this is interesting to consider.

jwosty Apr 10 '18 21:04

I propose using anonymous records now that we have them. Something like this:

type Planets =
| MERCURY of {|Mass=3.303e+23; Radius=2.4397e6|}
| VENUS of {|Mass=4.869e+24; Radius=6.0518e6|}
| EARTH of {|Mass=5.976e+24; Radius=6.37814e6|}
| MARS of {|Mass=6.421e+23; Radius=3.3972e6|}
| JUPITER of {|Mass=1.9e+27; Radius=7.1492e7|}
| SATURN of {|Mass=5.688e+26; Radius=6.0268e7|}
| URANUS of {|Mass=8.686e+25; Radius=2.5559e7|}
| NEPTUNE of {|Mass=1.024e+26; Radius=2.4746e7|}

So, in this way no new syntax is needed, we can use what we already have.
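Note that the "of {| ... |}" form above is still hypothetical syntax. A rough approximation that compiles today (assuming F# 4.6 or later for anonymous records) is a member returning an anonymous record per case. The type name Planet and member name Data are illustrative, and only three planets are listed to keep the sketch short.

```fsharp
type Planet =
    | MERCURY
    | VENUS
    | EARTH
    // Constant data per case, exposed as an anonymous record
    member this.Data =
        match this with
        | MERCURY -> {| Mass = 3.303e+23; Radius = 2.4397e6 |}
        | VENUS -> {| Mass = 4.869e+24; Radius = 6.0518e6 |}
        | EARTH -> {| Mass = 5.976e+24; Radius = 6.37814e6 |}

printfn "%g %g" EARTH.Data.Mass EARTH.Data.Radius
```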

Luiz-Monad Aug 13 '19 00:08

More realistic would be a TypeScript-like combination of literals-as-types plus ad hoc unions, e.g. "CDN" | "USD" | int

dsyme Jun 16 '22 13:06

ReScript uses a special syntax for this purpose: the hash prefix intentionally distinguishes the syntax of literal types from that of literal values, while the semantics remain as in TypeScript.

type color = [#red | #green | #blue]

3xau1o Jul 22 '22 20:07

More realistic would be a TypeScript-like combination of literals-as-types plus ad hoc unions, e.g. "CDN" | "USD" | int

That's the approach Scala adopted:

// the following constant can only store ints from 1 to 3
val three: 1 | 2 | 3 = 3

val one: 1 = 1                     // val declaration
def foo(x: 1): Option[1] = Some(x) // param type, type arg
def bar[T <: 1](t: T): T = t       // type parameter bound
foo(1: 1)                          // type ascription

3xau1o Jul 22 '22 21:07

Adding an alternative. We'll use static data, then it typically gets moved to a DB after a few updates. I use this approach which works for both.

type Item = { ... }
let items : Item list = ... // loaded or statically listed
let codeDict = items |> List.map (fun x -> x.Code, x) |> dict

// once you know the lookup can't fail
let item = codeDict.[code]

If it's static data, Code could be a DU or module constants, so the indexer above is safe to use. But this doesn't work if data is moved to a DB -- data changes could cause crashes. For me, something like Code typically comes from outside my program (e.g. front end) and is validated by checking if it's a dictionary key. So using the indexer is not a problem.

Contextually, this approach is for small amounts of frequently-used or expensive-to-compute data. Otherwise it's easier to load a specific item from db when needed.
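A fuller sketch of the lookup described above, with validation of an externally supplied code. The fields of Item, the sample data and the helper tryFindItem are all hypothetical, since the original elides them.

```fsharp
// Hypothetical record shape; the original elides the fields
type Item = { Code : string; Description : string }

// Hypothetical static data; in practice this could be loaded from a DB
let items : Item list =
    [ { Code = "A1"; Description = "Widget" }
      { Code = "B2"; Description = "Gadget" } ]

let codeDict = items |> List.map (fun x -> x.Code, x) |> dict

// Validate a code from outside the program by checking that it is a key
let tryFindItem (code: string) =
    match codeDict.TryGetValue(code) with
    | true, item -> Some item
    | _ -> None

match tryFindItem "A1" with
| Some item -> printfn "Found: %s" item.Description
| None -> printfn "Unknown code"
```

Once a code has passed this check, using the indexer codeDict.[code] directly cannot fail for that code.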

kspeakman Jul 31 '22 08:07

Closing preferring #1195 for now

dsyme Oct 26 '22 14:10