rescript-core icon indicating copy to clipboard operation
rescript-core copied to clipboard

[Discussion] Immutable defaults for mutable JS API:s

Open zth opened this issue 2 years ago • 42 comments

Following the discussion about immutable defaults in https://forum.rescript-lang.org/t/ann-rescript-core-a-new-batteries-included-drop-in-standard-library/4149, let's discuss which API:s are problematic and what we can do to remedy that.

First off, there's a few things we need to keep in mind in this discussion:

  • A goal is to stay close to the JS API:s (deliberately not saying "as close as possible", but close). This means we have wiggle room to brush over inconsistencies and "improve" the JS API:s we bind to so they make more sense for idiomatic ReScript usage, where it makes sense. But a complete overhaul of API:s isn't something we'll pursue for Core. Not everyone is going to agree with this, and that's fine. But that's the way forward for Core.
  • For full immutability, Belt is still a great alternative, and Belt isn't going anywhere, it'll keep being around.

With that said, providing immutable defaults where it makes sense is definitively a good idea. So, let's detail what API:s are problematic right now. Please write down the API:s with mutability you worry about, and we'll discuss what we can do about it, if anything.

Let's also discuss what we can do about the mutable API:s to make it apparent that they're mutable. A few things come to mind:

  • Return unit always, even if the underlying JS API returns itself mutated. One example for this is Map.set which mutates and returns itself. This would return unit instead to signify that it's mutating.
  • Naming. This is already the case, suffixing functions mutating the original thing with inPlace. We can check that that's consistent.
  • Doc strings. Have a bolded **This mutates the original array** on top of each doc string, plus suggesting the immutable alternative. This will make it very apparent that you're using a mutable API as you're working in your editor.

zth avatar Feb 10 '23 09:02 zth

Also, there's https://github.com/tc39/proposal-change-array-by-copy. The question is how far we should go from the JS API

DZakh avatar Feb 10 '23 09:02 DZakh

From perspective of a ReScript module API, making dangerous functions more visible is good, but I’m not sure if it’s good enough. When a module Foo has no mutating methods, you have a strong guarantee that Foo.t is immutable, and the invariant new value == new ref is enforced. Which have consequences for pattens that take advantage of this invariant, e.g. the React hooks with dependencies arrays.

But when you simply discourage mutating, say, arrays, instead of completely ruling it out, you don’t have those guarantees. In a team of app developers that have other things on their mind than just remembering not to mutate things, you’d probably at least want a linter to strongly discourage calling *inPlace functions. And I think a totally immutable Array module (with a way to copy-and-convert from/to a mutable array) is safer than a linter.

That said, it’s rather weird to omit bindings when binding to Array.prototype, and a totally immutable Array module might actually belong in Belt.

Return unit always, even if the underlying JS API returns itself mutated. One example for this is Map.set which mutates and returns itself. This would return unit instead to signify that it's mutating.

I’m torn on this. I like the idea of clearly separating getters and setters, but then bindings like this are not true to the underlying API, which might be too opinionated for a standard JS lib, and I bet some people are going to miss the chained calls those return values enable.

hoichi avatar Feb 10 '23 14:02 hoichi

A goal is to stay close to the JS API:s

I totally agree going on this direction, for me it's more important to be able to use the real JS api, and be consistent with the real JS api, than having some pure immutable layer, and for several reasons including performance and porting existing JS code. Javascript it's mixed, in the sense that has both mutable and immutable APIs, in the case of Array, for example one cannot expect expect to be close to JS without mutability.

Return unit always, even if the underlying JS API returns itself mutated. One example for this is Map.set which mutates and returns itself. This would return unit instead to signify that it's mutating.

This makes sense even though there are some benefits like chaining, it's clearer to leave mutable operations as unit

Naming. This is already the case, suffixing functions mutating the original thing with inPlace. We can check that that's consistent.

I would prefer to be consistent with JS api if one really cares about the JS ecosystem, it may be consistent with rescript ecosystem, but not with JS. Being consistent with JS apis will also allow to have better tools for generating rescript bindings from typescript/javascript

3xau1o avatar Feb 10 '23 14:02 3xau1o

Fo example Scala.js has a set of both mutable and immutable collections, while also keeping the JS.Array for interop, and their JS api the closest to native JS, because they have plenty of their own Scala standard library without having to sacrifice any JS api consistency, which in turns enables Scala.js code converters like ScalablyTyped to use most of the Typescript ecosystem, with automatically generated bindings.

3xau1o avatar Feb 10 '23 14:02 3xau1o

That said, it’s rather weird to omit bindings when binding to Array.prototype, and a totally immutable Array module might actually belong in Belt.

Yeah this pretty much sums it up well I think. We're trying to stay close to the JS array API, which is ultimately what we're binding to, not just as a backing structure. An immutable array module is of course very welcome in Belt, or as something standalone.

zth avatar Feb 10 '23 15:02 zth

I haven't really made up my mind yet, but I want to share where I'm coming from and what my thoughts are to facilitate further discussion:

My hope / expectations for a new rescript standard library has been a "full-fledged belt-like" lib embracing functional idioms.

I understand this is a non-goal of Core and absolutely see the necessity to have all JS bindings in one place and well documented.

Regarding immutability: I don't have an issue per se with immutable APIs being present in Core.
I guess for me, it's all about positioning Core and setting expectations:

  • Core are mainly bindings to vanilla js?
  • But not exclusively?
  • What's the maximum desired deviation from vanilla JS bindings?
  • What's the guideline for deciding what to include and what's too far off?
  • If Core were just bindings, the distinction on what to include here and in another "higher order" lib would be obvious. Including parts of Belt and other handy functions (but not bindings) blurs the line further.
  • It would be technically possible to provide two distinct api surfaces: mutable / immutable - but this would unnecessarily split up the js bindings, therefore I don't like this approach. Duplicating APIs is even worse in my opinion.
  • I'm also thinking about what would be the next step forward for a "functional immutable standard lib"? Would it make sense to vendor Core or use it as a dependency?

I'm aware this is becoming slightly off-topic, but I think the implications are pretty relevant to the discussion.

woeps avatar Feb 12 '23 12:02 woeps

@woeps I'm guessing you saw this https://github.com/rescript-association/rescript-core#guiding-principles?

zth avatar Feb 14 '23 08:02 zth

hoichi has a good point ‘When a module Foo has no mutating methods, you have a strong guarantee that Foo.t is immutable’

What about keeping the api very close to the original js (with a few small tweaks to remove warts etc.. eg returning unit for mutation) and separate immutable datatypes eg as per tc39 proposal. https://github.com/tc39/proposal-record-tuple

yunti avatar Feb 22 '23 21:02 yunti

Yes, that's the best way forward here. Array binds to JS array, which is mutable, the same way that Set binds to the mutable JS Set. For immutable sets, we have Belt.Set. For immutable arrays, if there's demand for it and we have contributors willing to put in the work to build it, we could have either Core.ImmutableArray or Belt.ImmutableArray, depending on where we want to draw the line for what is included where.

zth avatar Feb 23 '23 09:02 zth

I would be down to try and implement this.

I have some questions I would like feedback on first.

Where does this go?

Should it be in Belt or Core?

My thoughts are we could have Core.ImmutableArray that has most of the same functions as Core.Array but it just doesn't mutate things and returns a copy of the original array. That would leave room for us to do something like Belt.List for a more custom data type. Same thing for Core.ImmutableObject and something like Belt.Record. I took names from Immutable.js for these and was thinking of using that as a starting point? That could come later so it doesn't need to be decided now. We could also wait for Tuple and Record to be added to JS.

What should ImmutableArray and ImmutableObject do?

Should ImmutableArray only have immutable versions of mutable array functions like set? Or should it have most of the same functionality as Array? Should it follow a different design and be like an Immutable.js List?

Should Array and ImmutableArray be compatible?

I don't think you should be able to swap functions on these two and that they need distinct types. We could add functions to convert from one to the other like ImmutableArray.fromArray and Array.fromImmutableArray that would allow you to swap if needed.

How do you make an ImmutableArray?

If we want to do this without adding any syntax to the language I think we can use ImmutableArray.of([1,2,3]) to create one.

Do we even want Core.ImmutableArray or should we do Belt.List?

Rather than duplicating the API for Core.Array we could make a totally new thing that is simpler with just a few functions like set, get, filter, reverse, etc...

jderochervlk avatar May 15 '23 16:05 jderochervlk

Technically, it's possible to have a single set of bindings that map closely to the JS API, and also have a subset of them work on a distinct immutable array type, by adding a phantom type argument.

// Instead of
type array<'a>

// we add a phantom type argument to specify mutability
type array<'mut, 'a>

// then a couple of abstract phantom types to put there
type mut
type imm

// and now you can define functions that work on every kind of array, or just one kind
@get_index external get: (array<_, 'a>, int) => option<'a> = "" // can be called on any kind of array
@set_index external set: (array<mut, 'a>, int, 'a) => unit = "" // can only be called on mutable arrays

Not saying this is a good idea, there are certainly significant downsides to it, like making the types and type error messages more complicated, but it does make for a smaller API surface and is quite intuitive I think.

glennsl avatar May 15 '23 19:05 glennsl

Wouldn't that mean that while my call of set wouldn't mutate the value, but another call of that function could mutate it? I think that's why I like the idea of types. But in theory we could some up with a way to share the functions that are duplicated on each? Maybe some type of InternalArray that we don't export and use in Array and ImmutableArray.

jderochervlk avatar May 15 '23 19:05 jderochervlk

Not sure what you mean. set mutates. On every call. But it can only be called on arrays that are tagged as mutable. A set function that makes a copy instead of mutating would have to live elsewhere, but that also goes with the idea of having the Array module stick close to the JS array API. You could instead have an Array.Immutable module with immutable versions of the mutable bindings though.

glennsl avatar May 15 '23 21:05 glennsl

Sorry I wasn't clear.

If you have an Array.set function that mutates an array<'a> you wouldn't want to be able to pass it to an Array.Immutable.set function.

I think there should be a clear distinction between an array and an immutable array.

jderochervlk avatar May 16 '23 01:05 jderochervlk

Technically, it's possible to have a single set of bindings that map closely to the JS API, and also have a subset of them work on a distinct immutable array type, by adding a phantom type argument.

// Instead of
type array<'a>

// we add a phantom type argument to specify mutability
type array<'mut, 'a>

// then a couple of abstract phantom types to put there
type mut
type imm

// and now you can define functions that work on every kind of array, or just one kind
@get_index external get: (array<_, 'a>, int) => option<'a> = "" // can be called on any kind of array
@set_index external set: (array<mut, 'a>, int, 'a) => unit = "" // can only be called on mutable arrays

Not saying this is a good idea, there are certainly significant downsides to it, like making the types and type error messages more complicated, but it does make for a smaller API surface and is quite intuitive I think.

The issue with that is going to be variance. Immutable arrays are covariant, and mutable ones are invariant (no subtyping at all). I don't think the language lets you have both in one definition.

cristianoc avatar May 16 '23 01:05 cristianoc

As for where this should live, there was talks a while ago about a separate package dedicated to ReScript data structures, where this and several other things would fit well. Might be good to start with that, and then we can decide on where it should live more long term when it's tried and tested.

zth avatar May 16 '23 06:05 zth

If you have an Array.set function that mutates an array<'a> you wouldn't want to be able to pass it to an Array.Immutable.set function.

I think there should be a clear distinction between an array and an immutable array.

That is is exactly what I'm proposing. There would be no array<'a>, but instead array<mut, 'a> and array<imm, 'a>. Array.set would take only the former, Array.Immutable.set would take only the latter, and Array.get would take either.

glennsl avatar May 16 '23 08:05 glennsl

The issue with that is going to be variance. Immutable arrays are covariant, and mutable ones are invariant (no subtyping at all). I don't think the language lets you have both in one definition.

Can you illustrate the issue with an example? If I understand correctly, covariance would just mean that if 'a is a subtype of 'b, then array<'a> is also a subtype of array<'b>, while if it's invariant that relation wouldn't transfer? And in practice just means you'd have to map it?

glennsl avatar May 16 '23 08:05 glennsl

The issue with that is going to be variance. Immutable arrays are covariant, and mutable ones are invariant (no subtyping at all). I don't think the language lets you have both in one definition.

Can you illustrate the issue with an example? If I understand correctly, covariance would just mean that if 'a is a subtype of 'b, then array<'a> is also a subtype of array<'b>, while if it's invariant that relation wouldn't transfer? And in practice just means you'd have to map it?

Yes that's what it means. Combining both kinds of arrays in one type would give up the opportunity to make immutable ones covariant. This has come up a number of times, an it seems that having covariance is desirable.

cristianoc avatar May 16 '23 08:05 cristianoc

As for where this should live, there was talks a while ago about a separate package dedicated to ReScript data structures, where this and several other things would fit well. Might be good to start with that, and then we can decide on where it should live more long term when it's tried and tested.

Would it be a good idea for me to start a repo and publish some type of alpha release for this? I could do @jvlk/rescript-immutable-data?

jderochervlk avatar May 16 '23 13:05 jderochervlk

~~We'll need a way to convert a regular array into an immutable array. How would I go about that? The types from the input don't let me just take in an array and return it as an immutableArray.~~

Nevermind, I just needed a .resi file.

jderochervlk avatar May 16 '23 13:05 jderochervlk

~~We'll need a way to convert a regular array into an immutable array. How would I go about that? The types from the input don't let me just take in an array and return it as an immutableArray.~~

Nevermind, I just needed a .resi file.

You can't convert a mutable array to an immutable one without making a copy. It's unsafe.

cristianoc avatar May 16 '23 14:05 cristianoc

Same for the other way round.

cristianoc avatar May 16 '23 14:05 cristianoc

The issue with that is going to be variance. Immutable arrays are covariant, and mutable ones are invariant (no subtyping at all). I don't think the language lets you have both in one definition.

Can you illustrate the issue with an example? If I understand correctly, covariance would just mean that if 'a is a subtype of 'b, then array<'a> is also a subtype of array<'b>, while if it's invariant that relation wouldn't transfer? And in practice just means you'd have to map it?

I did not really answer your question: why it matters. And empty mutable array is tricky with typing. Just like references. An immutable one is not.

That's just one example.

Another one: how comes I can't type coerce array of this record to array of that record? (Question people asked). Because arrays are not covariant.

cristianoc avatar May 16 '23 14:05 cristianoc

Ok, I based it off from the work done here in this thread: https://forum.rescript-lang.org/t/immutable-array-implementation/761

module Array: {
  /* Put this in the interface file */
  type t<'a>
  let make: array<'a> => t<'a>
  let isEmpty: t<'a> => bool
  let head: t<'a> => option<'a>
  let tail: t<'a> => t<'a>
  let toArray: t<'a> => array<'a>
  let map: (t<'a>, 'a => 'b) => t<'b>
  let concat: (t<'a>, 'a) => t<'a>
  let append: (t<'a>, 'a) => t<'a>
  let get: (t<'a>, int) => option<'a>
  let set: (t<'a>, int, 'a) => t<'a>
} = {
  /* Put this in the module file */
  type t<'a> = array<'a>
  let make = arr => arr->Array.copy
  let isEmpty = t => t->Array.length == 0
  let head = t => t->Array.get(0)
  let tail = t => isEmpty(t) ? [] : t->Array.slice(~start = 1, ~end = t->Array.length)
  let toArray = make
  let map = Array.map
  let concat = (t, a) => [a]->Array.concat(t)
  let append = (t, a) => t->Array.concat([a])
  let get = (t, i) => t->Array.get(i)
  let set = (t, i, v) => {
    let c = t->Array.copy
    c->Array.set(i, v)
    c
  }
}
let t1 = [1,2,3]

let t2 = t1->Immutable.Array.make->Immutable.Array.head // no error
let t3 = t1->Immutable.Array.head // error

I'm going to clean this up and add more functions and some basic documentation.

jderochervlk avatar May 16 '23 14:05 jderochervlk

The way I have it above it's not a zero cost binding. Trying to see if I can do that now.

jderochervlk avatar May 16 '23 15:05 jderochervlk

The more I poke at this I don't think there is a way to have this pass directly to JS functions directly. There has to be a middle layer for these types to work and to not mutate the input values.

But that's something I might be able to improve on.

jderochervlk avatar May 16 '23 16:05 jderochervlk

The more I poke at this I don't think there is a way to have this pass directly to JS functions directly. There has to be a middle layer for these types to work and to not mutate the input values.

But that's something I might be able to improve on.

On needs to decide what immutable means. To some people it means: don't mutate. To others it means: make a copy for every single operation. The former can be done with just omitting mutating functions. The latter cannot.

cristianoc avatar May 16 '23 17:05 cristianoc

To me I would expect it not to mutate and to return a copy.

The first case of not mutating can be done just simply not using the functions that could be omitted.

jderochervlk avatar May 16 '23 18:05 jderochervlk

I did not really answer your question: why it matters. And empty mutable array is tricky with typing. Just like references. An immutable one is not.

That's just one example.

Another one: how comes I can't type coerce array of this record to array of that record? (Question people asked). Because arrays are not covariant.

Thanks for the examples! These problems aren't going to go away by adding a separate second-class immutable array type though. As long as the default is a mutable array, and that is what's used by all sorts of bindings and such, people are still going to stumble across these issues and get confused. And at that point, are these issues severe enough that they'll choose to switch over to an immutable array, converting to and from mutable arrays where needed and adapt to different usage patterns?

glennsl avatar May 16 '23 20:05 glennsl