rhombus-prototype icon indicating copy to clipboard operation
rhombus-prototype copied to clipboard

Propose a design for generic lists

Open jackfirth opened this issue 1 year ago • 7 comments

This builds off #201, #221, #259, and #330, providing a more concrete design and plan for a generic interface for immutable lists. Currently still in draft status.

I'm working on this with @OxSon, who is handling most of the RRB tree implementation and benchmarking.

Rendered

jackfirth avatar Sep 05 '23 01:09 jackfirth

A few scattered thoughts:

  • translating [1,2,3] to builder notation seems like it would be a significant performance hit; can't we translate it to something like List.unsafeFromVector(racket.vector(1,2,3)) (or use quoted vectors for the argument)?
  • what does cons on various kinds of lists do? Does it copy?
  • does it make sense to have hybrid representations to avoid copying?

samth avatar Sep 12 '23 14:09 samth

* translating `[1,2,3]` to builder notation seems like it would be a significant performance hit; can't we translate it to something like `List.unsafeFromVector(racket.vector(1,2,3))` (or use quoted vectors for the argument)?

Sure. Alternatively, we could translate it to List(1, 2, 3) and then do whatever optimization we do for List(x, ...). Now that I think of it that's probably better.

* what does `cons` on various kinds of lists do? Does it copy?

There is no List.cons() method. You could write List.insert(0, element), which would do whatever the underlying insert() method does. That method is expected to avoid copying in any persistent implementation of List. For ConsList specifically, there could be a cons() function that takes an element and another ConsList. That's what we've got today anyway.

* does it make sense to have hybrid representations to avoid copying?

Maybe. We've certainly got the option to try it in the future. I'd like to get more done before exploring it though, since mutable collections and non-list collections might complicate the picture significantly.

jackfirth avatar Sep 13 '23 02:09 jackfirth

This is ready for review now.

jackfirth avatar Sep 19 '23 01:09 jackfirth

One question I have is about append: will that be part of the builder interface? (It seems like a common pattern to build up a list with appends, not that I've done any formal studies or anything!)

rfindler avatar Sep 19 '23 13:09 rfindler

One question I have is about append: will that be part of the builder interface? (It seems like a common pattern to build up a list with appends, not that I've done any formal studies or anything!)

It will, in the form of a ListBuilder.addAll() method. I'm not planning on implementing it or the other batch methods until everything else is done though.

jackfirth avatar Sep 19 '23 19:09 jackfirth

The proposal should make clearer that ConsList is intended as distinct from a Racket list (that I learned via Discord). A ConsList might have Racket pairs internally, but with a wrapper around the head pair or empty list. That's different than the current Rhombus List, which is a Racket list (so, I did not understand that facet of this proposal, at first).

In retrospect, if List can refer to something with a different representation than Racket lists (e.g., using an RBB-tree), then I have to agree that a Racket list cannot be a Rhombus List. Trying to make that correspondence work would create more problems than it solves in terms of implementing ==, bridging to Racket, and managing programmer expectations. A Racket list can be used as a sequence, I expect, but not as something that is == to any List.

This seems like a significant performance challenge right at the start, since not using Racket lists is likely to impose a cost (relative to Racket) on most programs. I had previously imagined that multiple List implementations might be handled by inlining dispatch to the common(?) case of cons cells—which would inflate code but maybe not make things much slower—but I doubt that it works out if lists have an extra wrapper.

mflatt avatar Sep 29 '23 18:09 mflatt

More generally related to what @mflatt said, I think we need to be aware that most of the models we've considered for collections generally are in typed languages with compilers (often the JVM) that can eliminate more of the cost of dynamic dispatch and particularly wrappers relative to what Racket can do. We may need to consider different collections of tradeoffs than those systems ended up with (such as different choices about equality).

samth avatar Sep 29 '23 18:09 samth