purescript-lists icon indicating copy to clipboard operation
purescript-lists copied to clipboard

Add `ToList` class

Open natefaubion opened this issue 8 years ago • 9 comments

As an alternative to https://github.com/purescript/purescript-foldable-traversable/issues/69, we could add a class:

class ToList f where
  toList :: forall a. f a -> LL.List a
  foldrLazy :: forall a b. Z.Lazy b => (a -> b -> b) -> b -> f a -> b

Where

toList f = foldrLazy LL.cons LL.nil f
foldrLazy cons nil f = LL.foldrLazy cons nil (toList f)

natefaubion avatar Jun 08 '17 15:06 natefaubion

What do you think are the pros and cons of this approach vs modifying Foldable?

paf31 avatar Jun 08 '17 16:06 paf31

Pro: Doesn't break the entire ecosystem Con: Must depend on purescript-lists (most do anyway, though). Feels somewhat redundant, since it's just an alternative formulation of Foldable with laziness.

natefaubion avatar Jun 08 '17 16:06 natefaubion

@natefaubion do you think we should not do this, for the same reason as the one you gave in https://github.com/purescript/purescript-foldable-traversable/issues/69#issuecomment-709562001 ?

hdgarrood avatar Oct 16 '20 13:10 hdgarrood

I think that toList makes more sense than adding foldrLazy because by reifying it as a lazy data structure you can write a tail-recursive consumer to avoid stack issues.

natefaubion avatar Oct 16 '20 20:10 natefaubion

To clarify, lets take all as an example. The naive implementation with foldrLazy would be

all = foldrLazy (\a b -> Lazy.defer \_ -> a && Lazy.force b) (Lazy.defer \_ -> true)

I consider this naive because foldrLazy requires a Lazy constraint, and you can just lift anything into Data.Lazy. This is basically what you'd write in Haskell, but this just basically allocates a list of thunks that will probably blow the stack when forced. This is equivalent to a stack-unsafe foldr. In my opinion, using foldrLazy to fold into Data.Lazy is an anti-pattern. foldrLazy should not be used as a consumer, it should be used to build a producer! If you instead just had toList as the canonical producer, then you could write

all = go <<< toList where
  go list = case List.step list of
    Cons b tail
      | b -> go tail
      | otherwise -> false
    Nil ->
      true

Which, while ugly, is totally safe and still short circuits.

natefaubion avatar Oct 16 '20 20:10 natefaubion

I would actually probably even prefer something like

data Producer a = Done | Next a (Unit -> Producer a)

As opposed to Data.List.Lazy because the extra allocations for dealing with Data.Lazy is non-trivial IME.

natefaubion avatar Oct 16 '20 20:10 natefaubion

That makes sense. This feels like something I'd like to prototype outside of core first, probably.

hdgarrood avatar Oct 17 '20 15:10 hdgarrood

Just for comparison, here are some benches for and (I incorrectly called it all above) with 1000 trues, implemented as described above.

Array unsafeIndex:
mean   = 5.81 μs
stddev = 9.51 μs
min    = 4.38 μs
max    = 617.31 μs

Foldable Array:
mean   = 90.88 μs
stddev = 49.84 μs
min    = 67.80 μs
max    = 1.47 ms

Producer:
mean   = 36.46 μs
stddev = 29.04 μs
min    = 27.49 μs
max    = 647.35 μs

toList:
mean   = 66.11 μs
stddev = 49.43 μs
min    = 49.41 μs
max    = 935.74 μs

foldrLazy:
mean   = 261.97 μs
stddev = 90.12 μs
min    = 212.11 μs
max    = 2.75 ms

natefaubion avatar Oct 18 '20 17:10 natefaubion

The times obviously improve for everything except Foldable when the whole list is not true.

natefaubion avatar Oct 18 '20 17:10 natefaubion