random icon indicating copy to clipboard operation
random copied to clipboard

Utilities to get a random element from a list and shuffle a list

Open subterfugue opened this issue 1 year ago • 7 comments

What I expect the signatures of these utilities to be:

  1. random-fu implements the random ekement utility as randomElement :: [a] -> RVar a, though it doesn't have to be that specifically. Maybe a more random implementation would be randomElement :: RandomGen g => [a] -> g -> a?

  2. shuffle :: RandomGen g => [a] -> g -> [a].

I know that there was an issue precisely about this about 3 years ago, but honestly, I think this should be heavily rethought.

These two functionalities are repeatedly provided and reimplemented in many other libraries, with a varying interface, which makes it all the more confusing whether it should be manually implemented by the user, or whether they should add in these libraries (which depend on random anyway) just to use these 2 utilities that aren't in random.

Forgive me if I'm making assumptions but this seems like this doesn't add many maintenance burdens, and makes dealing with randomness much easier than it is right now. This doesn't seem like it's introducing any sort of feature creep either, it's well within the scope of what random does, or at least to me.

There are only a few criticisms of this in the other issue, mainly

  1. Too opinionated
  2. Performance and laziness
  3. Already implemented by other libraries

For the first point, I think that being opinionated in this scenario is the better option than not. So far many libraries have implemented this and we have an overview of how performant/convenient different approaches are. This is aside from the fact that no matter what implementation we choose, we'll be giving a better experience for people than they're having right now. With pretty much an imperceptible number of downsides. If the opinion we make for the user doesn't suit them, then at that point they can make it on their own.

For the second point, I think this is something that most Haskellers have come to expect. Yes, bottoms and laziness will affect implementations. Yes, using linked lists is not recommended as the most performant collection. Even only providing it for linked list will, again, improve people's situations.

For the third point, random has already reimplemented behavior from other libraries. It seems to me way more than simply adding two utilities.

I'm not really a library maintainer, but as a Haskell programmer... To me, this addition is only an improvement, and it has no downsides.

subterfugue avatar Jul 27 '23 13:07 subterfugue

I support this proposal. From discussion on the FP Discord server, it sounds like a number of other people support it too.

There may also be scope to implement other utility functions, not just the two suggested here. I’m not sure what else, though, and those two would be a good place to start.

bradrn avatar Jul 27 '23 13:07 bradrn

These two are the most basic random utilities I can think of and I'd expect any decent random library to have them.

I think it makes more sense to make g the first argument, for currying. As for the names, shuffle is a no-brainer imo, but perhaps choose instead of randomElement (the latter is rather verbose)?

konsumlamm avatar Jul 27 '23 14:07 konsumlamm

Here is a PR with implementation of shuffling: #140

I'd be really happy to see some reviews.

shuffleList is implemented. Please submit your feedback in a form of a PR.

Please don't ask to rename to shuffle, there are other data structures in Haskell than list, we never know in the future what we'll want to shuffle.

With respect to random element, I am thinking of a function:

splitListAtRandom :: RandomGen g => [a] -> g -> ([a], a, [a], g)

Which can then be used to implement:

  • randomListElement :: RandomGen g => [a] -> g -> (a, g)
  • extractRandomListElement :: RandomGen g => [a] -> g -> (a, [a], g)

@konsumlamm I disagree with you on both points:

I think it makes more sense to make g the first argument, for currying.

It breaks convention of the whole library

perhaps choose instead of randomElement (the latter is rather verbose)?

First of all choose is already used in QuickCheck. Second of all, we, hopefully, in the future will have capability to do similar operations on other data structures, eg. Sets, Maps, Arrays. Ambiguous names like choose or even randomElement would become problematic.

lehins avatar Jul 27 '23 23:07 lehins

@konsumlamm I disagree with you on both points:

I think it makes more sense to make g the first argument, for currying.

It breaks convention of the whole library

You're right, it's better to be consistent with the rest of the library.

perhaps choose instead of randomElement (the latter is rather verbose)?

First of all choose is already used in QuickCheck.

IMO that's a reason for calling it that, not against.

Second of all, we, hopefully, in the future will have capability to do similar operations on other data structures, eg. Sets, Maps, Arrays. Ambiguous names like choose or even randomElement would become problematic.

This is a good point though.

konsumlamm avatar Jul 28 '23 14:07 konsumlamm

randomElement neatly generalizes to arbitrary Foldable:

choose :: (Fodlable f, StatefulGen g m) => f a -> g -> m a
choose f g = do
  i <- uniformRM (0,n-1) g
  return $ toList f !! i  
  where
    n = length f

While splitListAtRandom would be useful for taking random element from list.

Shimuuar avatar Jul 31 '23 13:07 Shimuuar

randomElement neatly generalizes to arbitrary Foldable:

Yes, I thought about this implementation as well, but that would be sub-optimal for Map and Vector, due to !! and potential construction of an intermediate list. Although latter problem is probably alleviated by the list fusion.

lehins avatar Jul 31 '23 14:07 lehins

Indeed. But it still beats choose . toList since we can use mode efficient length

Shimuuar avatar Jul 31 '23 16:07 Shimuuar