purrr icon indicating copy to clipboard operation
purrr copied to clipboard

Feature: unfold function

Open jnolis opened this issue 6 years ago • 5 comments

The functional programming language F# has a fantastic function that I believe should be considered for adding to purrr. It's basically the reverse of reduce/accumulate. It's the "unfold" function and takes as an input:

  1. an initial state
  2. a function that takes a state and returns a state-option and a value.

Unfold applies the function to the initial state, then if the result contains a new state the function is applied to that state, and this continues until the function stops returning a state.

As a practical example, if you wanted to make a sequence of all the powers of 2 under 1000, you could use unfold with:

  1. initial state: s <- 0
  2. a processing function: f <- function(s) if((s+1)^2 < 1000) list(s+1, s^2) else list(NA, s^2)

Then you would run unfold(s, f) to generate the sequence you want. You can also imagine that like reduce/accumulate you could have two versions, one that returns all the intermediate output and one that returns the final output only.

Here is a tweet by @ryantimpe where an unfold function would be perfect. https://twitter.com/ryantimpe/status/1182705055549579264?s=20

Does this seem like something that could/should be added? If there is some interest I have a bunch more thoughts on how it could be implemented.

jnolis avatar Oct 14 '19 17:10 jnolis

Would it be something like this ? I don't fully understand the code of your example (btw these are not powers of 2 but square numbers :) ) :

unfold <- function(.x, .condition, .f, ..., .max_iter = 10^3){
  .f <- rlang::as_function(.f)
    for (i in seq(.max_iter)){
      x_new <- .f(.x, ...)
      if(eval(substitute(.condition),envir = list(. = x_new), enclos = parent.frame())) break
      .x <- x_new
    }
  if(i == .max_iter) stop(".max_iter was reached")
  .x
}
unfold(0, .^2 > 10, ~ .+1)
#> [1] 3

unfold2 <- function(.x, .condition, .f, ..., .max_iter = 10^3){
  .f <- rlang::as_function(.f)
  res <- list(.x)
  for (i in seq(.max_iter)){
    x_new <- .f(.x, ...)
    if(eval(substitute(.condition),envir = list(. = x_new), enclos = parent.frame())) break
    .x <- x_new
    res <- append(res, x_new)
  }
  if(i == .max_iter) stop(".max_iter was reached")
  res
}
unfold2(0, .^2 > 10, ~ .+1)
#> [[1]]
#> [1] 0
#> 
#> [[2]]
#> [1] 1
#> 
#> [[3]]
#> [1] 2
#> 
#> [[4]]
#> [1] 3

Created on 2019-12-02 by the reprex package (v0.3.0)

moodymudskipper avatar Dec 02 '19 17:12 moodymudskipper

Hi! I think it would be something like that, with a few alterations

  1. I changed .f before .condition first, because that seems more logical for me for some reason
  2. I would have .condition be a function like .f so they're consistent
  3. I like the variable .max_iter but I set it to default to infinite (and switched to a while loop)
  4. I would have the code actually check if the initial state satisfies the condition, which makes the code a little cleaner and is consistent with how F# does it (and kind of logically makes sense)
  5. I think it should return the whole set of results as a list, so like your unfold2.

Unfortunately in making all of these changes I somehow broke it and I don't know enough about non-standard evaluation to quickly debug it...

Does this seem like it makes sense to you? Thanks!

unfold2 <- function(.x,  .f, .condition, ..., .max_iter = NULL){
  .f <- rlang::as_function(.f)
  .condition <- rlang::as_function(.condition)
  res <- list(.x)
  iter <- 0
  while(!eval(substitute(.condition),envir = list(. = .x), enclos = parent.frame()) & (is.null(.max_iter) | iter < .max_iter)){
    .x <- .f(.x, ...)
    res <- append(res, .x)
  }
  if(!is.null(.max_iter) && i == .max_iter) stop(".max_iter was reached")
  res
}

jnolis avatar Dec 03 '19 18:12 jnolis

Hi @jnolis, my post aimed mainly at clarifying your request, I'm happy to help though but I think I'd rather move it to stack overflow rather than cluttering this issue with implementation details.

Quickly though and maybe that'll be enough

  • you could just have used Inf rather than NULL but bad idea in my opinion for a default
  • using a predicate function is a good idea, but then no need for nse, you just need to actually call the function rather than evaluate the function object as you do now
  • for loops are safer than while but if you want a while loop you'll need to increment i
  • my order of argument was designed to place. f next to its dots

moodymudskipper avatar Dec 03 '19 19:12 moodymudskipper

It seems like you can get there in a backwards sort of way using reduce and done for the problem you're describing above and probably the solution for the problem described in the tweet you posted.

library(purrr)

my_f <- function(x, y) {
  if(y^2 < 1000) y^2
  else done(y^2)
}

l <- 1:1000

reduce(l, my_f)
#> [1] 1024
accumulate(l, my_f)
#>  [1]    1    4    9   16   25   36   49   64   81  100  121  144  169  196
#> [15]  225  256  289  324  361  400  441  484  529  576  625  676  729  784
#> [29]  841  900  961 1024

Created on 2019-12-05 by the reprex package (v0.3.0)

Of course you'll have to meet the requirements of reduce as well, which involves creating a function requiring two arguments only, and you also don't have any neat capabilities with predicate functions.

saadaslam avatar Dec 05 '19 22:12 saadaslam

Here is an implementation that would better fit in purrr:

unfold <- function(.init, .f, ...) {
  out <- NULL
  state <- .init

  while (!is_done_box(curr <- .f(state, ...))) {
    out <- new_node(curr[[1]], out)
    state <- curr[[2]]
  }

  final <- unbox(curr)
  if (!is_missing(final)) {
    out <- new_node(final, out)
  }

  rev(as.list(out))
}

evens_up_to <- function(n) {
  vec_simplify(unfold(0L, function(state) {
    if (state > n) {
      done()
    } else {
      list(state, state + 2)
    }
  }))
}
squares_up_to <- function(n) {
  vec_simplify(unfold(2L, function(state) {
    if (state > n) {
      done()
    } else {
      list(state, state^2)
    }
  }))
}

evens_up_to(10)
#> [1]  0  2  4  6  8 10

squares_up_to(1000)
#> [1]   2   4  16 256

lionel- avatar Aug 05 '20 14:08 lionel-

I think this is an interesting idea but doesn't quite fit with the current vision for purrr, which is striving to stay as simple as possible. We're currently steering away from more experimental ideas and new FP paradigms, but unfold() seems like an interesting idea and might be able to find a home in another FP package.

hadley avatar Aug 27 '22 11:08 hadley