Progress bars

Open hadley opened this issue 8 years ago • 43 comments

Would be nice to have support for progress bars in all map functions. This is a nice feature of plyr.

Could use https://github.com/gaborcsardi/progress, although we might need to ask @gaborcsardi to also provide a C API.

hadley avatar Dec 10 '15 20:12 hadley

There is a header-only C++ API, isn't that good enough? Although I have to say that it is not very well tested and has fewer features. https://github.com/gaborcsardi/progress#c-api

gaborcsardi avatar Dec 10 '15 20:12 gaborcsardi

It would also make sense to put the C++ part in another package, as it is completely independent.

gaborcsardi avatar Dec 10 '15 20:12 gaborcsardi

This will not work with mapping functions because we eval R functions from the C code :/ Any user interruption or R error will cause a long jump that bypasses all C++ destructors.

Thus if you have any data on the heap, you'll get leaks. For example all STL containers or even a simple std::string allocate memory dynamically so need to be destructed appropriately. See discussion in https://github.com/hadley/purrr/commit/e2def88a4039b15e2c8f92247808264fd03bcf4a

lionel- avatar Dec 10 '15 20:12 lionel-

Well, there is an R API and a C++ API. It seems reasonable that you would be able to use at least one of them. :)

gaborcsardi avatar Dec 15 '15 12:12 gaborcsardi

+1 for this feature

vixr avatar Jan 22 '16 02:01 vixr

FWIW I wanted to note that I am adding a new progress bar API, which has the nice feature of having (almost) zero overhead when the progress bars are not shown (e.g. non-interactive use), in addition to ease of use. This is how it will look:

progress %~~% lapply(seq, fun, ...)

If progress bars are turned off, then it simply runs the lapply. If progress bars are on, then it appends the progress bar ticks to fun.

I am saying this, because it would be great to use it for purrr functions as well.

gaborcsardi avatar Mar 23 '16 22:03 gaborcsardi

It'd probably be more purrr-like to have an adverb functional or function operator that takes mapping functionals and adds progress bars to them. With a functional:

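# ..f must be another functional that takes a .x and a .f
# ..f must have the usual purrr signature ..f(.x, .f, ...)
# add_progress() is not defined here; it stands for a helper that wraps
# .f so that each call ticks a progress bar with length(..x) steps
with_progress <- function(..x, ..f, .f, ...) {
  .f <- add_progress(.f, length(..x))
  ..f(..x, .f, ...)
}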

mtcars %>% with_progress(map, as.character)
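The functional above leaves add_progress() undefined. A minimal sketch of what such a helper could look like follows, assuming base R's utils::txtProgressBar() as the display; add_progress() is a hypothetical name from this thread, not a purrr function, and lapply()/vapply() stand in for the purrr mappers so the example runs without extra packages.

```r
# Hypothetical helper (not part of purrr): wrap .f so that every call
# advances a base-R text progress bar with n steps.
add_progress <- function(.f, n) {
  if (n < 1) return(.f)  # nothing to track for empty input
  pb <- utils::txtProgressBar(min = 0, max = n, style = 3)
  i <- 0
  function(...) {
    out <- .f(...)
    i <<- i + 1
    utils::setTxtProgressBar(pb, i)
    if (i == n) close(pb)  # finish the line once the last element is done
    out
  }
}

# The functional from the comment above, unchanged.
with_progress <- function(..x, ..f, .f, ...) {
  .f <- add_progress(.f, length(..x))
  ..f(..x, .f, ...)
}

# Works with any mapper that takes a vector first and a function second.
res_chr <- with_progress(mtcars, lapply, as.character)
res_sum <- with_progress(mtcars, vapply, sum, numeric(1))
```

Because the wrapper only counts calls, it relies on the mapper calling .f exactly once per element.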

lionel- avatar Mar 24 '16 10:03 lionel-

A quick thought: I think it's natural to have adverb functionals when we're modifying another functional, like in the example above. But otoh it's natural to have adverb function operators when we're modifying a regular function, e.g. safely(), lift(), etc.

lionel- avatar Mar 24 '16 10:03 lionel-

@lionel- Hmmm, maybe I misunderstand something, but why not

mtcars %>% with_progress(map)(as.character)
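For comparison, the function-operator form could be sketched roughly as below. This is only an illustration under assumptions: with_progress() here is a hypothetical operator that takes a mapper and returns a new mapper, the bar comes from base R's utils::txtProgressBar(), and lapply()/vapply() are used in place of purrr::map() to keep the example dependency-free.

```r
# Hypothetical function operator: takes a mapper such as lapply() and
# returns a new mapper with the same (.x, .f, ...) signature that also
# draws a progress bar.
with_progress <- function(..f) {
  function(.x, .f, ...) {
    n <- length(.x)
    if (n == 0) return(..f(.x, .f, ...))
    pb <- utils::txtProgressBar(max = n, style = 3)
    on.exit(close(pb))  # also runs if .f throws an error
    i <- 0
    ticking <- function(...) {
      out <- .f(...)
      i <<- i + 1
      utils::setTxtProgressBar(pb, i)
      out
    }
    ..f(.x, ticking, ...)
  }
}

res_chr <- with_progress(lapply)(mtcars, as.character)
res_sum <- with_progress(vapply)(mtcars, sum, numeric(1))
```

The modified mapper can be stored and reused, e.g. `pmap <- with_progress(lapply)`, since a fresh bar is created on every call.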

then? Or is this what you mean in your second comment?

gaborcsardi avatar Mar 24 '16 11:03 gaborcsardi

> Or is this what you mean in your second comment?

yes this is what I mean. Maybe @hadley has another opinion though.

lionel- avatar Mar 24 '16 11:03 lionel-

Hmmm, actually I quite like this, no extra operator needed. Maybe I should do it with lapply as well. Unfortunately I cannot really do it with for loops. They have to use an operator:

with_pb %~~% for (i in 1:100) { }

gaborcsardi avatar Mar 24 '16 11:03 gaborcsardi

> Maybe I should do it with lapply as well

lapply(), vapply() etc should work for free with this approach since they take a vector as first argument and a function as second :)

mtcars %>% with_progress(vapply, sum, numeric(1))

> Unfortunately I cannot really do it with for loops. They have to use an operator:

There is some discussion about function-like looping in #168 and #135.

lionel- avatar Mar 24 '16 12:03 lionel-

I think that progress bars are so useful, there should be minimal friction to use them in purrr. That makes me think that they should be an option (like plyr), or possibly even automatically display given some conditional (e.g. loop has run for 2 seconds and has at least two more to go, like dplyr).
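The "show only when it is worth it" condition can be written down as a small predicate. The sketch below is an assumption-laden illustration, not dplyr's or purrr's actual logic: should_display() and its parameter names are hypothetical, and the remaining time is estimated by simple linear extrapolation from the elements processed so far.

```r
# Hypothetical predicate: display the bar only once the loop has run for
# at least `min_elapsed` seconds and is estimated to need at least
# `min_remaining` more seconds.
should_display <- function(elapsed, done, total,
                           min_elapsed = 2, min_remaining = 2) {
  if (done == 0) return(FALSE)  # no basis for an estimate yet
  remaining <- elapsed / done * (total - done)  # linear extrapolation
  elapsed >= min_elapsed && remaining >= min_remaining
}

should_display(elapsed = 2, done = 10, total = 100)  # slow loop, far from done
should_display(elapsed = 1, done = 10, total = 100)  # has not run long enough
should_display(elapsed = 3, done = 99, total = 100)  # almost finished
```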

hadley avatar Mar 24 '16 12:03 hadley

How about automatically displaying them unless (a) this is not an interactive session or (b) a global option is set to disable them?

This is a case where it makes sense to have a global option since this is a side effect for user convenience that shouldn't have an impact on the return value. Also it'll still be possible to use withr::with_options() in case it's important to control the option on a case by case basis (though I don't see when that would be useful). I think that's preferable to a plyr-like option that would clutter the function signatures.

Also it's still nice to have a functional to add progress bars to lapply() etc.
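A rough sketch of that check, for illustration only: the option name "purrr.show_progress" and the helper should_show_progress() are made up here and are not actual purrr settings. The interactivity flag is a parameter so the behaviour can be exercised from a script.

```r
# Hypothetical check: show progress only in interactive sessions and only
# if the (assumed) global option has not been set to FALSE.
should_show_progress <- function(is_interactive = interactive()) {
  is_interactive && isTRUE(getOption("purrr.show_progress", default = TRUE))
}

# Turn progress bars off globally, then restore the previous setting.
old <- options(purrr.show_progress = FALSE)
disabled <- should_show_progress(is_interactive = TRUE)
options(old)
enabled <- should_show_progress(is_interactive = TRUE)
```

Scoping the option to a single call with withr::with_options(), as mentioned above, would follow the same pattern as the manual save and restore shown here.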

lionel- avatar Mar 24 '16 13:03 lionel-

Yes, agreed about non-interactive use + global option to turn off. That's what dplyr has too.

hadley avatar Mar 24 '16 14:03 hadley

I will have to look into whether pbapply could help here. It uses a global option and is turned off in non-interactive sessions.

psolymos avatar Sep 26 '16 05:09 psolymos

Also useful to display names if they're present.

hadley avatar Apr 24 '17 01:04 hadley

Any news on this issue? Has any sort of progress bar been implemented?

sillasgonzaga avatar Jan 11 '18 23:01 sillasgonzaga

Yes I'm with @sillasgonzaga here on looking for an update. In some situations (particularly if there's an API call in the function) I'm dropping back to pbapply.

chris-billingham avatar Jan 14 '18 20:01 chris-billingham

Any progress on the progress bar?

tiernanmartin avatar Jan 21 '18 00:01 tiernanmartin

@sillasgonzaga, @chris-billingham, @tiernanmartin, I am not part of the tidyverse team, but I happen to know that they work on each package in phases. There will be a purrr phase, don't worry! So I don't think it helps to ask for a status update every few days or every week.

As you seem to be pretty interested in progress bars: in case you don't already know, you can currently create a progress bar in the tidyverse, even though it is not built into purrr. Here is a dummy example you can run in your session; it will display a progress bar.

# you can also load all the tidyverse 
library(dplyr)
library(purrr)

# dummy list of 10 elements with random numbers
dummy_list <- rerun(10, runif(5))
# create the progress bar with a dplyr function. 
pb <- progress_estimated(length(dummy_list))
res <- dummy_list %>%
  map(~{
    # update the progress bar (tick()) and print progress (print())
    pb$tick()$print()
    Sys.sleep(0.5)
    sum(.x)
  })

As you can see, it takes just two extra lines of code. Pretty simple. The first creates the progress bar with dplyr::progress_estimated(), which returns an R6 object, here called pb. You call its methods as pb$<method>. To update the progress bar and print the progress, you can just use pb$tick()$print(), as in the example. You should read the help: help("progress_estimated", package = "dplyr")

It works very well with purrr functions. The only drawback: it makes your piped code a little less concise.

Hope it helps and tides you over until there is better integration in purrr.

cderv avatar Jan 21 '18 09:01 cderv

I think we have 3 options for integrating progress bar functionality in purrr:

  1. create an adverb to modify the user function, adding tickers on it.
  2. add a .progress= parameter inside the map functions.
  3. create an adverb to modify the map functions.

(1) is easier to code but will force the user to learn a new adverb that depends on the original function and the input (at least the input length). (2) is harder but is straightforward to the user. (3) is the most general but also the hardest to understand

To solve (1), I was thinking of something like this adverb, using @gaborcsardi's progress package:

progressively <- function(.f, .n, ...) {
  pb <- progress::progress_bar$new(total = .n, ...)
  function(...) {
    pb$tick()
    .f(...)
  }
}

Simple example:

input <- 1:5
fun <- function(x) {
  Sys.sleep(.2)
  sample(x)
}
progress_fun <- progressively(fun, length(input))
purrr::map(input, progress_fun)

The problem is that if we run this twice, the progress bar is not shown the second time, because pb is already complete. But I think it is easy to find a way to restart it when this happens using some environment tricks.
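One possible shape for such a restart, as a sketch only: the bar is created lazily on the first call of each pass and the counter is reset once .n calls have been made. To keep the example free of extra packages it swaps progress::progress_bar for base R's utils::txtProgressBar(); the reset logic itself does not depend on which bar is used.

```r
# Sketch of a restartable adverb (base-R bar instead of the progress package).
progressively <- function(.f, .n) {
  i <- 0      # calls made in the current pass
  pb <- NULL  # created lazily at the start of each pass
  function(...) {
    if (i == 0) pb <<- utils::txtProgressBar(max = .n, style = 3)
    out <- .f(...)
    i <<- i + 1
    utils::setTxtProgressBar(pb, i)
    if (i >= .n) {
      close(pb)
      i <<- 0  # reset so the next pass draws a fresh bar
    }
    out
  }
}

input <- 1:5
progress_fun <- progressively(function(x) x * 2, length(input))
first <- lapply(input, progress_fun)
second <- lapply(input, progress_fun)  # the bar is drawn again
```

One limitation of this sketch: if .f throws an error mid-pass, the counter is left where it stopped, so the next pass starts from a stale state.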

If (1) is not enough, I think that (2) - add .progress= option - is the best option, because (3) - modify map functions - is hard to understand. But I also think it will be difficult to code.
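To make option (2) concrete, here is a minimal sketch of a mapper with a .progress argument. It is an illustration under assumptions, not purrr's implementation: map_progress() is a hypothetical name, the loop is written in plain R around lapply() semantics, and the bar comes from utils::txtProgressBar().

```r
# Hypothetical list mapper with an opt-in progress bar.
map_progress <- function(.x, .f, ..., .progress = FALSE) {
  n <- length(.x)
  if (!.progress || n == 0) return(lapply(.x, .f, ...))

  pb <- utils::txtProgressBar(max = n, style = 3)
  on.exit(close(pb))  # tidy up the bar even if .f fails

  out <- vector("list", n)
  names(out) <- names(.x)
  for (i in seq_len(n)) {
    # assign via list() so a NULL result does not delete the element
    out[i] <- list(.f(.x[[i]], ...))
    utils::setTxtProgressBar(pb, i)
  }
  out
}

squares <- map_progress(1:3, function(x) x^2, .progress = TRUE)
```

With .progress = FALSE the function is a plain lapply(), so callers who never ask for a bar pay nothing for it.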

jtrecenti avatar Feb 13 '18 13:02 jtrecenti

There's a fourth option, as suggested by @lionel- and @hadley:

  4. Add a progress bar by default if the loop takes more than s seconds and the length of the input is greater than n. Control this through global options.

That's better than (2) so it's the best approach. Would it require big changes in map functions?

jtrecenti avatar Feb 13 '18 15:02 jtrecenti

This needs to be tackled at the same time as parallelism support, which we'll start working on soon.

lionel- avatar Feb 13 '18 18:02 lionel-

@jtrecenti my vote is toward option 2, .progress = T. Also progressively is just too many characters IMO.

sdanielzafar avatar Mar 04 '18 21:03 sdanielzafar

Can't wait! :)

johncassil avatar May 01 '18 17:05 johncassil

We've been using the furrr package for a while now. It uses the future package to do the hard work. @ctlente created a function, abjutils::pvec(), in the abjutils package that maps a function over a vector safely, in parallel, and with progress bars. It still has many bugs, but I have found it really useful.

jtrecenti avatar May 01 '18 17:05 jtrecenti

Just wanted to chime in to say that I really dig @gaborcsardi's progress package; I much prefer its greater customisability over the simpler dplyr::progress_estimated() (which already works with purrr, as per the example above). So if purrr could support progress, that'd be great.

maxheld83 avatar Aug 01 '18 19:08 maxheld83

I probably should have checked here first, but I have produced wrapped versions of the purrr iterators that display progress bars using the progress package. You can find my very early version here: purrrgress, with the caveat that nothing has been tested yet outside my own use cases.

TylerGrantSmith avatar Sep 30 '19 05:09 TylerGrantSmith

call for this feature too

kongdd avatar Oct 15 '19 08:10 kongdd