Error: Cannot find progress bar (req_perform_parallel)
When I run req_perform_parallel() with a progress bar and abort the operation, every subsequent call to req_perform_parallel() errors until the R session is restarted. Here is a minimal example:
```r
library(httr2)
reqs <- list(
  request(example_url()) |>
    req_url_path("/delay") |>
    req_url_path_append(sample(1:10, 1))
) |>
  rep(100)
resps <- req_perform_parallel(reqs, progress = TRUE)
### stop with Esc ##
resps <- req_perform_parallel(reqs, progress = TRUE)
#> Error in cli::cli_progress_update(..., id = id) :
#>   Cannot find progress bar `cli-25573-7`
#> Error in cli::cli_progress_update(..., id = id) :
#>   Cannot find progress bar `cli-25573-7`
```
This does not apply to req_perform_sequential.
Hmmm create_progress_bar() should probably add an exit handler to the calling environment that automatically closes the progress bar. But I'm a bit surprised that's necessary because I don't understand why it would try and reuse the previous progress bar.
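To make the exit-handler idea concrete, here is a minimal sketch using cli directly. It is not httr2's actual create_progress_bar(); run_with_progress() is a hypothetical function, and .auto_close = FALSE is set only so that the explicit cleanup is what closes the bar.

```r
library(cli)

# Hypothetical stand-in for a function that owns a progress bar.
run_with_progress <- function(n) {
  # Create the bar and keep its id so it can be closed explicitly.
  id <- cli_progress_bar(total = n, .auto_close = FALSE)
  # Exit handler: runs on normal return, on error, and on interrupt.
  on.exit(cli_progress_done(id = id), add = TRUE)

  for (i in seq_len(n)) {
    cli_progress_update(id = id)
  }
  invisible(n)
}

result <- run_with_progress(3)
```

Because the handler is registered with on.exit(), the bar should be closed even when the loop is interrupted with Esc.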
I also have no idea. And here is some more weirdness. I just noticed that it does not happen every time! If I hit Esc too fast, it does not seem to reuse the id???
@gaborcsardi any idea what's going on here? The progress bar is created in a helper function: https://github.com/r-lib/httr2/blob/e972770199f674eca4c64ca8161235e5745683dd/R/utils.R#L242-L284
Maybe something is going wrong with how I set .envir?
I am not sure, but my guess is that the progress bar's env is already removed, but the curl multi-handle is still alive and you get a final update() call for it.
Ooooh I bet that's it.
I think we can fix this by using a separate pool for each req_perform_parallel() call, rather than relying on the default global pool.
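A rough sketch of the per-call pool idea, written as a user-side wrapper rather than as the change inside httr2 itself. perform_parallel_isolated() is a hypothetical name, and the sketch assumes the installed httr2 version still accepts the pool argument of req_perform_parallel().

```r
library(httr2)

# Hypothetical wrapper: every call gets its own curl pool, so handles left
# over from an interrupted call cannot report into a later call.
perform_parallel_isolated <- function(reqs, ...) {
  pool <- curl::new_pool()  # fresh pool used by this call only
  # Assumption: this httr2 version still has a `pool` argument.
  req_perform_parallel(reqs, pool = pool, ...)
}
```

With the reqs list from the example above, this would be called as perform_parallel_isolated(reqs, progress = TRUE).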
I think you can also tryCatch() and ignore the error.
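As an illustration of the tryCatch() approach, the sketch below wraps an arbitrary update call and swallows only the error seen in this issue. ignore_missing_bar() is a hypothetical helper, and matching on the message text "Cannot find progress bar" is an assumption based on the output above, not a documented cli condition class.

```r
# Hypothetical helper: run `update()` and ignore the "missing bar" error.
# Returns TRUE if the update ran, FALSE if the bar was already gone.
ignore_missing_bar <- function(update) {
  tryCatch(
    {
      update()
      TRUE
    },
    error = function(e) {
      # Assumption: the error is identified by its message text.
      if (grepl("Cannot find progress bar", conditionMessage(e), fixed = TRUE)) {
        FALSE
      } else {
        stop(e)  # anything else is still a real error
      }
    }
  )
}

# Simulated stale update, standing in for the late callback.
stale <- ignore_missing_bar(function() stop("Cannot find progress bar `cli-1`"))
```

In the real code the wrapped call would be something like function() cli::cli_progress_update(id = id).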
I can't reliably reproduce this. I tried what I thought would illustrate it more reliably:
```r
library(httr2)
options(cli.progress_show_after = 0)
reqs <- list(
  request(example_url()) |> req_url_path("/delay/10"),
  request(example_url()) |> req_url_path("/delay/1"),
  request(example_url()) |> req_url_path("/delay/10"),
  request(example_url()) |> req_url_path("/delay/1"),
  request(example_url()) |> req_url_path("/delay/10"),
  request(example_url()) |> req_url_path("/delay/1")
)
resps <- req_perform_parallel(reqs, progress = TRUE)
```
But I only managed to see the error once, so I suspect it might be more of a problem on Windows.
I'm on Arch Linux (BTW)
It only seems to happen when I interrupt the process after a couple of seconds and not every time. Which makes this the weirdest behaviour that I've experienced so far in R.
Can you start another process after you see the error once? I need to restart R.
But my experiments suggest that there might be a bigger problem — I think terminating req_perform_parallel() doesn't actually terminate the other ongoing requests.
...
Hmmmm, some experiments suggest that maybe that's not the case?
@JBGruber could you please try #602 and see if it fixes the problem for you?
This fixes the error :blush:. But I observed some other unexpected behaviour.
- After interrupting, I get `! Operation timed out after 10000 milliseconds with 0 bytes received` on the next attempt when that shouldn't be the case. I decreased the delay to 2 seconds on the second attempt to make sure this isn't a coincidence. If the delay of first + second attempt is below 10 seconds, it does not seem to happen.
- If I wait until 50% of the requests are done and then interrupt, the return object is still produced, with half of the responses missing (if I wait until 80%, 20% are empty, and so on).
Sorry for the screen recording, but I don't know how else to show it. Until second 46, you see the first issue. I wait a little bit and then show issue 2 from second 56.
I think this is all far less problematic than having to restart R to be able to run req_perform_parallel() with a progress bar. But I think it suggests that you might have been right and terminating req_perform_parallel() doesn't actually terminate the other ongoing requests.
@JBGruber I did a bunch of experiments and convinced myself that they're getting cancelled on my machine, but maybe that code path doesn't work correctly on Linux? I'll think about how to test that hypothesis.
@gaborcsardi any ideas on why this might be different on Arch?
Different libcurl version or settings, potentially. But it can also just be random.
```r
curl::curl_version()
#> $version
#> [1] "8.11.1"
#>
#> $headers
#> [1] "8.11.0"
#>
#> $ssl_version
#> [1] "OpenSSL/3.4.0"
#>
#> $libz_version
#> [1] "1.3.1"
#>
#> $libssh_version
#> [1] "libssh2/1.11.1"
#>
#> $libidn_version
#> [1] "2.3.7"
#>
#> $host
#> [1] "x86_64-pc-linux-gnu"
#>
#> $protocols
#>  [1] "dict"    "file"    "ftp"     "ftps"    "gopher"  "gophers" "http"
#>  [8] "https"   "imap"    "imaps"   "mqtt"    "pop3"    "pop3s"   "rtsp"
#> [15] "scp"     "sftp"    "smb"     "smbs"    "smtp"    "smtps"   "telnet"
#> [22] "tftp"    "ws"      "wss"
#>
#> $ipv6
#> [1] TRUE
#>
#> $http2
#> [1] TRUE
#>
#> $idn
#> [1] TRUE
#>
#> $url_parser
#> [1] TRUE
```
Created on 2025-01-06 with reprex v2.1.1
One particular difference could be HTTP/1.1 vs HTTP/2, i.e. whether libcurl has HTTP/2 support and whether it is used by default. AFAIR cancellation is very different for HTTP/2, because of the multiplexing.
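One way to probe the HTTP/1.1 vs HTTP/2 hypothesis would be to pin the requests to HTTP/1.1 and see whether the behaviour changes. The sketch below is only an assumption-laden starting point: force_http_1_1() is a hypothetical helper, the URL is a placeholder, and the value 2 is assumed to correspond to libcurl's CURL_HTTP_VERSION_1_1.

```r
library(httr2)

# Hypothetical helper: ask libcurl to use HTTP/1.1 for this request.
# Assumption: 2 maps to CURL_HTTP_VERSION_1_1 in the installed libcurl.
force_http_1_1 <- function(req) {
  req_options(req, http_version = 2L)
}

# Placeholder URL; nothing is sent, the request is only constructed.
req <- request("https://example.com") |> force_http_1_1()
```

Applied to the reproduction above, this would be reqs <- lapply(reqs, force_http_1_1) before calling req_perform_parallel().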
Since I can't reproduce the problem easily, and I need to get this release out of the door, I'm going to push this to the future.