FR: Support returning incomplete results after an interrupt

Open krlmlr opened this issue 4 years ago • 9 comments

Scenario: I'm running a query and didn't realize how many results it returns. The progress bar (#26) indicates that it's going to take too long, but I'd like to look at partial results. I press Escape (or Ctrl + C), but the intermediate results are lost.

As discussed in #86, we could store intermediate results in a private environment and expose them with a function, e.g. gh_partial(). Catching interrupts is a bad idea.
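
A minimal sketch of the private-environment idea, to make the proposal concrete. Everything here is an assumption for illustration: `.partial`, `fetch_all_pages()` and the `fetch_page` argument are hypothetical stand-ins, not part of gh, and `gh_partial()` is only the name proposed above.

```r
# Hypothetical sketch; none of these functions exist in gh.
# Private environment that holds the pages downloaded so far.
.partial <- new.env(parent = emptyenv())

# `fetch_page` stands in for whatever downloads a single page of results.
fetch_all_pages <- function(fetch_page, n_pages) {
  # Reset first, so results of an earlier call are never exposed by mistake.
  .partial$pages <- list()
  for (i in seq_len(n_pages)) {
    page <- fetch_page(i)
    # Store each page as soon as it arrives, so it survives an interrupt.
    .partial$pages[[length(.partial$pages) + 1L]] <- page
  }
  .partial$pages
}

# Accessor for whatever the most recent call managed to store.
gh_partial <- function() {
  .partial$pages
}
```

If `fetch_all_pages()` is interrupted, `gh_partial()` returns the pages stored up to that point. The sketch keeps a single shared slot, so it does not distinguish between concurrent calls.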

krlmlr avatar Oct 13 '19 11:10 krlmlr

I am not convinced that this is a good idea. It seems hard to implement it in a way that ensures that the incomplete storage belongs to the last interrupted gh() call.

gaborcsardi avatar Jan 21 '20 14:01 gaborcsardi

Catching interrupts could actually work, except that curl is buggy: https://github.com/jeroen/curl/issues/216

gaborcsardi avatar Jan 21 '20 14:01 gaborcsardi

Sorry, what I meant above is that returning the incomplete results is a great idea, but storing them in a private place and making sure that they belong to the last request seems hard.

E.g. what if you get an interrupt before anything was downloaded? Then the previous incomplete results are still there. What if you call gh() from async code (e.g. the async package)? Any kind of concurrency makes this hard.

Maybe we can fix the interrupt issue in curl, and then catch the interrupt. That would be pretty cool, because you could even continue the program, from the browser, with .tryResumeInterrupt().

But of course catching the interrupt is not so simple, either, because we don't want to do that if gh() is in downstream code.

gaborcsardi avatar Jan 21 '20 16:01 gaborcsardi

It may be easier to implement this with the async curl api, like so:

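get_data <- function(){
  url <- 'https://nghttp2.org/httpbin/drip?duration=10&numbytes=500'
  pool <- curl::new_pool()
  # In-memory buffer that accumulates the bytes received so far
  buf <- rawConnection(raw(0), "r+")
  # Runs when the function exits, including on interrupt:
  # hand back whatever has been buffered up to that point
  on.exit({
    out <- rawConnectionValue(buf)
    close(buf)
    return(out)
  })
  # The data callback appends each received chunk to the buffer
  curl::curl_fetch_multi(url, pool = pool, data = function(x, ...){
    writeBin(x, buf)
  })
  # Perform the request; this is the call that gets interrupted
  curl::multi_run(pool = pool)
}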

# Interrupt this after a few sec:
get_data()

jeroen avatar Jan 21 '20 21:01 jeroen

Yeah, but why do the callbacks of the multi api support interrupt handlers and the easy api callbacks don't?

gaborcsardi avatar Jan 21 '20 21:01 gaborcsardi

I just don't really know a good way to implement this in C.

jeroen avatar Jan 22 '20 12:01 jeroen

You could use an approach that is in the cleancall package, and just call R_CheckUserInterrupt() normally.

gaborcsardi avatar Jan 23 '20 17:01 gaborcsardi

We could just return the result, with a warning, instead of storing it somewhere?
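
Roughly, that could look like the sketch below. It assumes the interrupt actually reaches R as a condition (see the curl issue above), and `fetch_pages_or_partial()` and `fetch_page` are hypothetical names, not gh functions.

```r
# Hypothetical sketch; `fetch_page` stands in for the per-page request.
fetch_pages_or_partial <- function(fetch_page, n_pages) {
  pages <- list()
  tryCatch(
    {
      for (i in seq_len(n_pages)) {
        pages[[i]] <- fetch_page(i)
      }
      pages
    },
    # A user interrupt (Escape / Ctrl + C) is signalled as a condition
    # of class "interrupt", which tryCatch() can handle like any other.
    interrupt = function(cnd) {
      warning(
        "Interrupted: returning ", length(pages), " of ", n_pages, " pages",
        call. = FALSE
      )
      pages
    }
  )
}
```

The caller gets a normal return value plus a warning, so nothing needs to be stored between calls.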

gaborcsardi avatar Apr 30 '21 07:04 gaborcsardi

I forgot where, but I know we've used this pattern elsewhere (i.e. on interrupt, warn and return the current values). But maybe we should only do this when executing in the global environment, to avoid some additional layer of functions from getting an incomplete result?
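
A sketch of what the global-environment check might look like. The names are hypothetical, and comparing the caller's frame to `globalenv()` is just one possible way to detect a top-level call, not something gh is known to do.

```r
# Hypothetical sketch; `fetch_page` stands in for the per-page request.
fetch_pages_top_level_only <- function(fetch_page, n_pages,
                                       caller = parent.frame()) {
  force(caller)
  pages <- list()
  collect <- function() {
    for (i in seq_len(n_pages)) {
      pages[[i]] <<- fetch_page(i)
    }
    pages
  }
  # Called from another function: let the interrupt propagate as usual,
  # so downstream code never receives an incomplete result.
  if (!identical(caller, globalenv())) {
    return(collect())
  }
  # Called directly from the global environment: warn and return
  # whatever has been collected so far.
  tryCatch(
    collect(),
    interrupt = function(cnd) {
      warning("Interrupted: returning incomplete results", call. = FALSE)
      pages
    }
  )
}
```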

hadley avatar Feb 07 '23 13:02 hadley