vcr icon indicating copy to clipboard operation
vcr copied to clipboard

httr with upload_file in body can not be re-created internally in vcr

Open sckott opened this issue 6 years ago • 7 comments

cc @aaronwolen

In our work to fix how webmockr/vcr handle bodies correctly, I noticed that we cannot re-create requests that have an upload passed directly to body (not in a list). This concerns this code block most importantly https://github.com/ropensci/vcr/blob/master/R/request_handler-httr.R#L102-L109 but also this https://github.com/ropensci/vcr/blob/master/R/request_handler-httr.R#L76-L81 if a user uses the ignored requests feature.

Re-creating what essentially is happening in those linked code blocks above, ...

library(httr)

This works (character string to body)

e <- POST("https://httpbin.org/post", body = "foo bar")
# re-create the request
POST("https://httpbin.org/post", do.call(config, e$request$options))

And this works (upload in a list)

b <- POST("https://httpbin.org/post",
      body = list(y = upload_file(system.file("CITATION"))))
# re-create the request
POST("https://httpbin.org/post",
    body = list(y = upload_file(system.file("CITATION"))),
    do.call(config, b$request$options))

But this doesn't work (it just hangs indefinitely)

d <- POST("https://httpbin.org/post",
      body = upload_file(system.file("CITATION")))
# re-create the request
POST("https://httpbin.org/post", do.call(config, d$request$options))

The problem stems from that we CAN get what's passed in the body from the request object created by httr EXCEPT when upload_file is passed directly to body (and there may be other cases I don't know about) . When upload_file is passed directly to body, the details are al in the $options in the request object, and there is no trace of the file path for the upload, or the content type. When an upload is passed in a list, we can get the needed information from the request.

The way webmockr/vcr are setup, I don't think we have access to what was passed to body originally, only what's in the httr $request object

Any thoughts on this @jennybc ?

sckott avatar Dec 20 '19 20:12 sckott

Having read the above once / quickly, I don't immediately have any insights. I've never tinkered on a part of httr that left with knowledge that helps me. That is, I'd have to study up and do my own analysis to contribute anything.

jennybc avatar Dec 21 '19 23:12 jennybc

Afraid I don't have an answer either but I can reliably reproduce the issue. It looks like the recreated httr request hangs on curl::curl_fetch_memory(), which times out after 60s for me. Incidentally, the wait time can be modified:

POST("https://httpbin.org/post", do.call(config, c(d$request$options, timeout_ms = 1000)))

I did notice the two packages handle a NULL connection differently within the body of readfunction:

  • httr returns raw()

    if (is.null(con)) {
      return(raw())
    }
    
  • whereas crul assigns the filepath of body

    if (is.null(con)) con <<- file(filePath, "rb")
    if (is.null(con)) return(raw())
    

    so return(raw()) is never encountered

The recreated request does work if httr is modified to return a connection to the file body

readfunction = function(nbytes, ...) {
  if (is.null(con)) {
+   con <- file(body$path, "rb")
-   return(raw())
  }
  bin <- readBin(con, "raw", nbytes)
  if (length(bin) < nbytes) {
    close(con)
    con <<- NULL
  }
  bin
}

but it's not clear to me whether that's the "right" thing to do or not.

aaronwolen avatar Dec 22 '19 21:12 aaronwolen

thanks @jennybc - no worries

sckott avatar Dec 31 '19 17:12 sckott

thanks @aaronwolen for that insight. I imagine something that has to be changed within httr will probably not work as a solution - but if that's the only or best solution then we can at least approach them and see if they're open to the change.

sckott avatar Dec 31 '19 17:12 sckott

Not sure if it's helpful but I use httr::upload_file() in a function and cannot create the fixture corresponding to that, I get a 500 error. https://github.com/maelle/goodpress/pull/8 I am happy to provide more information (the same code/test works without vcr).

maelle avatar Jun 20 '20 11:06 maelle

in the case of the API I'm working with, the filename is also passed as a header actually (Content-Disposition).

maelle avatar Jun 20 '20 12:06 maelle

Thanks @maelle - I'll have another look at this very soon since you are encountering this

sckott avatar Jun 22 '20 15:06 sckott